I just imported an epub and it breaks the file down into hundreds of paragraphs. I want to delete several hundred of these paragraphs because they’re in English. The one advantage is that these paragraphs cluster together. Is there an easy way to delete them without having to do it one by one? Am I better off learning how to edit an epub? Thanks,
I feel it would be good to learn Calibre; it’s very useful software if you have a lot of books to manage.
I suspect the English paragraphs use a different font or HTML style? (EPUBs are basically zipped-up HTML files.) You can probably search by the font name or HTML style using the editor in Calibre.
Calibre has very advanced (regex) search and replace features. You can probably set it up to do everything automatically once you identify the pattern to search for.
If you have access to ChatGPT, you can ask it to create the regex search expression.
Just a demo… I found a book with interlinear translations and tried to remove them using regex. The book uses a specific HTML style for the English translation.
I have no idea how regex works, so I asked ChatGPT to give me the expression to use.
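To give an idea of what such an expression can look like, here is a minimal Python sketch. It assumes the English paragraphs are plain <p> elements carrying a class named "translation"; that class name and the sample text are made up for illustration, so check your own book’s HTML for the real one. A similar pattern may work in the Calibre editor’s regex search with an empty replacement, but I have only sketched it here in Python.

```python
import re

# Hypothetical class name; look up the real one in your book's HTML.
ENGLISH_CLASS = "translation"

# Matches one whole <p ... class="... translation ..." ...>...</p> element.
# Assumes paragraphs are not nested and the class attribute uses double quotes.
PATTERN = re.compile(
    r'<p\b[^>]*\bclass="[^"]*\b' + re.escape(ENGLISH_CLASS) + r'\b[^"]*"[^>]*>'
    r".*?</p>\s*",
    re.DOTALL,  # let ".*?" span line breaks inside a paragraph
)


def strip_english(html: str) -> str:
    """Remove every paragraph that carries the (assumed) English style class."""
    return PATTERN.sub("", html)


sample = (
    '<p class="korean">안녕하세요.</p>\n'
    '<p class="translation">Hello.</p>\n'
    '<p class="korean">감사합니다.</p>\n'
)
print(strip_english(sample))
```

Because the pattern is tied to the class and not to the language, it only helps when the book really does mark the English text with its own style.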
It would be MUCH easier to copy the second-language text from the epub and then paste it into Word for editing, or just paste it into the manual lesson import field.
I’d like to give a more detailed explanation. As stated earlier, the workflow depends on how the respective ebook is set up. But an example may give you an idea.
In this example I use a language learning textbook. The book is set up so that you have a short text in the target language, in this case Korean, followed by a translation in English as well as a short vocabulary list. I assume this is relatively close to what you have.
On the main screen, you can find a button in the upper right corner that allows you to edit the ebook. Clicking on it (with the book you want to edit selected) gets you to the following screen.
I collapsed the menus on the right to give you an overview. As you can see, an ebook consists of several files.
text: contains HTML files that represent the actual text. In the ebooks I have seen so far, one HTML file usually represents one chapter of a book.
styles: contains stylesheets that handle the font and other style-related aspects for the whole document. This is useful if you want to create a copy of the ebook with adjustments, like larger text or a different, easier-to-read font.
images: (here called Bilder, which is the German word for it) contains images used in the book, like the title page.
fonts: (here called Schriftarten) contains fonts used in the document.
miscellaneous: (here called Verschiedenes) additional files the book needs. It usually contains the file that defines how the different files are ordered to form the ebook.
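If you want to see this layout without opening Calibre, the sketch below lists the files inside an EPUB with Python’s standard zipfile module and sorts them into the same rough groups. The extension-to-group mapping is my own guess at how such a grouping could work, not how Calibre does it, and the tiny in-memory archive with its file names is invented purely for the demo.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed mapping from file extension to the groups described above.
GROUPS = {
    ".html": "text", ".xhtml": "text", ".htm": "text",
    ".css": "styles",
    ".jpg": "images", ".jpeg": "images", ".png": "images",
    ".gif": "images", ".svg": "images",
    ".ttf": "fonts", ".otf": "fonts", ".woff": "fonts",
}


def group_epub_files(epub) -> dict[str, list[str]]:
    """Group the archive's file names by rough type.

    `epub` can be a path or a file-like object; zipfile.ZipFile accepts both.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            suffix = PurePosixPath(name).suffix.lower()
            groups[GROUPS.get(suffix, "miscellaneous")].append(name)
    return dict(groups)


# Build a tiny stand-in archive in memory so the example runs without a real book.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("OEBPS/content.opf", "<package/>")
    archive.writestr("OEBPS/chapter1.xhtml", "<html/>")
    archive.writestr("OEBPS/style.css", "p {}")

for group, names in group_epub_files(demo).items():
    print(group, names)
```

With a real book you would pass the path to the .epub file instead of the in-memory demo. The .opf file that ends up under miscellaneous is typically the one that defines the reading order.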
By double-clicking on one of the html files under Text you can open said file for editing and get a preview.
In my example, the Korean text and the translation plus vocabulary list for each chapter are put into separate files. The first HTML files are the title page, the introductory words and the table of contents. After that comes one file containing the Korean text, then a file containing the translation and vocabulary list, then a file containing the Korean text of the next chapter, and so on. So in my case I would just have to delete all the files that don’t contain Korean text and save the result. After that you can export the modified ebook from the main menu. For this book, which contains 50 stories, the whole process would take about 2 minutes.
So it’s really not a difficult thing to do, but the overall amount of work may differ depending on how the ebook is set up. Note that when you import an ebook into Calibre, the program creates a copy and works with that. So you cannot mess up your original book, and if something goes wrong, you can just delete and reimport the book.
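If there are too many files to check by eye, a rough script can suggest which ones are probably not Korean. The sketch below measures the share of Hangul characters in each file’s visible text and flags files below a threshold. The 0.3 threshold, the file names and the sample contents are all assumptions for illustration. It would also flag the title page or table of contents, so treat the result as a list of candidates to review, not as files to delete blindly.

```python
import re

# Hangul syllables plus the Jamo and compatibility Jamo blocks.
HANGUL = re.compile(r"[\uac00-\ud7a3\u1100-\u11ff\u3130-\u318f]")
TAG = re.compile(r"<[^>]+>")  # crude tag stripper; HTML entities are ignored


def hangul_ratio(html: str) -> float:
    """Share of the non-whitespace visible characters that are Hangul."""
    text = re.sub(r"\s+", "", TAG.sub("", html))
    if not text:
        return 0.0
    return len(HANGUL.findall(text)) / len(text)


def deletion_candidates(files: dict[str, str], threshold: float = 0.3) -> list[str]:
    """Names of files whose Hangul share falls below the (assumed) threshold."""
    return [name for name, html in files.items() if hangul_ratio(html) < threshold]


# Made-up file names and contents standing in for two chapters' worth of files.
files = {
    "story01.xhtml": "<p>옛날 옛적에 한 마을이 있었습니다.</p>",
    "story01_en.xhtml": (
        "<p>Once upon a time there was a village.</p><p>마을: village</p>"
    ),
}
print(deletion_candidates(files))
```

Using a ratio instead of a simple "contains Hangul" check matters here because the vocabulary lists contain a few Korean words as well.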
Just to potentially state the obvious, unless I’m misunderstanding the question: on the edit lesson page, instead of deleting paragraph by paragraph, you can click on “regenerate lesson” on the left-hand side, which lets you edit all the text at once. So in this case you can select all of the “clusters” of English paragraphs, delete them, and then save.