Improve book reading experience: keep html ebook formatting and increase max lesson size


I use LingQ mainly for reading books in foreign languages. For this purpose LingQ is clearly better than an e-reader (such as a Kindle device), because the whole system of LingQing simply does not exist on e-readers. So for language learning, LingQ beats plain reading.

However, there are two ways in which an e-reader is still superior to LingQ:

  • the formatting of the content

  • the navigation in the book (between different chapters and so on)

When it comes to the first point (formatting), LingQ simply has none. The text is just separated into sentences and written in one style. There are no headlines, paragraphs or specially emphasized words. Every sentence starts a new line and all words look exactly the same.

On an e-reader, by comparison, the e-book is formatted with underlying HTML. There are headlines, paragraphs and specially emphasized words. Besides being more visually pleasing, this formatting helps you understand the content better because you are visually guided: if a new paragraph starts, it is probably because a new train of thought starts; if a word is emphasized, the word is probably important; and if there is a headline, you know it is a headline. In the current LingQ you can only identify a headline by inferring it from the fact that a single word or group of words makes no sense in the flow of the text. You then have to open the Kindle app, search for those words in the respective e-book and confirm that it actually is a headline.

Here is an example of the difference in formatting:

However, in thinking about this I noticed a problem. There is a tradeoff between

  • the function of SRT files (audio and text synchronization, i.e. time stamps)
  • the function of the various e-book file formats such as EPUB and MOBI (HTML formatting)

The file formats that allow for HTML (EPUB, MOBI, etc.) don’t allow for time stamps.

The file formats that allow for time stamps (SRT) don’t allow for HTML.

I personally always synchronize the Audible audio and the Kindle text with one another (in a process independent of LingQ) so that I can upload the result as an SRT file. But an SRT file of course doesn’t contain any HTML formatting, so I would have to choose between

  • either uploading the synchronized audio as MP3 together with the SRT file, but without the HTML formatting,
  • or uploading the e-book as EPUB, MOBI etc., but without the synced MP3.

That is of course unfortunate, since you can then only have correct formatting (EPUB, MOBI, etc.) or synchronized audio (SRT), but not both!

To tackle this problem, I thought perhaps one could upload the e-book file (EPUB, MOBI etc.) with the correct HTML formatting. LingQ then divides the file into its paragraphs (the ones you see when you edit the lesson). These paragraphs should be exportable as a CSV file or something similar, with every cell being one paragraph. The user should then be able to upload the respective time stamps (which would normally be in the SRT file) without the SRT text content (because that has no HTML), and the time stamps then become attached to the respective paragraphs. Of course this is only one option that came to my mind for getting around the choice between SRT and HTML.
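To make the idea a bit more concrete, here is a minimal Python sketch of the last step. It is only an illustration under my own assumptions, not how LingQ works internally: the function names `parse_srt_timings` and `attach_timings` are made up, and I assume the SRT file contains exactly one cue per paragraph, in the same order as the paragraphs. The SRT cue text is ignored; only the time stamps are used.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def parse_srt_timings(srt_text):
    """Return a list of (start, end) pairs in seconds, ignoring the cue text."""
    timings = []
    for match in TIMING.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
        start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
        end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
        timings.append((start, end))
    return timings


def attach_timings(paragraphs, srt_text):
    """Pair each HTML paragraph with the time stamps of the cue at the same position.

    Assumption (hypothetical workflow): one SRT cue per paragraph, same order.
    """
    timings = parse_srt_timings(srt_text)
    if len(timings) != len(paragraphs):
        raise ValueError(
            f"{len(paragraphs)} paragraphs but {len(timings)} SRT cues"
        )
    return [
        {"html": html, "start": start, "end": end}
        for html, (start, end) in zip(paragraphs, timings)
    ]
```

The paragraphs keep their HTML (headlines, emphasis and so on), while the audio position comes purely from the time stamps. If the counts do not match, the sketch refuses to guess and raises an error instead.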

The second point concerns the maximum size of lessons. Currently the size of lessons on LingQ is restricted. There may be many practical reasons for this, for example limiting how much a user can use the AI-based TTS (text-to-speech) function that the LingQ team so kindly integrated (and pays for!). However, for reading books on LingQ this restricted lesson size is just nerve-wracking. If you upload a whole e-book, it gets cut arbitrarily into multiple lessons (based on the lesson length, I suppose) that are not at all consistent with the content.

Of course, as I outlined, I understand the practical reasons for restricting lesson size, but I think there might be a middle ground: when importing an e-book, the table of contents could be used to cut the whole book into lessons that correspond to the chapters of the e-book. Every lesson then equals a chapter. That way:

  • you don’t have the chaos of arbitrarily cut lessons that have nothing to do with the content in them,
  • but you also don’t have massive lessons the length of a whole book.
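Roughly, the chapter-based splitting could look like the following Python sketch. Again, this is only an illustration under my own assumptions: the function `split_into_lessons`, its `max_words` parameter and the input shape (a list of chapter titles with their paragraphs, already extracted from the e-book in table-of-contents order) are all hypothetical. A chapter becomes one lesson; only if a word limit is set and a chapter exceeds it is the chapter split further, and then only at paragraph boundaries.

```python
def split_into_lessons(chapters, max_words=None):
    """Turn chapters into lessons, one lesson per chapter where possible.

    chapters:  list of (title, paragraphs) tuples in table-of-contents order,
               where paragraphs is a list of strings.
    max_words: optional word limit per lesson; None means no limit.
    """
    lessons = []
    for title, paragraphs in chapters:
        parts = []            # paragraph groups for this chapter
        current, count = [], 0
        for para in paragraphs:
            n = len(para.split())
            # Start a new part only at a paragraph boundary, and never
            # leave a part empty (a single huge paragraph stays whole).
            if max_words is not None and current and count + n > max_words:
                parts.append(current)
                current, count = [], 0
            current.append(para)
            count += n
        if current:
            parts.append(current)
        for i, body in enumerate(parts, start=1):
            name = title if len(parts) == 1 else f"{title} ({i}/{len(parts)})"
            lessons.append({"title": name, "paragraphs": body})
    return lessons
```

Called without `max_words`, every chapter is exactly one lesson. With a limit, a long chapter turns into parts such as "Chapter 2 (1/2)" and "Chapter 2 (2/2)", so the cuts still follow the structure of the book.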

Another option would be to allow “unlimited” lesson size. If the user wishes to use the TTS service, there could be an option such as “divide lesson into smaller lessons” that does the processing that is currently done anyway, after which the user can manually add TTS audio, just as is possible now.

So in conclusion:

Giving the user the ability to maintain the formatting of e-books and allowing larger lessons (perhaps divided by table of contents) would hugely improve the reading experience for users who primarily read books. And I believe that there are many people who predominantly read books on LingQ. All the people I know who use LingQ mainly import their books and read them there.

  • By importing books you can read content you truly enjoy for its own sake and learn the language on the side (I believe people on average like books more than articles, but I could be wrong).

  • By importing books you are set for two or three weeks, or however long it takes you to finish the book, and you don’t have to search for a new article that only mildly interests you every 15 minutes (the time it takes to finish an article).

This is of course only my own opinion, and no offense to LingQ or to people who read articles on the platform. The LingQ system is brilliant. I just wanted to make the case that a lot of people enjoy books on the platform, and so a lot of people would benefit from an improved book reading experience. Perhaps even more people would join LingQ if reading books there involved no tradeoff compared to, for example, the Kindle app, while adding the benefit of being able to LingQ (and this way learn the language far more systematically and effectively).

Thank you for taking the time to read this!

A third point that just came to mind is to also keep the pictures of the e-book.


I did not read the full post yet, but this problem is not a new one. All I had to do was convert the PDF into EPUB or vice versa.
Formatting issue thread
By the way, you can manually edit the lesson’s text, and in that case the word limit will increase.


You can kind of add the pictures, as I did in this lesson, but it comes out a bit strange. Like this: