I’ve been using LingQ for some time now, and I’ve gotten to the point where I am listening to audiobooks while reading. One would think the process would be simple: import the book chapter by chapter, and attach the audio file.
Only, it’s not. LingQ’s site is just painfully unreliable and slow. It can take 10+ minutes to create a chapter with about 25 minutes of audio. There’s no transcription here. It’s just text attached to an audio file. The only thing it has to do is break the text into words.
I’d written a script to do some of this offline, but rummaging through the text of the book to find the appropriate text isn’t so easily automatable, so I’ve been trying to do it manually.
I think the crux of the problem is performance. It’s painfully slow, and it has a high failure rate (e.g. just red error text saying something failed), meaning you have to start the process all over again.
I doubt the limitations on this will be solved soon, so why waste the time (it’s been this way ever since I joined in 2017)? Read in LingQ while listening to audible (or whatever your audiobook is on).
I don’t think it’s worth getting the audio onto lingQ. Just listen to audible or podcasts or whatever it is and read on lingQ at the same time, and just manually add in the time on LingQ. It’s so much simpler that way.
What limitation? It’s just a text file. I literally need nothing more than a split of the text.
If you are learning a European language, maybe it is easy to just listen to an audiobook, but when you are learning Japanese it’s a whole other level of difficulty: you have to know the kanji, and kanji can have multiple pronunciations and pitch accents. It’s much better to hear it while trying to read it, and have your dictionary handy to cross-reference.
For me it usually works well if I either import the transcript as an srt file or let LingQ do the transcription, after uploading the audio file for one chapter. Recently I have favoured the latter option. Works well for podcasts and YouTube videos, too.
But if it’s a plain text file, I suspect LingQ doesn’t accurately generate the timestamps. At least it didn’t last time I tried.
With LingQ, you learn the art of not wasting time! Just do what works if you want to keep your sanity for a long time. Or bugs will haunt you for the rest of your life. You’ve been advised!
Transcription makes a lot of mistakes, unfortunately, plus I max it out pretty much instantly. I wrote a Python script to do this offline (it would take half a day, but worked). This process did seem to work, but the problem is it’s highly inaccurate, so I’ve been trying to attach the original text from the book instead. I haven’t yet tried automating extracting the text from the epub and associating it with the audio file(s), just because the format of the epub often doesn’t match the way the chapters of the audiobook are labeled in any consistent way (even differing within the same series from the same author). So I’d been trying to do it by hand with the web interface on LingQ. I’m close to giving up on that and making a bunch of files and doing it via a script. But it’s definitely more work.
I don’t need timestamps, I can follow along just fine without them. It doesn’t always generate them anyway (it appears to be an asynchronous job, so you can still read a lesson while it’s running). The only thing it really needs to do on import here is break the text up into words so I can cross-reference the words and phrases with my dictionary. Unlike many other languages, Japanese does not have spaces between words, and the definition of a word in some cases gets fuzzy.
I don’t understand why it’s failing. It shouldn’t be failing because of timestamps, it shouldn’t be failing because of transcription.
The limitation that you are describing in your post. If I’m not misunderstanding, you are splitting the book up chapter by chapter, rather than just importing the whole thing and allowing LingQ to split as it sees fit. You are then doing something to split up your audio file (it’s not clear the source of the file, but regardless). It sounds like you are hoping LingQ would have a solution to all of this, although it’s not clear what you are asking for exactly. All of these sound like “limitations” to what you are attempting to do.
So my point is simply that, LingQ has not done anything to solve any of the limitations that I think you are implying in the 8 years I’ve been using them. It’s not likely that they will change any of the current behavior. So my suggestion is simply to import the e-book as is. Read in LingQ doing all the look ups you normally would using LingQ, while listening to the audio separately from LingQ (via audible, or mp3 player, or whatever your source is). You can pause the audio as needed if you need to look something up.
You’d be spending more time learning than fiddling with individual chapter splitting which sounds like it is taking up a good portion of time.
The last book I imported actually did split up by chapter automatically surprisingly (although longer chapters were split up additionally). I haven’t tried another since, so I don’t know if I got “lucky”, but in the end it doesn’t matter all that much in my opinion.
You are completely misunderstanding. I’m simply trying to create a lesson with a fairly long audio file (20-40 minutes), attached to the actual text of the book (rather than a transcribed one, since those are highly inaccurate). To be clear, I am not trying to attach the entire book or entire audio file. It’s already the audio file for just that chapter, and the text for just that chapter. It’s no different than generating a lesson. At all. Except I have a lot of them.
In these cases, I’m simply trying to regenerate a lesson by attaching the actual text from the book rather than the AI-transcribed text. And yet it fails at about a 75% rate. Maybe it’s regenerating lessons that’s totally busted, I don’t know.
Hmm, the transcription accuracy does vary by language. It’s pretty accurate for Portuguese, which I mainly use it for. I haven’t managed to reach the LingQ limit myself yet, as it takes me quite long to work through each lesson, and I only import one or two at a time.
It sounds like your issue with importing a lesson may be specific to Japanese. Have you tried first importing only the text, then editing the new lesson and adding the audio to it?
I buy the book, use Calibre software and import the text file as one file.
I manually skip the blurb pages, chapter headings and other bits if I haven’t remembered to delete them.
I don’t care how it organises chapters.
That worked for Greek, Irish and French.
Then I listen using Lingq’s text to speech.
A pretty specific but 5 minute process with a few minutes of Lingq processing it.
Not ideal, but an attitude of getting on with it can help.
I accept for some languages, there may be other complications.
I can only speak for myself, but uploading books chapter by chapter is all I’ve done in LingQ for more than a year and a half now, rarely less than three times a day, and after more than 1500 imports I can count on my hands the number of times I’ve had a problem. As long as the audio is under 90 minutes or so, I upload the first chapter with the full audio, and when the chapter is done I snip off the first part and go on like this until the book is over, uploading the cut part to the finished lesson and the remainder to the next one, usually both at the same time. I have never regenerated a lesson. Are you uploading the audio and then adding the text? Try doing the opposite.
Have you tried importing the text first and then adding the audio afterwards? Then regenerating? (It seems based on your description that you are importing the audio first, but I may be mistaken.)
To update, regenerating fails at a very high rate. If I create an entirely new lesson, then create, then delete the old one, it seems to mostly work. It seems that the regenerate lesson button does not work reliably.
Huh. I’ve uploaded several books into LingQ and didn’t have these problems, especially a failure.
I do have a gripe that LingQ should either “not” break my content into chapters, or should recognize the chapters which are already programmed into a book file, because it will break up a long text into what appears to be several lessons and in the “wrong spot”. For example one lesson it created was just one sentence and the rest of the chapter was in another lesson. I have never gotten this issue when uploading .srt files.
What I ended up doing is just converting any epub file into a .txt file and uploading it to LingQ as a .txt.
I always manually add my books to LingQ one chapter at a time.
I paste the full text of the chapter into the lesson and add the audio. Once both are there, I click the “Save and Generate Lesson” button. If the chapter is too long, the program automatically splits it and copies a few sentences from the end of the first part to the beginning of the second to maintain continuity. It also automatically pre-fills all the translation entries for sentence mode with better translations than are created when generated on the fly.
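For anyone scripting this offline instead, the split-with-overlap behaviour described above can be roughly sketched as below. This is not LingQ’s actual implementation: the function name, the sentence-count limit and the naive sentence splitter are all assumptions made for illustration.

```python
import re

def split_with_overlap(text, max_sentences=3, overlap=1):
    """Split text into parts of at most max_sentences sentences.

    Each part after the first repeats the last `overlap` sentences of the
    previous part. Hypothetical helper, not LingQ's real algorithm.
    """
    if not 0 <= overlap < max_sentences:
        raise ValueError("overlap must be >= 0 and smaller than max_sentences")
    # Naive sentence split: break after ., !, ? or the Japanese full stop.
    sentences = [s for s in re.split(r"(?<=[.!?。])\s*", text) if s]
    parts = []
    start = 0
    while start < len(sentences):
        end = min(start + max_sentences, len(sentences))
        # Joined with a space, which suits European languages;
        # for Japanese you would join with an empty string instead.
        parts.append(" ".join(sentences[start:end]))
        if end == len(sentences):
            break
        # Step back so the next part starts with the overlapping sentences.
        start = end - overlap
    return parts

print(split_with_overlap("A. B. C. D. E."))
```

A real splitter would limit by characters or words rather than by sentence count, but the overlap logic would stay the same.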
Yea it is a pain, I find the audio portion of importing books the biggest pain too.
Just to share the problems and fixes I still have to do (very old issues that are still around) using Calibre for French ebooks before importing :-
Get rid of the weird characters used for spaces and replace them with a normal space.
Get rid of the spaces around French quotation marks (« and »). So remove the space after « and before », to prevent » ending up alone on its own sentence page.
Some books have line breaks in the middle of a sentence for layout purposes. I have to get rid of these line breaks (\n) so that one sentence doesn’t get split up into multiple sentences in LingQ.
Check that the TOC (table of contents) is correct. I think LingQ uses the TOC to split chapters into lessons.
If I use Audible audio, I have to make sure the start and end of each audio file match the book chapters. Usually I have to split the book chapter in two to match the audio. Sometimes I have to split the audio. This splitting task is very time-consuming, especially for a language learner who is still learning the sounds and words.
Once book is finally inside LingQ.. the last step is usually audio sync issues. Rooster lesson editor helps a lot to fix time sync. I use sentence mode, so perfect audio sync is important for me, especially for a language that has silent ending for some words. I might mistakenly think a word has a silent ending if the audio cuts off too early.
Items 1-3 can be fixed in about 5-10 minutes using search and replace. I do these fixes before importing. I feel LingQ should just fix these issues on the importer end so new users don’t have to deal with them and figure it out. It seems like it can be automated.
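As a rough illustration of how items 1-3 could be automated before importing, here is a minimal Python sketch. The function name, the list of odd space characters and the rule for detecting a mid-sentence line break are assumptions, not something LingQ or Calibre provides, so adjust them to your own books.

```python
import re

# Space-like characters that often appear in ebook text (assumed list):
# no-break space, narrow no-break space, thin space, figure space, hair space.
ODD_SPACES = "\u00a0\u202f\u2009\u2007\u200a"

def clean_french_text(text):
    """Hypothetical pre-import cleanup for French ebook text."""
    # 1. Replace unusual space characters with a normal space.
    text = re.sub(f"[{ODD_SPACES}]", " ", text)
    # 2. Remove the space after « and before ».
    text = re.sub(r"« +", "«", text)
    text = re.sub(r" +»", "»", text)
    # 3. Join a line break that looks mid-sentence: the previous line does
    #    not end with closing punctuation and the next line starts with a
    #    lowercase letter. This is a heuristic and will miss some cases.
    text = re.sub(r"(?<![.!?…»:\n])\n(?=[a-zà-ÿ])", " ", text)
    return text

print(clean_french_text("«\u00a0Bonjour\u202f»"))
```

Run it on the exported .txt and skim the result before importing, since the line-break heuristic can also join lines that were meant to stay separate (for example verse).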
I do sometimes encounter an error when importing: a red error message, something about DRM. But it works the second time I import the very same file. No idea why it happens.
Anyway, the audio fixes (items 6 and 7) take up most of my time. It takes up so much time that I have been procrastinating on importing my next book with audio, but ever since LingQ added AI-generated audio, I have just been using that for books where I don’t already have the audiobook. I still prefer native audio though.
Your issues seem to be Japanese-related. Hope you figure out the import issues or workarounds. I wouldn’t expect a fix from LingQ anytime soon.
Hsingh, I think you might be better off copying and pasting the text of each chapter separately, rather than importing a file. That way you know exactly what will be in the chapter.
Where do you get the audio from? With Epubor’s Audible utility, it will separate the file into separate MP3 files that match the chapters. I use Audacity to remove extraneous stuff from the beginning and the end and to slow down the audio by about 10% sometimes.
Once both the audio and the text are in the web page, LingQ will create the lesson in a fairly straightforward way. It does for Spanish, but I don’t know about other languages.
I tried uploading files, but I never liked how they came out, and adding the audio after the lesson was created didn’t work as well, especially if LingQ had to split a chapter.