Update - New Whisper Automatic Transcription, DeepL integration and more

Have you ever had that great podcast or audio content that you wanted to learn from but were stuck because you had no transcript to import into LingQ? Well, happily, those days are gone!

Thanks to help from some of our members on the forum, we are happy to announce that Whisper AI has been integrated into our Import page, so a lesson can now be generated for any audio file you might have. Give it a try! You will find an option to Generate Transcript at the bottom of the main text area when importing.

And, stay tuned as we look at what else we can do with this powerful and continuously improving technology.
We have also been busy with some other significant updates including:

  • Importing of text only with the import extension. For those times when you want to import a section of a page only. Just highlight the text you want and import.
  • Sentence Review on all platforms. The new Sentence Review is available on all platforms, with a setting that lets you see all of the sentence's vocabulary in a list on the page.
  • DeepL has replaced Google Translate for sentence and default translations. DeepL just seems to be better overall, so we have made this switch.
  • Known Words targets per level adjusted to better account for differences between languages. The targets were too uniform before and did not take those differences into account. We want the targets to be not too easy and not too hard. We hope they are now just right!
  • Loading of long lessons in chunks. Longer lesson performance has been an issue for power users especially. Performance has been significantly improved here which means we no longer need to split up longer imports. You will notice that lessons created using the Transcript Generation feature no longer have this lesson length restriction or auto-splitting into parts. This will be coming for all imports soon.
  • Transliteration for additional languages. Languages written in non-Roman scripts now have an option to see Transliteration. Should help newbies get a toehold!
  • Swahili has been added as our first African language.

In addition, the usual assortment of bug fixes on all platforms. We hope you enjoy some of these updates and, as always, look forward to your feedback.

87 Likes

Thanks @mark! What’s the .mp3 file size limit for uploads? It may be useful to specify it somewhere on the page for other users.
EDIT: Also, if you are considering removing the auto-splitting of imports into 2,000-word lessons, I would recommend changing the ‘% New Words’ metric to take the length of the lesson into account. Otherwise longer lessons will have an inflated metric compared to shorter ones. As an example, for me right now, the entire book The Adventures of Pinocchio has 36% New Words, but each 2,000-word lesson has much less, maybe ~17% on average, so it’s actually doable. Consider Unique New Words / total words * 100. For this book, that metric would give 5.66%, which means I would encounter a New Word only about once every 18 words, which I know is doable.
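To make the difference concrete, here is a rough Python sketch of the two calculations. It assumes the current figure is the share of a text's unique words that are new, and that a text is just a list of words compared against a set of known words; the function names and the toy sentence are made up for illustration and are not how LingQ actually computes anything.

```python
def percent_new_of_unique(words, known):
    """Share of the text's unique words that are new (not in `known`)."""
    unique = set(words)
    if not unique:
        return 0.0
    return 100 * len(unique - known) / len(unique)


def percent_new_of_total(words, known):
    """Unique new words divided by the total word count of the text."""
    if not words:
        return 0.0
    return 100 * len(set(words) - known) / len(words)


# Toy data (hypothetical): 9 words in total, 6 unique, 3 of them unknown.
words = "the cat sat on the mat the cat ran".split()
known = {"the", "cat", "on"}

print(percent_new_of_unique(words, known))           # share of unique words
print(round(percent_new_of_total(words, known), 2))  # share of all words read
```

The second number is lower because repeated words count toward the total, which is why it tracks how often you actually hit a New Word while reading.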

4 Likes

Sounds really exciting!! Thanks to everyone who made this possible.

Cannot wait to be able to import books without having each chapter split into many different parts. This cannot come fast enough. Keep us posted when it becomes available.

6 Likes

Wow! Great progress!

  1. DeepL >> Google Translate indeed. However, GPT >> DeepL now!
  2. The transcriber works great! Since it adds sensible punctuation, it would be cool if it could be used for YouTube videos whose transcripts have no punctuation and are an absolute pain to go through.
  3. The long form lesson feature is a very welcome addition. A suggestion would be to now offer a “Split Lesson” option in the Edit Lesson mode to allow manual, user partitioning (to track the chapter structure for instance).
9 Likes

This is awesome, thanks everyone!

3 Likes

What model size is used by the automatic Whisper integration?

3 Likes

Wowww!!! Whisper added to LingQ. This is amazing!!! Thank you!! :heart_eyes:

5 Likes

Awesome.

Just tried highlighted text and it works great.

I have a concern about the longer lessons, though. When I use the iPhone, the app often loses track of where I was when relistening to a lesson with the synced text. I listen to the audio and re-read the white pages to focus on yellow words. Longer lessons would mean a mess trying to match the audio with the text. It would be better to have the option to split. IMHO.

2 Likes

Awesome news Mark! These are all fabulous, especially removing the size restriction of lessons and the Whisper transcription for podcasts!!! Loving it!

7 Likes

Now we’re doing some cool coding, awesome feature upgrade.

5 Likes

Can you use Whisper with the iOS app importer? For example, when importing from a podcast app?

3 Likes

Great updates! Thank you to you and the team!

4 Likes

No, not at the moment. We have some plans to try and integrate podcasts in some way. Stay tuned!

7 Likes

We’ll see if we can enable something like that. An option to split after x words.
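As a rough idea of what "split after x words" could mean, here is a minimal Python sketch. It assumes a lesson is plain text split on whitespace; `split_after_words` and `max_words` are hypothetical names, and a real version would probably want to break at sentence or chapter boundaries instead of mid-sentence.

```python
def split_after_words(text, max_words):
    """Split text into consecutive parts of at most `max_words` words each."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


parts = split_after_words("one two three four five", 2)
print(parts)
```

Note that joining on single spaces drops the original line breaks, so this only illustrates the counting, not the formatting.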

4 Likes

Whisper has a limit of 90 minutes. I don’t see why the percentage should be affected by the length of the text. The metric shows the percentage of unique words in the given text that are new for you.

2 Likes

way to go! what an update :slight_smile: :slight_smile:

2 Likes

Thank you! Can you help me – how would I upload a podcast from Apple Podcasts?

1 Like

Great. Also happy to see DeepL replace Google Translate.

6 Likes

At the moment you would have to download the audio file and then upload it to LingQ. I’m not sure if that’s possible from Apple Podcasts. I think you would have to do it from the podcast's website, if they make downloads available.

1 Like

For #2 - just use one of the various YouTube-to-MP3 converters to create an audio file and upload that instead of doing the normal import from YouTube. Then use the Generate Transcript function.

This one appears to work:

4 Likes