I love importing YouTube videos into LingQ for my language learning. However, there are two major issues with this approach:
One. There is no way to play just the audio of a YouTube video (for listening-only practice). You are stuck with either watching the video or using LingQ’s built-in text-to-speech, which is great for beginners, but at intermediate level and above you should really be listening to actual speakers.
Two. LingQ pulls the YouTube closed captions and, at least on my computer, the transcript breaks sentences up into strange blocks, which more or less defeats the sentence-mode study method. I also notice quite a few errors in the YouTube captions.
I’ve found a solution that works very well for my needs. It’s a small Python script that, one, downloads the YouTube audio as an .mp3 file and, two (this is the important one), uses a speech-recognition model to transcribe the Italian audio (very accurately) into legible, complete Italian sentences. The tool is called faster-whisper. It is a free reimplementation of OpenAI’s Whisper model, which is multilingual rather than Italian-specific, so the same approach should work for other languages too. Across the roughly six long-form videos I’ve transcribed, I estimate it’s about 98% accurate. Even on very rapid content with music and other distractions in the background, it’s impressively spot on.
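For anyone curious what a script like this looks like, here is a minimal sketch of the general idea, not the exact script. It assumes the yt-dlp package for the download (with ffmpeg installed for the MP3 conversion) and the faster-whisper package for the transcription. The function names, the "small" model size, and the CPU/int8 settings are illustrative assumptions, so adjust them to your machine.

```python
from pathlib import Path


def build_download_options(out_dir: str) -> dict:
    """yt-dlp options: grab the best audio stream and convert it to MP3."""
    return {
        "format": "bestaudio/best",
        # Name the file after the video ID, e.g. <out_dir>/<id>.mp3 after conversion.
        "outtmpl": str(Path(out_dir) / "%(id)s.%(ext)s"),
        "postprocessors": [
            {
                "key": "FFmpegExtractAudio",  # requires ffmpeg on the system
                "preferredcodec": "mp3",
                "preferredquality": "192",
            }
        ],
    }


def download_audio(url: str, out_dir: str = ".") -> Path:
    """Download the audio track of a video and return the path to the MP3."""
    import yt_dlp  # third-party: pip install yt-dlp

    with yt_dlp.YoutubeDL(build_download_options(out_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
    return Path(out_dir) / f"{info['id']}.mp3"


def transcribe(mp3_path: Path, language: str = "it", model_size: str = "small") -> list[str]:
    """Transcribe an audio file and return the text of each segment."""
    from faster_whisper import WhisperModel  # third-party: pip install faster-whisper

    # Assumed settings: a small model on CPU; larger models are slower but more accurate.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(str(mp3_path), language=language)
    # `segments` is a lazy generator, so the transcription runs while we iterate.
    return [segment.text.strip() for segment in segments]
```

Calling `transcribe(download_audio(url))` with a pasted video URL would then give you the .mp3 on disk and the transcript as a list of text segments. Passing `language="it"` skips automatic language detection; change it for other target languages.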
My process is this:
- Find a video I’m interested in
- Copy the URL
- Paste it into my program
- Wait for the program to convert the video into an audio track and transcribe the audio into a .txt file
- Add the lesson to LingQ as a custom lesson by importing the .txt file and then uploading the .mp3
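One detail about the .txt step: Whisper-style transcribers return text in timed segments, and a segment can start or end mid-sentence. If that happens, a small post-processing step can regroup the text into one sentence per line before you import it. The sketch below is an assumption about how this could be done, not necessarily what my script does. The function names are made up, and the split rule is deliberately naive (it breaks after ., !, ? or …), so abbreviations like "Sig. Rossi" would be split incorrectly.

```python
import re
from pathlib import Path


def segments_to_sentences(segment_texts: list[str]) -> list[str]:
    """Join transcript segments and re-split them at sentence-ending punctuation."""
    # Glue the segments together and collapse any repeated whitespace.
    joined = re.sub(r"\s+", " ", " ".join(segment_texts)).strip()
    if not joined:
        return []
    # Split on whitespace that follows ., !, ? or an ellipsis character.
    return re.split(r"(?<=[.!?…])\s+", joined)


def write_lesson_text(sentences: list[str], txt_path: str) -> None:
    """Write one sentence per line, UTF-8 encoded, ready to import as a lesson."""
    Path(txt_path).write_text("\n".join(sentences) + "\n", encoding="utf-8")
```

For example, the segments `["Ciao, come", " stai? Io sto", "bene."]` come out as the two lines "Ciao, come stai?" and "Io sto bene.".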
I’ve been using it now for the past 2 days and I’m really happy with the results.
If there’s any interest in the community, I’m happy to share my Python code. Full disclosure: I am NOT a programmer, but I do have some programming experience. I was basically able to get Microsoft Copilot to write the entire script for me after several passes.
Happy language learning.