Audio and Transcript Sync issues?

Hi there! Is anyone else having audio issues where the audio is miscued from the transcript? For instance, I'll be watching a recorded lecture, and the speaker will be three sentences ahead of (or in some instances, behind) the underlying text, making it very difficult to follow along. It's also irregular: some videos have the problem, others don't. Some videos correct themselves when I come back to them, while others that never had the issue suddenly start doing it.

Any help is GREATLY appreciated; this is 95% of what I use the app for: reading and following along while listening.


That sounds like an issue with lesson timestamps. You can report issues like that and our content team will look into it and have it fixed.


This happens when I import my own mp3 files. LingQ does a great job of providing a transcription, but the timestamps of the transcription don't match the audio. Since I have to set all the timestamps manually, it simply isn't worthwhile importing audio.

Any update on this? I am finding that the imported text has not been divided correctly, so it doesn't sync with the audio. This is particularly problematic when there is a question-and-answer dialog between two characters in the video. Everything is correctly synced when I watch the YT video with the subtitles on, but not in the imported lesson in LingQ. Thx.


Someone else suggested clicking "generate timestamps", waiting for the new timestamps to be generated, closing the browser completely, opening a new browser session, and opening LingQ again; after that, the timestamps are more accurate.
I’ve found that works - somewhat - but I still need to set timestamps manually for sections that have background noise or music.
I've started using recordings of news radio, so this approach works fairly well for me. I've given up on TV shows and movies. I've wasted far too many hours trying to import them, and when I finally get them into LingQ, the timestamps aren't even close because of all the background noise and music.
It seems strange that LingQ can do a good job of transcribing speech but can't synchronize the transcription with the audio. Maybe two separate software apps are being used, one to create the transcription and one to create the timestamps, and the two apps don't talk to each other. The transcription software can filter out background noise, but the timestamp software doesn't do a good job of that.

You're right that it's two different systems, and it's not an easy problem to solve automatically without machine learning. Transcription is the process of taking an audio file and returning both text and timestamps. Forced alignment takes existing text plus the audio and guesses what the timestamps are. At this stage Whisper gives more accurate results, but it may be possible to implement a better forced-alignment solution in the future.
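To make the distinction concrete, here is a toy sketch of the forced-alignment idea: given a trusted transcript and a rough ASR pass (hypothetical, made-up word timings below), we can map the transcript's words onto the recognizer's timestamps. This is not how LingQ or Whisper actually do it; real forced aligners (e.g. the Montreal Forced Aligner) match text to the audio acoustically, whereas this sketch only matches text to text with `difflib`.

```python
from difflib import SequenceMatcher

def force_align(transcript_words, asr_words, asr_times):
    """Align a trusted transcript against a rough ASR pass.

    transcript_words: the correct text, tokenized into words.
    asr_words: words the recognizer heard (may contain errors).
    asr_times: (start, end) seconds for each ASR word.
    Returns (word, start, end) tuples; words the ASR missed
    or misheard get (word, None, None).
    """
    aligned = [(w, None, None) for w in transcript_words]
    matcher = SequenceMatcher(a=[w.lower() for w in transcript_words],
                              b=[w.lower() for w in asr_words])
    # Copy timestamps across every run of words the two texts share.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            i, j = block.a + k, block.b + k
            aligned[i] = (transcript_words[i],) + asr_times[j]
    return aligned

# The ASR misheard "LingQ" as "link", but the surrounding
# words still receive timestamps.
transcript = ["I", "use", "LingQ", "every", "day"]
asr        = ["I", "use", "link", "every", "day"]
times      = [(0.0, 0.2), (0.2, 0.5), (0.5, 0.9), (0.9, 1.2), (1.2, 1.5)]
for word, start, end in force_align(transcript, asr, times):
    print(word, start, end)
```

The gap in the thread's complaints shows up here too: whenever the recognizer's pass is badly wrong (background noise, music), there are no matching runs to copy timestamps from, and those sections have to be timed by hand.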
