In the original timestamps this word finishes at approximately 46 seconds. LingQ ends the sentence at the previous line’s end timestamp. I assume this is because LingQ is using segment information rather than word timestamp information, and has come up with a solution that doesn’t work very well.
At this point LingQ is only useful as a dictionary service. Who would want their files to be destroyed like this?
This solution assumes you want to keep formatting the subtitles to make them look nicer for learners. My preference would be that you allow us to import without your attempts at formatting.
I actually can’t understand any of your explanation here. The videos don’t play and the image is unclear to me.
Are you saying the timestamps shown are for those bits of text? And those come from YouTube? Obviously, those aren’t formatted into sentences, so using them causes us problems. We are attempting to format the text into proper sentences and paragraphs instead of just showing segments of text as they appear in YouTube.
This part was obvious, and I’m not concerned about that. I am concerned about the poor redistribution of the timestamps. If you are going to pick and choose words from different timestamps, then you need to re-approximate what the timestamp was. You can’t just use the end time of the previous YT subtitle and say the job is done. This approach is not valid for all languages using YouTube autogen.
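To make the re-approximation idea concrete, here is a rough sketch of one way it could be done. It is not how LingQ or YouTube actually work: the `Segment` type, both function names, and the even split of time across words are all assumptions for illustration. It also splits on whitespace, so a language written without spaces, such as Chinese, would need a per-character split instead.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical caption segment: times in seconds plus its text."""
    start: float
    end: float
    text: str


def estimate_word_times(segments):
    """Spread each segment's time span evenly across its words.

    Returns a list of (word, start, end) tuples. The even split is an
    assumption; real speech is not evenly paced.
    """
    words = []
    for seg in segments:
        tokens = seg.text.split()
        if not tokens:
            continue
        step = (seg.end - seg.start) / len(tokens)
        for i, tok in enumerate(tokens):
            words.append((tok, seg.start + i * step, seg.start + (i + 1) * step))
    return words


def sentence_times(words, sentence_lengths):
    """Regroup words into sentences of the given word counts.

    Each sentence takes its start from its first word and its end from
    its last word, instead of reusing a segment boundary.
    """
    out = []
    pos = 0
    for n in sentence_lengths:
        chunk = words[pos:pos + n]
        pos += n
        if not chunk:
            continue
        text = " ".join(w for w, _, _ in chunk)
        out.append((text, chunk[0][1], chunk[-1][2]))
    return out


# Made-up data: two 4-second segments, regrouped into a 6-word and a 2-word sentence.
segments = [Segment(40.0, 44.0, "a b c d"), Segment(44.0, 48.0, "e f g h")]
sentences = sentence_times(estimate_word_times(segments), [6, 2])
```

With this made-up input the first sentence ends at an estimated 46.0 seconds, partway through the second segment, rather than snapping back to the 44.0 boundary of the previous segment. Where real per-word timestamps are available, they would replace the even split entirely.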
Thinking how this impacts LingQ…
Sentence mode is completely broken. Useless. Listening mode is out of sync. Mostly useless…
If I wasn’t so invested I would have left already.
One option is for us to simply not touch the formatting, and just show the chunks of text as YT provides them. We get many complaints from users about unformatted text, but perhaps with text from YT we shouldn’t touch it. We use AI to try and improve it, but it really can’t do it properly without the audio, which can’t be downloaded. We can offer AI reformatting as an option which people can try, although I guess we would inevitably be asked for a way to undo this at least some of the time.
My inclination is to leave it as is. That is what YT provides so that is what you get.
My question still stands: Maybe you can help me understand why I might want a video that has proper Chinese and English subtitles reformatted into sentences (that don’t match, or even line up with, what is being said in the video)?