Tell LingQ's importer to use Youtube's subtitles, not its own audio recognition

cspotcode · July 9, 2024, 2:33am

How do I tell LingQ’s YouTube importer to import Youtube’s auto-transcribed subtitles instead of transcribing in LingQ?

I imported a video into LingQ using the browser extension. The imported lesson starts like this:

Youtube’s auto-generated subtitles are more accurate and more useful, they start like this: (among other differences)

Note the kanji used for 皆さん as opposed to LingQ’s hiragana みなさん.

The next line starts like this on Youtube:

But this in LingQ. Note that LingQ missed the word えっと:

The problem is, despite better transcription being readily available – and presumably cheaper for LingQ to harvest instead of generating itself – I’m stuck with the inferior LingQ transcription.

Am I understanding this problem correctly? Is there a way for LingQ to import Youtube’s transcribed subtitles instead of generating its own?

roosterburton · July 9, 2024, 12:54pm

You cant change the importing preference on the official LingQ import tool. It will always attempt transcription if the video has autogen or no subtitles.

Your free options are to use a tool like Youtube DLP/browser techniques to access the subtitle files or to petition the LingQ staff to add a checkbox.

For the superior experience, there is always Rooster Tools

Obsttorte · July 9, 2024, 3:41pm

You can also use Youtube Playlist Downloader, a freeware tool that allows you to download videos from Youtube, including whole playlists, as well as the subtitles. You have to specify the language and the program will download the subtitles and store them in a seperate srt-file, including timestamps. You can also let the program automatically convert the downloaded video into an audio file if you want to listen to it while underway.
Example srt-file excerpt

1
00:00:00,000 --> 00:00:05,030
皆さん、こんにちは！お久しぶりです。元気ですか？

2
00:00:05,030 --> 00:00:09,280
今日は選挙の話をしようと思います。

3
00:00:09,280 --> 00:00:15,140
2024年は世界的に選挙の年と言われていますね。

4
00:00:15,140 --> 00:00:21,440
まさに今、日本でも東京都知事選挙が行われています。

5
00:00:21,440 --> 00:00:26,320
「知事」っていうのは、県とか府とか

6
00:00:26,320 --> 00:00:29,470
都、東京都のリーダーのことです。

7
00:00:29,470 --> 00:00:31,290
知事。リーダーですね。

8
00:00:31,290 --> 00:00:34,750
で、この東京都のリーダーを決めるための、

cspotcode · July 9, 2024, 4:25pm

I can download subtitles files, but can I tell LingQ to use the timestamps from those files? So that I can click a line of dialogue and the player jumps to that position?

How does Rooster Tools implement this?

I’m wary of bad word splitting and transliteration in Rooster Tools because I’m not sure what tool it uses to generate those. I also use a variety of extensions in Edge (Chromium) so I’m concerned that Rooster Tools will be buggy or support will lag behind Firefox.

Obsttorte · July 9, 2024, 5:41pm

Under import, select the first option. Via add file in the center you can select the srt file. Then add the respective Youtube link under video and create the lesson.

cspotcode · July 9, 2024, 8:33pm

@Obsttorte thank you! This worked. I noticed you have the “Optimize with AI” features disabled. I have them enabled to get better word splitting, but it means I have to wait a couple minutes between importing and being able to use a lesson.

Do you recommend disabling “optimize word splitting using AI”?

Obsttorte · July 10, 2024, 4:34am

It was just for demonstrational purposes. I deleted the lesson afterwards.

I mainly study Korean and only have started the Mini Stories in Japanese to be able to see the Kanji/Hanja in context and for a bit of diversion. Therefore I cannot say a whole lot about the quality of the word splitting algorithm. Considering how “well” it deals with the pronounciation I wouldn’t completely rely on it though either way.

cspotcode · July 10, 2024, 4:42pm

Thanks! I tried disabling AI, and LingQ showed a popup recommending I re-enable it on my next import. So that answer’s that question: I’ll keep it enabled. IIRC, when I first got LingQ, I saw noticeably better results with the AI splitting.

cspotcode · July 10, 2024, 4:48pm

YouTube auto-generated subs vs LingQ transcription

After experimenting more with the SRT subs, I hit some issues. I now understand the choice between LingQ and YouTube transcription is more nuanced.

YouTube transcription has sub-word-accurate timestamps. That is to say, a single word may appear in 3 parts synced to the speaker saying those 3 parts of the word. It might split these parts across multiple lines, because in this style of subtitling, the newlines are not important: words appear as soon as they’re spoken.

Youtube also does not generate commas and periods, LingQ does.

I used downsub.com to convert the YouTube subs to an SRT for importing. This SRT splits the YouTube subs by line, meaning a single word gets split across 2 lines, and complete sentences are not kept on one line. Each line may have the end of one sentence and the beginning of the next.

So the process of importing YouTube’s auto-generated subtitles loses a lot of the timing information from YouTube and doesn’t break up the text on sentence boundaries.

Now that I understand this, LingQ’s default transcription is better than I initially gave it credit for.

Obsttorte · July 10, 2024, 6:47pm

You don’t necessarely have to keep the timestamps. You could remove those and the linebreaks. Although doing this manually is probably a bit cumbersome.

cspotcode · July 10, 2024, 7:44pm

True. In my case, I very much want to keep the timestamps so I can repeatedly click “play” and practice listening to each sentence spoken by a native speaker. But I want those timestamps to align with the start of each sentence.

Plus if I removed linebreaks, there’d be no punctuation to split up sentences, it’d be everything on one long line in LingQ. I’m pretty sure LingQ’s importer doesn’t detect sentences and generate periods between them when importing an SRT.

LingQ’s whisper transcription at least detects sentence boundaries and generates timestamps on those boundaries.

Obsttorte · July 11, 2024, 5:49am

There are points or question marks at the end of the sentences, at least in the subtitles I’ve used as an example. One possible approach would be to merge those lines that form one sentence, so

8
00:00:31,290 --> 00:00:34,750
で、この東京都のリーダーを決めるための、

9
00:00:34,750 --> 00:00:38,040
東京都の知事を決めるための選挙が

10
00:00:38,040 --> 00:00:40,320
今行われています。

becomes

8
00:00:31,290 --> 00:00:40,320
で、この東京都のリーダーを決めるための、東京都の知事を決めるための選挙が今行われています。

for example.