Which subtitles do you prefer to import: the automatic YouTube captions, or do you import the audio file and have the text generated by Whisper? Both have clear advantages.
The YouTube captions are pure chaos, without any punctuation or capitalization, and they butcher all the proper names of people and places. On the other hand, the spelling of the actual words seems reliable and correct. The Whisper-generated text looks lovely and orderly, correctly punctuated and capitalized, but the spelling is a godforsaken mess. I barely dare to add any words to my known word count, since I don’t trust that these are actually the correct words. I haven’t tried other languages, and maybe it is better there, but in Czech, Whisper’s inability to distinguish s from z and ch from h is causing some trouble.
I prefer YouTube captions, but before going through the lesson, I copy the text and feed it to ChatGPT with the following prompt:
Here are some automatically generated captions for a video. Correct the line breaks, punctuation and clear mistakes made by recognition software. But don’t make too many changes, so that the result still matches the audio.
Don’t know if it is the best prompt ever, but so far I’m happy with the result. It gives some mismatches, but more than half of them come from ChatGPT actually fixing mistakes in the speech, which is probably even a good thing. Only a few are truly misinterpreted words. And I’m using the Rooster extension, which makes copying and then patching the text super easy.
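If you want to see exactly which words ChatGPT changed, as opposed to only punctuation and capitalization, a rough sketch like the one below could help. It is just an illustration using Python's standard `difflib`; the function name `word_changes` is made up for this example, and it is not part of the workflow described above.

```python
import difflib
import re


def word_changes(original: str, corrected: str):
    """List (old, new) word spans that differ, ignoring case and punctuation."""
    # \w+ is Unicode-aware for str, so Czech letters with diacritics are kept.
    a = re.findall(r"\w+", original.lower())
    b = re.findall(r"\w+", corrected.lower())
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty string on one side means a pure insertion or deletion.
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


if __name__ == "__main__":
    raw = "hello my name is peter i live in prague"
    fixed = "Hello, my name is Peter. I live in Prag."
    for old, new in word_changes(raw, fixed):
        print(f"{old!r} -> {new!r}")
```

Anything it prints is a real word difference, so those are the spots worth checking against the audio.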
Which version of the Whisper model do you use?
There are tiny … large, and also the large-v2 and large-v3 models.
I think YouTube is not even close to the big large models that are specialized for this.
The largest text I’ve given to ChatGPT so far was ~24k characters (a ~30 min video), though I may have been using the Premium version back then. I stopped paying for it a while ago, but I don’t remember when. But I’ve definitely used the free version for ~20 min videos. It’s just that once in a while you need to click the “Continue generating” button.
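For longer videos, one option is to split the captions into smaller pieces and paste them one at a time after the prompt. Here is a minimal sketch; `split_into_chunks` is a hypothetical helper, and the default of 8000 characters is an arbitrary assumption, not a known ChatGPT limit.

```python
from typing import List


def split_into_chunks(text: str, max_chars: int = 8000) -> List[str]:
    """Split text at whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars still ends up in its own,
    oversized chunk, since words are never cut in the middle.
    """
    chunks: List[str] = []
    current: List[str] = []
    length = 0
    for word in text.split():
        # Account for the joining space when the chunk already has words.
        extra = len(word) + (1 if current else 0)
        if current and length + extra > max_chars:
            chunks.append(" ".join(current))
            current, length = [word], len(word)
        else:
            current.append(word)
            length += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Note that line breaks in the input are collapsed into single spaces, which should be fine here because the prompt asks for the line breaks to be corrected anyway.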