I’m not sure if I’m going about this the wrong way, but I’ve spent quite a lot of time searching for Croatian YouTube content with Croatian subtitles and I’ve had almost no success.
Is this just a characteristic of certain languages or are there easier ways to discover suitable videos to import in the target language?
Short of downloading the video and converting it to MP3, do the videos need closed captions in the target language before they can be imported directly from YouTube?
Well, good luck. YouTube does not auto-generate subtitles for Croatian. But you can use the filters. For instance, search for “sretan božić svakome”, then activate the Subtitles and Creative Commons filters, and you get four entries. One of them is in English and another one in Italian, even though YouTube displays the titles in Croatian. You are left with:
Božićni dar / The Christmas gift (Official video)
and
Priče iz Davnine by Ivana Brlić Mažuranić | Croatian audiobook | Literature for Eyes and Ears
Both are things I had already found and added. Since I am not on Premium, they had to be Creative Commons or Public Domain so that I could share them.
I think I downloaded the videos and the subtitles so that I could extract the audio, and then uploaded everything manually. Definitely so for Priče iz Davnine, as I cut the video into separate chapters so that I could upload them one by one.
Naravno, Priče iz Davnine znači riječi iz davnine (of course, Tales of Long Ago means words from long ago). So you have to deal with a lot of words you don’t know. This way you can’t really choose what you import.
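For anyone who wants to do the same chapter splitting, here is a rough sketch of how it could be scripted. It assumes ffmpeg is installed and on your PATH. The file name and the chapter titles and times are made up for illustration, so you would replace them with the real ones. It only builds the commands; `run_all` actually executes them.

```python
import subprocess


def chapter_commands(source, chapters):
    """Build one ffmpeg command per chapter.

    `chapters` is a list of (title, start_seconds, end_seconds) tuples.
    Each command extracts audio only (-vn drops the video stream) and
    lets ffmpeg pick the MP3 encoder from the .mp3 file extension.
    """
    commands = []
    for index, (title, start, end) in enumerate(chapters, start=1):
        out_path = f"{index:02d}_{title}.mp3"
        commands.append([
            "ffmpeg",
            "-i", source,
            "-ss", str(start),        # where the chapter starts
            "-t", str(end - start),   # how long the chapter lasts
            "-vn",
            out_path,
        ])
    return commands


def run_all(commands):
    """Run the commands one after another; stop on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Hypothetical example: file name, titles and times are placeholders.
example = chapter_commands(
    "audiobook.mp4",
    [("intro", 0, 90), ("chapter1", 90, 600)],
)
```

Calling `run_all(example)` would then write `01_intro.mp3` and `02_chapter1.mp3` into the current folder.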
The alternative is to use Whisper yourself to transcribe the audio. However, I still had to check these subtitles, since they may be out of sync or missing a sentence, with a wrong word here or there. That’s fine if you don’t share these things. However, if you want …
If you transcribe a song and you already have the lyrics, then you only need the timestamps, but that is also a little work.
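In case it helps, this is roughly what running Whisper yourself can look like with the open-source `openai-whisper` Python package (which also needs ffmpeg). The model size "small" and the file name are just assumptions. The `find_gaps` helper is my own idea for spotting places where a sentence may have been skipped; it is not part of Whisper, and the 2-second threshold is arbitrary.

```python
def transcribe(path, model_name="small", language="hr"):
    """Transcribe an audio file and return Whisper's list of segments.

    Each segment is a dict with (among others) "start", "end" and "text".
    The import is inside the function so the helper below can be used
    even where Whisper is not installed.
    """
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path, language=language)
    return result["segments"]


def find_gaps(segments, min_gap=2.0):
    """Return (end_of_previous, start_of_next) pairs for long silences.

    A long pause between two segments is not proof of a missing
    sentence, only a hint about where to listen again.
    """
    gaps = []
    for previous, following in zip(segments, segments[1:]):
        if following["start"] - previous["end"] >= min_gap:
            gaps.append((previous["end"], following["start"]))
    return gaps


# Hypothetical usage (file name is a placeholder):
# segments = transcribe("chapter01.mp3")
# for start, end in find_gaps(segments):
#     print(f"check the audio between {start:.1f}s and {end:.1f}s")
```

This does not replace proofreading. Wrong words still have to be caught by reading along while listening.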
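For the song case, a small sketch of turning lyrics plus hand-noted start times into an SRT subtitle file. I am assuming each line stays on screen until the next one starts, and that you note the start times yourself while listening. The lyric lines and times below are placeholders, not from a real song.

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def lyrics_to_srt(lines, starts, song_end):
    """Pair each lyric line with its start time and build SRT text.

    Every line ends when the next one begins; the last line ends at
    `song_end`. All times are in seconds.
    """
    if len(lines) != len(starts):
        raise ValueError("need exactly one start time per lyric line")
    ends = list(starts[1:]) + [song_end]
    blocks = []
    for number, (text, start, end) in enumerate(zip(lines, starts, ends), start=1):
        blocks.append(f"{number}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)


# Placeholder lyrics and times, for illustration only.
srt_text = lyrics_to_srt(["first line", "second line"], [0.0, 4.2], 8.0)
```

You can then save `srt_text` to a `.srt` file and upload it together with the audio.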