Automatic transcript for Croatian uses the Cyrillic script

Hi everyone,

as stated in the title, I just created my first Croatian lesson from an audio file, using the automatic transcript feature. The text created uses the Cyrillic script. It should be in the Latin script, though. How can I fix this?

Greetings

How annoying! Once again I find myself debugging LingQ instead of spending my time on the languages I set out to learn!


LingQ doesn’t expose any configuration options for the Whisper API.
To work around it, you would need to transcribe the file somewhere else. If the output flip-flops between Latin and Cyrillic for Croatian, you will also need a model that accepts a prompt. GPT is probably the way to go, but it can be pricey.

That’s unfortunate. I installed openai-whisper on my machine, and by specifying the language I got a perfectly fine transcription written in the Latin script in less than 10 minutes. But LingQ can’t pass the language option? What a shame. 🤷
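
For anyone who wants to try the same local workaround, here is a minimal sketch using the openai-whisper Python package. The function name `transcribe_croatian`, the `WHISPER_OPTIONS` dict, and the choice of the `small` model are illustrative assumptions, not what LingQ runs. Pinning `language="hr"` skips Whisper's automatic language detection; it is not guaranteed to force the Latin script, although it did in the case described above.

```python
# Decoding options handed to Whisper. Pinning the language to Croatian ("hr")
# skips automatic language detection. This is an assumption-based sketch, not
# LingQ's actual configuration.
WHISPER_OPTIONS = {"language": "hr", "task": "transcribe"}


def transcribe_croatian(audio_path, model_name="small"):
    """Transcribe an audio file locally with the language pinned to Croatian."""
    # Imported lazily so the module loads even where Whisper is not installed.
    # Requires `pip install openai-whisper` and ffmpeg on the PATH.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **WHISPER_OPTIONS)
    return result["text"].strip()
```

Calling `transcribe_croatian("lesson.mp3")` (the file name is hypothetical) returns the transcript as a single string, which can then be pasted into a lesson by hand.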

Thanks, we will look into this.

Perhaps the algorithm misidentified the language as Serbian? Croatian and Serbian are actually just two variants of the same language.

It does the same for Serbian. Most of the time it transcribes to Latin script, but some texts end up in Cyrillic for some reason.

Thanks, we will have it fixed.


Cyrillic for some reason?? It is certainly possible to write Serbian using the Latin script, but the original script for it is Cyrillic!

@DJTembo True, but for audio transcription we default to the Latin script, since most users are learning Serbian in Latin.

I read both scripts so I wouldn’t really care if

  1. Whisper chose the script consistently
  2. LingQ supported script switching. So far the Latin and Cyrillic spellings exist as separate words and inflate my vocabulary even further.
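
Script switching in the Cyrillic-to-Latin direction is mechanical for Serbian, because each Cyrillic letter maps to exactly one Latin letter or digraph. The sketch below is a standalone illustration, not a LingQ feature; the names `CYR_TO_LAT` and `to_latin` are made up for the example.

```python
# Serbian Cyrillic to Latin (Gaj's alphabet), lowercase letters only.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def to_latin(text):
    """Transliterate Serbian Cyrillic to Latin, leaving other characters alone."""
    out = []
    for ch in text:
        latin = CYR_TO_LAT.get(ch.lower())
        if latin is None:
            out.append(ch)  # punctuation, digits, text already in Latin
        elif ch.isupper():
            out.append(latin.capitalize())  # "Љ" becomes "Lj", not "LJ"
        else:
            out.append(latin)
    return "".join(out)
```

For example, `to_latin("Љубљана")` gives `"Ljubljana"`. Words written entirely in capitals would come out with title-cased digraphs, which this sketch does not handle. The reverse direction is harder, since a Latin "nj" or "lj" does not always correspond to a single Cyrillic letter.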

Is there any effort put into fixing this issue? I just got part of an audio file transcribed with the Latin alphabet, and part of the same file with the Cyrillic alphabet.
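
A transcript that switches alphabets partway through can be spotted automatically before it is imported. This is a rough sketch using only the Python standard library; the function names `script_counts` and `classify_script` are hypothetical and have nothing to do with LingQ's own tooling.

```python
import unicodedata


def script_counts(text):
    """Count the Latin and Cyrillic letters in a piece of text."""
    counts = {"latin": 0, "cyrillic": 0}
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and whitespace
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            counts["cyrillic"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
    return counts


def classify_script(text):
    """Return 'latin', 'cyrillic', 'mixed', or 'unknown' for a transcript."""
    counts = script_counts(text)
    if counts["latin"] and counts["cyrillic"]:
        return "mixed"
    if counts["cyrillic"]:
        return "cyrillic"
    if counts["latin"]:
        return "latin"
    return "unknown"
```

Running `classify_script` on each paragraph of a transcript, rather than on the whole text, would show where the switch happens.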

The language of the course I’m talking about is set to Croatian, not to Serbian. Why are we talking about Serbian? The two languages might be very similar, but Croatian is not written with the Cyrillic alphabet!

@helmuth1980 We are working on this and we will have it fixed soon.

I am no longer experiencing this issue. Thanks a lot for fixing this! I do appreciate it.
