PSA: disable all Japanese transliteration, use text-to-speech

As far as I can tell, LingQ’s transliteration should be completely disabled: it’s often wrong, and the text-to-speech engine can only pronounce words correctly when it’s off. Trust the voice, not LingQ’s annotations.

If you leave transliteration enabled, the voice engine is fed the (often incorrect) transliteration, so you learn the wrong sound for the kanji, and potentially the wrong pitch accent or emphasis as well.

Here is an example:

The transliteration is incorrect: it says “gai”, but the proper pronunciation is “mochi”.

In the sidebar I have intentionally disabled transliteration to force the voice engine to speak the original kanji.

Pressing S says “mochi” (correct), not “gai” (incorrect).

Why does this happen?

When you press S to hear a word spoken, LingQ does one of three things, depending on your sidebar’s transliteration setting:

  • if you have hiragana transliteration enabled, it voices the chosen transliterated hiragana, so the voice engine never sees the kanji and cannot pick the correct pronunciation
  • if you have romaji transliteration enabled, it voices the chosen transliterated romaji; the voice engine sees the Roman alphabet and will even pronounce words using English rules, not Japanese, when it thinks a word is English!
  • if you have transliteration disabled, it voices the original text with kanji. This produces the best results, as far as I can tell
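As far as I can tell, the behavior above amounts to a simple dispatch on the sidebar setting. Here is a minimal sketch of that logic; the function and setting names are my own invention for illustration, not LingQ’s actual code:

```python
# Hypothetical sketch of how LingQ appears to choose the TTS input.
# All names are invented; LingQ's real implementation is not public.

def text_for_tts(original: str, transliteration: str, setting: str) -> str:
    """Return the string that gets handed to the text-to-speech engine."""
    if setting == "hiragana":
        # TTS receives kana only: the kanji (and the reading/pitch
        # information it carries) is lost.
        return transliteration
    if setting == "romaji":
        # TTS receives Latin script and may fall back to English
        # pronunciation rules if it mistakes the word for English.
        return transliteration
    # Transliteration disabled: TTS sees the original kanji and can
    # pick the reading and pitch accent itself.
    return original

assert text_for_tts("髪", "かみ", "disabled") == "髪"
assert text_for_tts("髪", "かみ", "hiragana") == "かみ"
```

The point of the sketch: in the first two branches, whatever the transliterator got wrong is passed straight through to the voice engine, which has no way to recover the original word.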

I have already reported the romaji issue here: Japanese voice for words incorrect - #6 by cspotcode

Turns out, issues also exist for the hiragana transliteration.


Additionally, the kanji tells the voice engine which pitch accent to use. When the voice engine speaks hiragana (which it does whenever transliteration is enabled), it gets the pitch accent wrong.

Here is an example. Import this lesson into LingQ:

髪 (かみ) - This word means “hair” or “hairstyle”. It has an odaka pitch accent pattern 【2】: the pitch rises on the second mora and drops on a following particle (かみ【LH】, かみが【LHL】).
神 (かみ) - This word means “god” or “deity”. It has an atamadaka pitch accent pattern 【1】: the pitch is high on the first mora and falls on the second (かみ【HL】).
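The ambiguity can be shown with a toy lookup table: given the kanji, the pitch pattern is determined, but given only the kana, both patterns remain possible, so the voice engine has to guess. This is purely illustrative (the pattern labels are placeholders, and no real TTS works via a table like this):

```python
# Toy illustration (not LingQ or TTS code): kanji disambiguates
# homophones, kana does not. Pattern names are just labels.

LEXICON = {
    "髪": {"reading": "かみ", "meaning": "hair", "pitch": "pattern A"},
    "神": {"reading": "かみ", "meaning": "god",  "pitch": "pattern B"},
}

def possible_pitches(text: str) -> set:
    """All pitch patterns consistent with the given input text."""
    return {entry["pitch"] for kanji, entry in LEXICON.items()
            if kanji == text or entry["reading"] == text}

assert possible_pitches("髪") == {"pattern A"}                # kanji: unambiguous
assert possible_pitches("かみ") == {"pattern A", "pattern B"}  # kana: ambiguous
```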

You can confirm the differences in pitch accent by playing the audio samples from native speakers on Jisho:

In LingQ, disable all transliteration and listen to text-to-speech for the two words. Notice the difference: the voice engine understands the kanji.

Now enable transliteration and re-listen to the text-to-speech. Notice that the two words are now voiced identically (incorrect!), because LingQ is not voicing the real word; it is voicing the hiragana transliteration.


I agree that the transliterations are incorrect, but the problem runs deeper than that.

First, a correction to your example: 街 is pronounced “machi”, not “mochi”, which goes to show that transliterations would be genuinely useful for those not yet accustomed to the sounds of Japanese, if only they worked properly. Beyond that, however, the TTS itself is not always right about how to pronounce a kanji. For example:

The 方 kanji has two pronunciations: “hou” and “kata”. In this context, the correct one is “hou”, but the TTS pronounces it “kata” instead. I haven’t confirmed this, but I suspect lone kanji are always given their kun’yomi reading.
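The suspicion above can be stated as a rule: a lone kanji gets a fixed default reading, regardless of context. A toy model of that suspected behavior (this is a guess about the TTS, not its actual internals):

```python
# Toy model of the suspected behavior: a lone kanji always receives a
# fixed default (kun'yomi) reading, ignoring context. This is a guess
# echoing the post above, not real TTS internals.

READINGS = {"方": {"on": "hou", "kun": "kata"}}

def naive_lone_kanji_reading(kanji: str) -> str:
    """Always fall back to the kun'yomi for a kanji standing alone."""
    return READINGS[kanji]["kun"]

assert naive_lone_kanji_reading("方") == "kata"  # even where context calls for "hou"
```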

There is also the problem of parsing. Unless you do the tedious work of correcting every improper split, you will eventually get something like this:

The correct pronunciation is “shukuei”, but if the kanji are parsed separately, the TTS will pronounce it as “yadoei”, probably because it uses the kunyomi reading of each kanji.
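The parsing effect can be sketched as a lookup problem: when the two kanji are segmented as one word, the compound’s reading is available; when they are segmented separately, only single-character readings can be concatenated. The readings below are taken from the post, and I am assuming the word in question is 宿営; the lookup tables are illustrative, not how any real TTS is implemented:

```python
# Toy sketch of why segmentation changes the reading (not real TTS code).
# Assumes the word from the post is 宿営 ("shukuei").

SINGLE_KANJI = {"宿": "yado", "営": "ei"}   # lone-character readings
COMPOUNDS = {"宿営": "shukuei"}             # whole-word reading

def read_tokens(tokens: list) -> str:
    """Read each token: whole-word lookup first, else per-character."""
    parts = []
    for token in tokens:
        if token in COMPOUNDS:
            parts.append(COMPOUNDS[token])
        else:
            parts.append("".join(SINGLE_KANJI[ch] for ch in token))
    return "".join(parts)

assert read_tokens(["宿営"]) == "shukuei"     # parsed as one word: correct
assert read_tokens(["宿", "営"]) == "yadoei"  # parsed separately: wrong reading
```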

I originally wrote a paragraph about possible situations where the TTS would just be wrong, and I’m editing this because I might have encountered one such case.

Every dictionary shows the pronunciation as being “ninnitaeru”, but the TTS pronounces it “ninnikotaeru”.

In short, the transliteration is not reliable, but neither is the TTS, regardless of the script. Pronunciation could be treated the same way as word meanings, with a LingQ created where necessary, but even that only really works in the second and third cases, since a LingQ cannot differentiate context. The only solution I’ve found for learning how a word is pronounced is more immersion in native material with natural voices.


Thanks for replying.

With the AI-enhanced word splitting, do you often get the kind of parsing errors you describe? I’m still a beginner, so I worry that when LingQ makes mistakes, I’ll be none the wiser.

I’m getting smarter about avoiding lone kanji: for example, if I suspect one is a counter, I highlight it as a phrase together with the preceding digits, hoping to get a more accurate pronunciation.


The re-splitting reduces the parsing issues, but it has the problem of ruining some special characters, mostly quotation marks. Regardless of the parsing, my advice is to rely on natural speech: use lessons that contain real voices and guide yourself by what is actually spoken rather than by the TTS; that is the only reliable way to learn pronunciation. You might also try an extension like 10ten or yomichan, which can help with both the parsing and the pronunciation.
