Cantonese Mini Stories Significant Issue

I’m on Cantonese Mini Story 14, and it’s become increasingly evident that the written characters and the audio do not match as the stories progress. Probably 10-25% of the basic vocabulary in story 13 doesn’t match up, for example. This is a real detriment to the usefulness of learning with the mini-stories, as there is no one-to-one match between the recorded audio, the characters used, and the vocabulary word recordings, which ends up costing a lot of time and trips to dictionaries and other learning tools to sort out. Four years ago, the following post started a relevant thread, but I see no confirmation that all the stories were in fact corrected.

LukeTruman, 4 years ago
Hey, I was just looking at the Cantonese library to see what new content people have been importing and I found the new mini stories: Conéctate - LingQ
Unfortunately the mini stories are written in Standard Written Chinese and the narrator is speaking in Cantonese. This means what is being said and what’s written down are two completely different things (but with the same meaning).
When I was a beginner this confused me a lot, so I really think it would be better to get the stories written in Cantonese, and not Standard Chinese.

Thanks for reporting, we will investigate that.

As I’m only about a quarter of the way through the mini-stories, I can’t say for certain that the issue persists, but thanks for checking so at least we know what the actual status is.

I want to confirm that the text does correspond to Cantonese phonology using traditional characters. However, I want to point out several errors due to missing words.

1) 子怡 喺 餐廳 。 子怡 係 唔 係 喺 屋企 ? 唔 係 , 子怡 唔 喺 屋企 。 佢 喺 餐廳 。 (Missing word in the text)

3) 子怡 唔 識得 俊軒 , 佢 係 一個 朋友 嘅 朋友 。 子怡 係 唔 係 識唔識得 俊軒 ? 唔 係 , 子怡 唔 識得 俊軒 , 但係 佢 係 一個 朋友 嘅 朋友 。

The narrator should have included 係 唔 係 instead of using 識唔識得 because of the answer 唔 係.

8) 俊軒 話 子怡 笑 起嚟 好好 睇 。 俊軒 係 唔 係 話 子怡 笑 起嚟 好好 睇 ? 係 , 俊軒 話 子怡 笑 起嚟 好好 睇 。(Missing word in the text)

Is the report button under the dictionary definition the right way to report the errors within the lesson?


When you click “report” on a dictionary definition, it will be hidden from other users. This is global and not related to a specific text like the mini stories.
Errors in LingQ’s content are best reported to them directly, like done in this thread. Else, you can use the report button (under the three … ) in the course view.

Are you a Cantonese speaker by any chance? It would be great if a native speaker could proofread / listen to these stories, that way learners could have more confidence in the material.

It’s nice to know that we can report the mistakes from the lessons in the course view. I can help as a fluent speaker of Cantonese. Anyway, there isn’t a significant issue with the text that would hinder learners from using it as a reliable resource. The more serious problems lie in the parser LingQ uses for Chinese characters in any dialect and the need for standard definitions for words in context, especially if the character has multiple meanings.

By the way, I have encountered similar mistakes several times in LingQ stories for Korean. I just moved on to other lessons because they did not impact my learning.

Hi, so I’m circling back after a month. I’ve confirmed with a native Cantonese speaker that the text and the recordings simply do not match. This is incredibly frustrating for a new learner. Take lesson 11 for example. There are at least 4 different words where the speaker has apparently decided unilaterally not to follow the text as written. For a new learner, as the mini-stories in Cantonese are riddled with this practice throughout, it’s fairly useless, demotivating, and unprofessional, especially given how much Steve Kaufmann has actively marketed the mini-stories on his YouTube channel as a great way to start out. Can you imagine the reaction if Teach Yourself, Assimil, or Pimsleur sold their materials with text and audio that didn’t match?

Hello, I think we shouldn’t be too harsh with LingQ, these stories have been translated and recorded by volunteers, therefore I think comparisons with professional textbook publishers are not really appropriate. Personally I don’t really like the mini stories, but am nonetheless grateful for anyone going through the trouble of creating content for the libraries.

Generally I have found LingQ to be very forthcoming when it comes to correcting possible errors, unfortunately you haven’t mentioned the exact instances where errors occur, so I’m not sure they’ll be able to fix them easily. The best way is probably to send them a corrected transcript via email. Thanks.

This is where you err, bamboozled - regrettably big time 🙂

LingQ is a language site selling languages. If you’re selling languages, the languages should be correct.

The mini stories should be 100% and not riddled with mistakes.

LingQ relies too much on free content provided and recorded by volunteers with very little - if anything? - provided by professionals.

The mini stories should have been translated by professional translators. There are many like you who do not care for the mini stories and that’s fine - but the solution is not to be grateful for anyone going to the trouble of creating content but to give the mini stories a big overhaul once and for all.

As a native German speaker, “bamboozled”, you can judge for yourself.

LingQ is a language site promoting language learning, yet unfortunately it exposes learners to mistranslations, errors, and at times - albeit infrequently - unnatural language.

Here’s a link to the Update on Hindi 2022 forum where you can read the comments on the mini stories yourself.

The post that got 19 likes relates to the German mini stories suggesting translations suffer from being unnatural.


Hi Maria, I agree, in an ideal world LingQ would take ownership of at least a subset of content in the libraries and make sure this is of the highest possible quality. I also believe this could help LingQ become more popular and leave the niche it currently occupies; here are some of my thoughts on the topic:

For better or worse, LingQ doesn’t consider itself to be in the content creation business. I believe the only thing they pay for is the LingQ podcasts. So the current situation is that basically the entire content side of the business is carried by unpaid volunteers. I personally don’t see this changing any time soon.
Further, while I believe the situation is not ideal, it is fair of LingQ not to charge for library access, i.e. anyone with a free account can access all content. This should prevent anyone from having unreasonable expectations of the content.

Regarding the German mini stories, yes, I’m aware of the situation, I guess the problem originates from using Google Translate, hopefully the next unpaid volunteer will choose DeepL instead.

Coming back to Cantonese, in my opinion the Cantonese mini stories are actually pretty good. Yes, occasionally the transcripts contain extraneous words, some of which have been put in [brackets] etc. I don’t know why this is, but I have to say I hardly notice it. I feel these stories are good enough to go from zero to something, but then (after one or two weeks) you have to look for something else anyway. So, I’m willing to tolerate a certain amount of ambiguity; perfection is elusive.


bamboozled, I agree with you too.

LingQ pays content providers in the form of points, which invariably get returned to LingQ because it’s impossible for small content providers to cash them out, so the work ultimately ends up being provided for free. Having said that, larger content providers, e.g. Veral, evgueny40 et al., get points each month that they can convert to cash. Librarians and lesson editors provide their work for free as they are volunteering and not providing content.

In case you hadn’t realised, non-premium members are not permitted access to premium forum threads so I cannot read what you’ve written in this thread.

I’m glad you’re happy with the Cantonese mini stories 🙂


I made the revision suggestions for the first ten lessons, which still need to be updated. Sometimes, the texts are natural and grammatically correct, even with different word choices and missing or extra words. I saw minimal mistakes, though another native speaker with keen eyes might spot more errors in the text.

Treating LingQ as a platform for integrating resources is better than expecting it to be an excellent content provider. Sometimes seeing the forest and missing one or a few “negligible” trees could be a better choice.


So it is I, as a beginner Cantonese learner paying for the service (for something like 10 years now, I believe, which at past rates comes to thousands of dollars), who should be compiling a list of discrepancies and providing them to LingQ to correct? Exactly how am I supposed to do that when I’m just beginning and most of the language looks and sounds like gibberish at this early stage? Or should I ask my Cantonese friend, who doesn’t even use LingQ, to spend their own hours going through and compiling the list to hand to LingQ? Or is it that I should go out and actually hire someone to do it professionally? This all seems unreasonable to me.

Steve pushes these very mini-stories as a great way to start out with a new language, further marketing the idea that LingQ is the best language learning platform in the world. You really don’t see any inconsistency between these claims and what a beginner, paying language learner can expect when working their way through this material?

It’s not that the Cantonese “transcripts contain some errors”; it’s that there is a mismatch between the audio and the transcripts. The written material should be modified to match the audio. The core idea behind LingQ is reading and listening, including at the same time. So the marketed materials, especially those for beginners, ought to enable one to do that at the very minimum.

@dimguy We’ll do our best to have them improved asap!


Yes, I have confirmed that the audio sync is off, unfortunately automatic timestamping doesn’t work in Cantonese. So someone from LingQ will have to go through them one by one, this is a tedious and time consuming task.

This seems to be more related to the word splitter and not to the mini stories per se. This is a known problem of LingQ in Chinese and Japanese, I don’t think this will ever be solved. Maybe try making phrases.

Can you give an example of this? I just went through the first 25 stories and can’t seem to find this. Sounds like a serious issue.

These stories were created by volunteers, I don’t think there was any involvement from LingQ.

Just going through the first 11 so far:

Mini Story 1/
-Sentence view audio not matching sentences
Mini Story 2/
-Sentence view audio not matching sentences
Mini Story 3/
-Sentence view audio not matching sentences
-“我每日都喺度重复噉樣做着同样慨事情” vs “…嘅事情”?
-grouping issues like “係嘅” requiring trashed word or “裡面” one place then “裡 面” not being grouped etc.
Mini Story 4/
-Sentence view audio not matching sentences
-”佢喺學校係個好學生” audio seems to be “佢學校係個好學生”
-”係呀佢鐘意返學” vs “係 嘅…”
Mini Story 5/
-Sentence view audio not matching sentences
Mini Story 6/
-Sentence view audio not matching sentences
-”佢哋嘅仔亦同佢哋一齊睇電視” vs “佢哋嘅仔亦都同佢哋…”
-script: ”玲玲浸咗個熱水浴” vs audio: “玲玲浸咗個熱水涼”?
-script: ”玲玲睇咗一會書然后瞓著咗” vs audio “玲玲睇咗一陣書…”?
-script: “家俊亦好快睡著咗” vs audio “家俊都好快訓著咗”
-script: “我睇咗一會書然后睡著咗” vs audio “我睇咗一陣書然后瞓著咗”
-script: “家俊亦好快睡著咗” vs audio “家俊都好快訓著咗”
-script:”佢哋係唔係喺夜晚6點食晚飯” vs audio “佢哋係唔係夜晚…”
-script: “佢哋嘅仔亦同佢哋一齊睇電視” vs audio “佢哋嘅仔亦都同佢哋…”
-script: “6) 玲玲浸咗個熱水浴” vs audio “玲玲浸咗個熱水涼”
-script: “…佢浸咗個熱水浴” vs audio “…佢浸咗個熱水涼”
-script: “8) 玲玲睇咗一會書” vs audio “玲玲睇咗一陣書”
Mini Story 7/
-Sentence view audio not matching sentences
-script:詹姆斯唔鍾意一啲會議 vs audio 詹姆斯唔鍾意依啲會議
-script 佢認為一啲會議好無聊 vs audio 佢認為依啲會議。。。
-script 一啲客戶對詹姆斯好友好 vs audio 依啲客戶。。。
-script 但有啲客户并唔友好 vs audio 但有一啲客户並唔友好
-script: “唔係是係一啲客戶對詹姆斯好友好” vs audio “唔係係一啲客戶…”
-script 佢每日盼着五點到來 vs audio 佢每日等住五點到來
-script 我唔鐘意一啲會議 vs audio 我唔鐘意依啲會議
-script 詹姆斯認為一啲會議好無聊 vs audio 我認為依啲會議好無聊
-script 但有啲客户并唔友好 vs audio 但有一啲客户並唔友好
-script 每日下午五點收工 vs audio 每日下走五點收工
-script 我每日盼着五點到來 vs audio 我每日等住五點到來
-script 4)詹姆斯認為一啲會議好無聊 vs audio 4)詹姆斯認為依啲會議好無聊
-script 詹姆斯認為一啲會議好無聊嗎 vs audio 詹姆斯認為依啲會議。。。
-script 係嘅詹姆斯認為一啲會議好無聊 vs audio 係嘅詹姆斯認為依啲會議好無聊
-script 唔係,是係一啲客戶對詹姆斯好友好 vs 唔係,係一啲客戶對詹姆斯好友好
-script 7)詹姆斯每日下午五點收工 vs 7)詹姆斯每日下晝五點收工
-script 佢每日盼著五點到來 vs audio 佢每日等住五點到來
Mini Story 8/
-Sentence view audio not matching sentences
-script “遗憾咁放底鞋離開 X”, X=missing character…
-script ”係呀丽萨想買新鞋” vs audio “係嘅…”
-script “係呀鞋店裡面有好多靓嘅鞋” vs audio “係嘅…”
-script ”係呀丽萨試咗兩隊鞋…” vs audio “係嘅…”
-script “嗰對藍色鞋子太紧啦” vs audio “嗰對藍色嘅鞋太紧啦”
-script “嗰對藍色鞋着得好舒服嗎” vs audio “嗰對藍色嘅鞋…”
-script “嗰對藍色鞋著得唔舒服嗰對鞋太紧了” vs audio “嗰對藍色嘅鞋著得唔舒服嗰對鞋太紧啦”
-script “嗰對黑色鞋着得好舒服” vs audio “嗰對黑色嘅鞋着得好舒服”
-script “嗰對黑色鞋着得好舒服嗎” vs audio “嗰對黑色嘅鞋…”
-script “係呀嗰對黑色鞋着得好舒服” vs audio “係嘅,嗰對黑色嘅鞋着得好舒服”
-script “嗰對黑色鞋450美元“ vs audio “嗰對黑色嘅鞋450美金“
-script 嗰對黑色鞋好貴嗎 vs audio 嗰對黑色嘅…
-script 係呀嗰對黑鞋非常貴嗰對黑鞋要450美元 vs audio 係呀嗰對黑嘅鞋非常貴嗰對黑鞋要450美金
-script 丽萨放低鞋離開咗鞋店 vs audio 丽萨放低咗鞋離開咗鞋舖
Mini Story 9/
-Sentence view some audio not matching sentences
-script: 安迪拿咗一個藍同一輛購物車 vs audio 安迪攞咗一個藍同一架購物車
-script 安迪有一個新女朋友嘅名叫萨拉 vs audio 安迪有一個新女朋友. 個名叫萨拉
-script 係嘅安迪有咗一个個新女朋友, 佢嘅名叫萨拉 vs audio 係嘅安迪有咗一個新女朋友, 佢個名叫萨拉
-script 安迪想為佢嘅女朋友做午餐嗎 vs audio 安迪想?pong?佢嘅女朋友做午餐嗎
-script 係呀安迪到超市去買餸 vs audio 係嘅安迪去咗超市買餸
-script 安迪睇咗睇冰柜裡面嘅魚但係佢冇買都. Should it be 買到?
-script 係呀安迪睇咗睇冰柜裡面嘅魚但係佢冇買都 vs audio 係嘅安迪睇咗睇冰柜裡面嘅魚但係佢冇買到
-script 後來離開超市返咗屋企啦 vs audio 後嚟離開超市返咗屋企啦
-杰克好高興買咗呢個項鏈 vs 杰克好高興買咗呢一個項鏈
Mini Story 10/
-Sentence view some audio not matching sentences
-script 佢要買梅根買一個禮物 vs audio 佢要為梅根買一個禮物
-script 我有啲着急唔知道買乜禮物 vs audio 我啲着急唔知道買乜禮物
-script 係呀杰克要去梅根嘅生日晚會 vs audio 係嘅杰克。。。
-script 係呀梅根鍾意動物佢仲鍾意藍色 vs audio 係嘅梅根。。。
-script 唔係佢見到一個銀項鏈而唔係一個戒指 vs audio 唔係佢睇到一個銀項鏈而唔係一個銀戒指
-script 有一只紫色嘅貓喺項鏈上嗎 vs audio 有一隻紫色嘅貓喺項鏈上嗎
-script 係呀杰克買咗嗰個項鏈因為佢覺得梅根會鐘意嗰個項鏈 vs audio 係嘅杰克。。。
Mini Story 11/
-Sentence view some audio not matching sentences
-阿樂嘅屋企太細喇 vs 樂樂嘅公寓太細喇
-想搬去一個新嘅屋企 vs 想搬去一間新公寓
-佢喺網上睇屋嘅相 vs 佢喺網上睇公寓嘅相
-有個單位好平 vs 一間公寓好平
-但係又舊又乜都冇 vs 但係又舊又空
-另一個單位有家俬 vs 另一間公寓有家俬
-有一張床一部電視同廚櫃 vs 有一張床一部電視同灶枱
-但係嗰個單位有啲貴 vs 但係呢間公寓有啲貴
-阿杰決定儲多啲錢 vs 樂樂決定儲多啲錢
-我嘅屋企太細喇 vs 我嘅公寓太細喇
-我想搬去一個新嘅屋企 vs 我想搬去一間新公寓
-我喺網上睇屋嘅相 vs 我喺網上睇公寓嘅相
-有個單位好平 vs 一間公寓好平
-但係又舊又乜都冇 vs 但係又舊又空
-另一個單位有家俬 vs 另一間公寓有家俬
-有一張床一部電視同廚櫃 vs 有一張床一部電視同灶枱
-但係嗰個單位有啲貴 vs 但係呢間公寓有啲貴
-阿杰嘅屋企好細 vs 樂樂嘅屋企好細
-阿杰嘅屋企大唔大呀 vs 樂樂嘅公寓大嗎?
-唔大佢嘅屋企唔大好細 vs 唔係佢嘅公寓唔係大,佢好細。
-阿杰想搬去新嘅屋企 vs 樂樂想搬去一間新嘅公寓
-阿杰係咪想搬屋呀? vs 樂樂想搬屋嗎?
-係呀,阿杰想搬去新嘅屋企 vs 係嘅,樂樂想搬去一間新嘅公寓
-阿杰喺網上睇屋嘅相 vs 樂樂喺網上睇公寓嘅相
-阿杰係咪喺網上睇相呀 vs 樂樂喺網上睇相嗎?
-係呀阿杰喺網上睇屋嘅相 vs 係嘅, 佢喺網上睇公寓嘅相
-第一個單位好平 vs 第一間公寓平
-第一個單位貴唔貴呀 vs 第一間單公寓貴嗎
-第一個單位唔貴佢好平 vs 唔係,第一間公寓唔貴,佢好平
-第一個單位又舊又乜都冇 vs 第一間公寓又舊又空
-第一個單位舊唔舊呀? vs 第一間公寓舊嗎?
-舊呀第一個單位又舊又乜都冇 vs 係嘅,第一間公寓又舊又空
-另一個單位有家俬有一張床一部電視同廚櫃 vs 另一間公寓有家俬有一張床一部電視同灶枱
-另一個單位有冇家俬呀 vs 另一間公寓有家俬嗎
-有呀,另一個單位有家俬有一張床一部電視同廚櫃 vs 係嘅,另一間公寓有家俬有一張床一部電視同灶枱
-另一個單位比較貴 vs 另一間公寓貴
-另一個單位平唔平呀 vs 另一間公寓平嗎?
-唔平,另一個單位唔平佢比較貴 vs 唔係,另一間公寓唔係平,佢貴。
-阿杰決定儲多啲錢 vs 樂樂決定儲多啲錢
-阿杰係咪決定儲多啲錢呀 vs 樂樂決定儲多啲錢嗎
-係呀,阿杰決定儲多啲錢 vs 係嘅,阿杰決定儲多啲錢
-阿杰出年可以搬屋 vs 樂樂出年可以搬屋
-阿杰今年可唔可以搬屋呀 vs 樂樂今年可以搬屋嗎
-唔可以,阿杰今年唔可以搬屋佢可以出年搬屋 vs 唔係,樂樂今年搬唔到屋,佢可以出年搬屋
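
For anyone who wants to pin down exactly where a line differs, here is a minimal Python sketch using the standard library's `difflib`. It assumes that someone who can follow the audio has typed out what the narrator actually says. `char_diff` is a hypothetical helper name, nothing here is a LingQ feature, and the two sample pairs are copied from the Mini Story 6 and Mini Story 8 notes in the list.

```python
from difflib import SequenceMatcher


def char_diff(script: str, audio: str) -> list[tuple[str, str, str]]:
    """Return (operation, script_part, audio_part) for every span that differs.

    The comparison is character by character, so it does not depend on
    how any word splitter groups the characters.
    """
    matcher = SequenceMatcher(None, script, audio, autojunk=False)
    return [
        (op, script[i1:i2], audio[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


# Sample pairs (script text, hand-typed audio transcription).
pairs = [
    ("玲玲睇咗一會書", "玲玲睇咗一陣書"),
    ("嗰對黑色鞋着得好舒服", "嗰對黑色嘅鞋着得好舒服"),
]

for script, audio in pairs:
    for op, script_part, audio_part in char_diff(script, audio):
        print(f"{script}: {op} script={script_part!r} audio={audio_part!r}")
```

This only flags where the two strings differ. It cannot say which side is correct, so a native speaker still has to decide whether the text or the recording should change.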

The Mini Stories are great and a fantastic part of the product. I love the way they are set up. Nonetheless, I understood them to be part of the product. A beginner with no foundation in the language has no way of discovering issues and is less likely to find alternative reliable material, so they are more likely to rely on the Mini Stories. The Mini Stories would be critical to them as a starting point and should be as accurate as possible, both to ensure the best early learning outcomes and because they reflect on the brand and the perceived quality of LingQ.

“For some reason I can’t find the significant issues with the mini stories.” “I’m certainly no native speaker, but llearner checked the first 10 lessons as well.”
Both you and llearner have posted a number of issues. You also know that there are issues with the sentence audio. You don’t think the issues noted are significant, but to new learners they can pose a significant stumbling block. I find them jarring. As a new learner you have no way of knowing if you are making a mistake or if the mistake is with the lesson. Without other resources, like access to a native speaker, it will be very difficult to find out where the issue lies or, in some cases, what it is.

“So someone from LingQ will have to go through them one by one, this is a tedious and time consuming task.”
It’s a feature with a button that users will generally expect to work. I do this for my own imports. I’m not sure I understand the issue. If volunteers could be conscripted for the task, OK… if not… don’t many jobs at work involve tedious and time-consuming things 😉?

“This seems to be more related to the word splitter and not to the mini stories per se. This is a known problem of LingQ in Chinese and Japanese, I don’t think this will ever be solved.”
This seems to me to be the same issue as the audio sync: a tedious and time-consuming task. I’m doing this for all my own imports. I don’t just accept what the splitter gives me. We can adjust our own imports but, as far as I understand, can’t adjust the Mini Stories, so they need to be as accurate as possible. It generally doesn’t take me long to skim through a story or lesson and adjust the grouping, and when it’s that important for new learners and for the perception of quality, I would think it is a worthwhile project.

“These stories were created by volunteers, I don’t think there was any involvement from LingQ.”
I’m new to LingQ, so it’s quite possible I’m mistaken, but I thought all languages had Mini Stories as an incentive and starting point for new learners. I also thought that, although volunteers were used to produce them, they had a consistent format and structure outlined by LingQ, and (I imagined) that the volunteers were managed and the stories quality-vetted and technically maintained by LingQ as well. If the Mini Stories are no different from any other user-generated material and not part of the product, I’m thinking of this all wrong and apologise for misunderstanding.

“the whole dialogue being different: Can you give an example of this? I just went through the first 25 stories and can’t seem to find this.”
I struggle to see how the lesson 11 audio and text are anything but completely different. The text is not a transcript of the spoken words. The meaning may be the same, but a new learner expects the transcript and the spoken words to be the same.