Word splitting is very inconsistent in Chinese as well.
For example, every combination of [number][classifier][word] shows up as a separate “word” in LingQ, even within the same lesson.
This means that this simple “one person” phrase generates six LingQs:
- 一个人
- 一
- 个
- 人
- 一个
- 个人
Then, changing from “one person” to “two people” gets you another 3 LingQs. Changing that to “three people” gets you yet another 3 LingQs. And so on. The same thing happens with “some people”, “more people”, “fewer people”, etc.
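To put rough numbers on this, here is a quick Python sketch. It assumes LingQ offers every contiguous substring of a phrase as a candidate, which matches the six items listed above but is only my guess at the general behavior. The `substrings` helper is made up for illustration and is not anything from LingQ itself.

```python
def substrings(phrase):
    """Return every contiguous substring of the phrase (assumed candidate set)."""
    n = len(phrase)
    return {phrase[i:j] for i in range(n) for j in range(i + 1, n + 1)}


seen = set()        # every candidate encountered so far
new_counts = {}     # phrase -> number of candidates not seen before

for phrase in ["一个人", "两个人", "三个人"]:
    candidates = substrings(phrase)
    new_counts[phrase] = len(candidates - seen)
    seen |= candidates
    print(phrase, new_counts[phrase])

print("total candidates:", len(seen))
```

Under that assumption, each additional number adds three new candidates (the number alone, number plus classifier, and the full phrase), while 个, 人, and 个人 are shared.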
This leads to a massive explosion of unknown words (and known words) in LingQ that correspond to relatively few real words in the language.
I’ve accepted that this is just something I have to deal with in a language that does not use spaces between words.
That said, if LingQ could recognize even the apparently simple [number][classifier][word] pattern above, it would do a lot of good.
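As a rough idea of what recognizing that pattern could look like, here is a minimal Python sketch using a regular expression. Everything in it is an assumption for illustration: the `NUMERALS` and `CLASSIFIERS` strings are tiny hand-picked samples, `split_measure_phrase` is a hypothetical helper, and none of this reflects how LingQ actually segments text.

```python
import re

# Tiny illustrative samples, not a complete inventory.
NUMERALS = "一二两三四五六七八九十几"
CLASSIFIERS = "个位只条本张"

# One or more numerals, exactly one classifier, then the remaining word.
PATTERN = re.compile(rf"^([{NUMERALS}]+)([{CLASSIFIERS}])(.+)$")


def split_measure_phrase(phrase):
    """Split [number][classifier][word] into its parts, else return the phrase whole."""
    match = PATTERN.match(phrase)
    if match is None:
        return [phrase]
    return list(match.groups())


units = set()
for phrase in ["一个人", "两个人", "三个人"]:
    parts = split_measure_phrase(phrase)
    units.update(parts)
    print(phrase, parts)

print("distinct units:", len(units))
```

With this split, the three phrases share the units 个 and 人, so only the numeral is new each time. A real solution would need a proper dictionary and would still have to deal with ambiguity, for example 个人 also being a word in its own right.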