LingQ thinks there are way too many words in Hebrew thanks to prefixes

I’ve noticed that whenever a Hebrew word has a prefix like ב or ה or ל, LingQ mistakenly thinks it’s a totally new word. This makes the “new word count” and “known word count” completely off. I really like the idea of knowing how many words I know in Hebrew, so it’s disappointing that this doesn’t work, and my numbers get wildly inflated. Is there a way to fix this? Thanks.

Actually it’s the same for other languages too. “Not a bug, but a feature.”

In some languages, words with multiple forms might not be as recognizable as just prefixing the word with a letter. For example, in English, the verb “to be” takes many forms: am, is, are, was, were, being, been,… It’s really not intuitive at all. Rather than automatically marking all forms of a familiar verb as “known,” LingQ allows me to mark some forms as known and others as unknown. So that’s the logic behind it.

As long as you can see that your word count is going up, something good is probably happening. :slight_smile:

Hope that helps.


Thanks for the reply, but you don’t understand: prefixes work differently in Hebrew. It’s not just a different form of the verb (Hebrew has those too, that’s not what I’m talking about). Let me explain:

The Hebrew word for “the” is one of these prefixes: “ה”. That means every time you need to use the word “the,” you have to stick “ה” onto the next word. So nearly every noun in the Hebrew language gets “ה” and these other prefixes stuck on it at some point. That means LingQ counts “the chair,” “the table,” “the person,” and “the sky” all as distinct words from “chair,” “table,” “person,” and “sky.”

The other Hebrew prefixes are the words for “to,” “in,” “from,” and other very, very common words. So when you count these all as separate words, you multiply the Hebrew language, like, five times over. The “new word estimates” are all completely wrong as a result.
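
To make the arithmetic concrete, here is a rough Python sketch. It assumes a counter that treats every distinct string as its own word, and it simply glues four prefixes onto four nouns. The word list and the plain concatenation are my own simplification for illustration, not how LingQ actually works internally.

```python
# Hypothetical mini-vocabulary: chair, table, person, sky.
BASE_NOUNS = ["כיסא", "שולחן", "אדם", "שמיים"]

# One-letter prefixes discussed in this thread: the, in, to, from.
PREFIXES = ["ה", "ב", "ל", "מ"]


def surface_forms(nouns, prefixes):
    """Return every distinct string that a per-string counter would see."""
    forms = set(nouns)  # the bare nouns themselves
    for noun in nouns:
        for prefix in prefixes:
            # Simplification: real Hebrew spelling sometimes merges prefixes.
            forms.add(prefix + noun)
    return forms


forms = surface_forms(BASE_NOUNS, PREFIXES)
print(len(BASE_NOUNS), "base words ->", len(forms), "counted forms")
```

With four prefixes, each noun shows up as five different strings, which is where the “five times over” comes from.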

“Known” words are based on words, not terms. So if you split the prefixed “word” into two words in “Edit Sentence” mode, you should avoid what you’re experiencing.

In “Read Sentence” mode, you can still combine the words as “lingq phrases/terms” so that you can track those terms, and these new lingqs won’t affect your overall “known words” count.

As you can imagine, the side effect of this step is that you’ll spend a lot of time editing sentences. You’ll also find that some (if not all) of the LingQ lessons can’t be modified. I don’t know of any other method to easily correct it. You could also use other tools and reimport, but that could be equally painful and slow.

In Italian, the following are separate word variants:

  • amica (female friend)
  • l’amica (the female friend)
  • dell’amica (of the female friend)
  • all’amica (to the female friend)
  • dall’amica (from the female friend)

This is not an issue. LingQ records word variants, not head words nor word families. The statistics are designed for motivation. You are just meant to see an increasing number over the months and years.

If you wish to know how many word families or head words you know, either go through a frequency list and mark the words manually or use one of the vocabulary estimators.

I’d say this is more a side benefit. The real reason would be because it’s easier to implement from a software perspective.


The purpose of the known words count isn’t to count words but to track your familiarity with the language. The problem with your example is: where would you put the limit? For example, Spanish has reflexive verbs that are conjugations of other verbs. I suppose this applies to every language. At some point you will hit an exception to the rule, where the word you would like to exclude also means something else that by your definition should be included. It’s easier just to count everything and judge progress based on that. That’s why LingQ’s known word targets are higher than they would be if you counted just unconjugated forms.

LingQ classes word forms as “words”. My advice: don’t worry about it. As long as you’re learning, the number of words you’ve accumulated doesn’t matter.




Some people believe LingQ is cheating by inflating the total word count. Those people put too much weight on the number of words learned. The total doesn’t matter, though.

As people said, that’s just how LingQ works. It’s partly because it’s easier to implement as an algorithm.

It’s exactly the same with Arabic, which also attaches articles to words. And in Korean, nouns with attached particles are recognised as separate forms.
It’s even worse in Turkish, which is a highly agglutinative language, so it has an almost infinite number of word forms.

I agree that it may make sense to “merge” some easily recognisable forms into one lemma, but then we would have to decide which word forms are “evident” and which are not. When you get better at the language, you also get better at recognising different forms and intentional misspellings.

The word count by itself means nothing; the more the better. LingQ also takes these considerations into account to some extent when setting the number of known words needed to reach the next level. In a language with many forms, you need more known words to reach the same level.

There are some apps that take a different approach and try to lemmatise word forms. You may like that more. While this reduces the number of clicks and gives a more precise word count, it also has its own disadvantages.

Oh, I didn’t know that about Hebrew. You’re right, what you’re describing is different from verb conjugations.
Something very similar happens in Korean, though, at the ends of verbs. There are so many different things that can be tacked onto the ends of verbs. It’s never-ending. :slight_smile:



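The idea behind LingQ is that people keep reading: the more you read, the more you learn. That’s why the streak depends only on reading. The vocabulary feature lets you practice words you have already read.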

I just thought of an idea. Maybe you can export your vocabulary to Excel, alphabetize it, and eliminate all words starting with the prefixes for “the,” “in,” etc. You’ll be left with a more accurate vocab count, equal to the number of lines left in the table.

It’s still probably not exhaustive because if you’re like most people you probably know a number of words that you haven’t encountered yet on LingQ.
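
If a spreadsheet gets tedious, the same idea can be sketched in Python. This is only a heuristic under my own assumptions: you already have your exported words as a plain list of strings (I’m not assuming any particular export format), and a word is dropped only when removing one leading prefix letter gives another word that is also in your list. That check matters because plenty of Hebrew words naturally start with ה, ב, ל or מ. The function name and sample words are made up for illustration.

```python
# Prefix letters mentioned in this thread (the, in, to, from); add more as needed.
PREFIXES = ("ה", "ב", "ל", "מ")


def collapse_prefixed(words, prefixes=PREFIXES):
    """Drop a word only if stripping one prefix letter leaves another listed word."""
    vocab = {w.strip() for w in words if w.strip()}  # ignore blanks and duplicates
    kept = set()
    for word in vocab:
        stem = word[1:]
        if word[:1] in prefixes and stem in vocab:
            continue  # treated as a prefixed form of a word already counted
        kept.add(word)
    return kept


# Hypothetical sample: chair, the chair, in (a) chair, table, to (a) table, mountain.
sample = ["כיסא", "הכיסא", "בכיסא", "שולחן", "לשולחן", "הר"]
base_words = collapse_prefixed(sample)
print(len(sample), "listed forms ->", len(base_words), "base words")
```

Note that this only strips a single prefix, so stacked prefixes are left alone, and it can still merge two unrelated words by coincidence. Treat the result as a rough estimate.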


I agree with the people who say don’t worry about it! Many languages have something like this, though not quite to the same extent. For example, like Hebrew, Spanish can stick a pronoun onto the end of a word, and that will count as a different word. In French, “of” (d’) is also a common prefix stuck to the beginning of a word, and LingQ will count this as a different word. It’s fine! That’s why LingQ word counts are only relative. If they are getting higher, that means you are using the language. That’s why, according to some people, advanced is really like 70,000 words in LingQ. The new numbers are better at estimating, however.