"Knowns words" statistic with verb conjugations consolidated or contractions

I was wondering if the LingQ team has considered customizing what it means for a word to be a “word”? For example, I feel that my known word count is artificially high due to two reasons (my target language is French): (1) verb conjugations counting as separate words and (2) contractions. But note, these also affect my LingQs in general.

I think the first problem is tricky to solve completely, but a better (albeit not totally accurate) estimate can still be achieved. The major obstacle I see for Romance languages and English (which are the only languages I’m familiar with) is that certain conjugated forms are spelled the same as nouns (for example, in French, “lit” is both the present third-person singular conjugation of “lire” and the word for “bed”). A precondition before adding a word to the statistic should do the trick (“if not a conjugation and not a noun”), but it also assumes there exists a lookup functionality that recognizes conjugations and nouns simultaneously (WordReference already does this, so perhaps their API could be used?). Note, I’m not advocating here to treat conjugations differently within the LingQ (the “yellow words”) system.
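
To make the precondition idea a bit more concrete, here is a rough Python sketch of one possible reading of it. Everything in it is an assumption for illustration: the `READINGS` table is a hand-written toy stand-in for the lookup service, `consolidated_known_count` is a made-up name, and none of it reflects how LingQ or WordReference actually work. An unambiguous conjugation is folded into its infinitive, while an ambiguous form such as “lit” is conservatively counted on its own.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy lookup: surface form -> set of (lemma, part of speech).
# A real system would need a dictionary or morphological analyzer instead.
READINGS: Dict[str, Set[Tuple[str, str]]] = {
    "lire": {("lire", "verb")},
    "lis": {("lire", "verb")},
    "lisons": {("lire", "verb")},
    "lit": {("lire", "verb"), ("lit", "noun")},  # ambiguous: "reads" or "bed"
    "chat": {("chat", "noun")},
}


def consolidated_known_count(known_forms: Iterable[str]) -> int:
    """Estimate the known-word count with unambiguous conjugations merged."""
    entries: Set[str] = set()
    for form in known_forms:
        form = form.lower()
        readings = READINGS.get(form, set())
        if len(readings) == 1:
            lemma, pos = next(iter(readings))
            if pos == "verb":
                # Unambiguous conjugation: count the infinitive only once.
                entries.add(lemma)
                continue
        # Ambiguous, non-verb, or unknown form: count it as its own word.
        entries.add(form)
    return len(entries)


# Four raw "known words" collapse to three: lire, lit, chat.
print(consolidated_known_count(["lis", "lit", "lisons", "chat"]))
```

The estimate is still imperfect (it cannot tell whether I know “lit” as a verb form, as a noun, or both), but it would stop every conjugation of “lire” from inflating the total.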

Concerning the second problem, let me clarify with an example. Given the phrase “d’être”, LingQ parses it as one word. It would be nice if LingQ could semantically parse it and give me the option to only recognize the non-contracted word. This would have to be on a word-by-word basis, because some contractions have a unique meaning apart from the literal translation of the two contracted words read together, such as the French term for okay: “d’accord”. This is probably an impossibly hard problem given the current state of the natural language processing field, but there does exist a simple alternative: let me (the user) highlight sub-portions of a word that contains a contraction (or maybe more generally, of any word). For example, as the system works right now, if I try to highlight the “être” of “d’être”, the system doesn’t respond and keeps the original highlighting of “d’être” as is.
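
For what it’s worth, a purely rule-based approximation might get part of the way there without any real semantic parsing. The Python sketch below is only an illustration under my own assumptions: `KEEP_WHOLE` is a hypothetical, hand-maintained exception list, `split_contraction` is a made-up name, and the pattern only covers the common single-letter elisions plus “qu’” (it ignores forms like “lorsqu’” or “jusqu’”).

```python
import re
from typing import List

# Hypothetical exception list: contractions whose meaning is more than the
# sum of their parts and that should therefore stay a single word.
KEEP_WHOLE = {"d'accord", "aujourd'hui"}

# Common French elided clitics (c', d', j', l', m', n', s', t', qu')
# followed by the word they attach to. Accepts straight or curly apostrophes.
ELISION = re.compile(r"^([cdjlmnst]|qu)['’](.+)$", re.IGNORECASE)


def split_contraction(token: str) -> List[str]:
    """Split an elided token into clitic and word unless it is an exception."""
    normalized = token.lower().replace("’", "'")
    if normalized in KEEP_WHOLE:
        return [normalized]
    match = ELISION.match(normalized)
    if match:
        return [match.group(1) + "'", match.group(2)]
    return [normalized]


print(split_contraction("d’être"))    # clitic and verb separated
print(split_contraction("d’accord"))  # kept whole via the exception list
```

With something like this, “être” could be tracked as its own word while “d’accord” stays intact, and the exception list could be adjusted per user to cover the word-by-word choice.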

Anyway, this post is mainly meant to (a) highlight my major pain points in using LingQ after about a month of use, (b) help the LingQ team in whichever way possible, and (c) start an open discussion; perhaps the LingQ team has already considered these problems and decided on the current functionality for good but non-obvious reasons.


@jhaberstro - Thanks for the feedback. The first issue you report is one that we don’t have any immediate plans to focus on. It is something that is very complicated to resolve, and there is always the lingering question of whether the result of any efforts here will produce something that is definitively better than what we have now, or something that is only better in some ways but worse in others. In the end, many words have different meanings and as a learner becomes more familiar with the language they will develop a better sense for this, understanding better what “lit” means based on the context.

Regarding the second issue, we do allow users to add words directly from the Vocabulary page. Perhaps we can look at other ways to make this more prominent while studying a lesson itself.