Percent New Words is not Coverage

I apologize for bringing up a feature request which surely already exists on this forum somewhere. Oddly enough, I couldn’t find any threads discussing it.

I like that LingQ lets me see sources by % New Words. That’s interesting. It isn’t very useful, but it is interesting. The reason it is not useful is that it varies greatly depending on the length of the resource. This is why a course and a lesson almost never have the same % New number.

A coverage metric would tell us if we could read something or not.

What I would like to see in addition is a % Coverage metric, which tells me what share of the total running words in a text are words I already know. For instance, if I know 8,000 out of 16,000 unique words in a 300,000-word novel, it is 50% New Words, but I don’t know if it is comprehensible to me or not. I’d only know that if I knew what percentage of those 300,000 words is made up of my 8,000 known words. If the 8,000 unknown words are infrequent and only occur three times each, then they are only ((3 × 8,000) / 300,000) × 100, or 8%, of the total book. So I’d have 92% coverage, which would suggest that it would probably be worth trying to read. If the 8,000 unknown words occur 15 times each, then they make up 40% of the book, so I’ve only got 60% coverage for the resource and would probably struggle to read it.
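To make the difference concrete, here is a minimal Python sketch of the two numbers. It assumes the text has already been split into a list of words and that known words are held in a set; the function names are made up for illustration and are not anything LingQ exposes.

```python
from collections import Counter


def new_word_percent(tokens, known):
    """Percent of UNIQUE words in the text that are not known."""
    unique = set(tokens)
    if not unique:
        return 0.0
    return 100.0 * len(unique - known) / len(unique)


def coverage_percent(tokens, known):
    """Percent of RUNNING words (every occurrence counted) that are known."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # Each known word contributes once per occurrence in the text.
    known_tokens = sum(n for word, n in counts.items() if word in known)
    return 100.0 * known_tokens / len(tokens)


if __name__ == "__main__":
    tokens = "the cat sat on the mat the cat ran".split()
    known = {"the", "cat", "on"}
    print(new_word_percent(tokens, known))   # half of the unique words are new
    print(coverage_percent(tokens, known))   # but two thirds of the running words are known
```

The only real difference is the denominator: unique words for % New Words, total occurrences for coverage.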

I’m not suggesting getting rid of the less useful metric that LingQ standardizes on. I’m just saying it’d be a lot more informative if the little stats block for a course also gave a coverage metric.

I presume it would not be that hard to add one more line to the information presented here if the database is already doing fancy enough math to track these numbers. Why is there even a bar for Known Words and LingQs anyway if they’re not percentages? I mean… this data is really meaningless. It means that somewhere between roughly six (7129/125863) and ninety-four ((125863-7439)/125863) percent of the words in the book are familiar to me. In other words, there’s absolutely no way to know if I can read it or not.
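The range quoted above can be worked out with a few lines of Python. This is only a sketch under my own assumptions about what the three numbers mean: 125863 is the total word count of the book, 7129 is the count of unique known words, and 7439 is the count of unique new words. The function name is hypothetical.

```python
def coverage_bounds(total_words, known_unique, new_unique):
    """Return (lower, upper) bounds on percent coverage.

    Lower bound: every known word appears exactly once in the text.
    Upper bound: every new word appears exactly once, and everything
    else in the text counts as familiar.
    """
    lower = 100.0 * known_unique / total_words
    upper = 100.0 * (total_words - new_unique) / total_words
    return lower, upper


if __name__ == "__main__":
    low, high = coverage_bounds(125863, 7129, 7439)
    print(f"coverage is somewhere between {low:.1f}% and {high:.1f}%")
```

Without per-word frequencies, the unique-word counts alone cannot narrow that range any further.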


Interesting. I’d like to be able to monitor my chosen statistics in the heading, not just days. Known over total words would be an improvement.

Thanks for your feedback and suggestion. We appreciate it. I’ll forward this to our team and we’ll see what we can do in the upcoming updates.

It would be really great if in future, LingQ could suggest resources that are just at the right level for you, i.e. the n+1 Steve keeps on talking about. That would really help with learning. Would that be feasible at all?

You’re most likely to get to the n+1 sentences if you know 95% of the words that show up in a given resource. But as LingQ is now, it is impossible to know if that is the case or not just by looking at the percent new value. I think doing the math for coverage on each course or lesson would make that more obvious.
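As a rough sketch of how a coverage number could drive suggestions, the Python below filters a set of lessons down to those at or above a coverage threshold. The 95% default comes from the figure mentioned in this post; the function name, the dictionary of lessons, and the idea that lessons arrive as word lists are all assumptions for illustration, not a description of how LingQ stores anything.

```python
def readable_lessons(lessons, known, threshold=95.0):
    """Return (title, coverage) pairs at or above the threshold, best first.

    lessons: dict mapping a lesson title to its list of words (hypothetical shape)
    known:   set of words the learner already knows
    """
    results = []
    for title, tokens in lessons.items():
        if not tokens:
            continue  # skip empty lessons to avoid dividing by zero
        known_tokens = sum(1 for word in tokens if word in known)
        coverage = 100.0 * known_tokens / len(tokens)
        if coverage >= threshold:
            results.append((title, coverage))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    lessons = {
        "easy": ["a"] * 19 + ["x"],
        "hard": ["a"] * 5 + ["x"] * 5,
    }
    print(readable_lessons(lessons, known={"a"}))
```

Lowering the threshold would surface harder material for anyone who prefers to look up more words as they read.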