Shortly after I started using LingQ, I got into the habit of regularly looking at Steve’s known words in his different languages, to see what kind of milestones he had hit and maybe get an idea of some long-term goals I should have. It looked to me like Steve likes to stop at around 25K-30K known words in his languages (excluding Slavic languages and some others). So I came up with the idea that 30K known words on LingQ = fluency in reading. (My idea of fluency means understanding nearly everything without the help of a dictionary.) So this has always been my dream number! AHHH Yess 30K! The number I will strive for!
Skip ahead many, many years: I’m nearing 30K in German and I feel nowhere near my definition of fluency, so I’m a little sad, not gonna lie. It seems that 50K is more the number that represents fluency to me.
Which makes me curious why Steve and maybe many others stop at around 20K-30K. I feel like it doesn’t bring you to the level of fluency, just short of it. Reaching 30K and then dropping that language to pick up a new one from scratch, right before you make a big breakthrough, is just crazy to me!
What do you guys think of 30K known words? Is it fluent enough for you guys? How does it compare to 40K and 50K in your minds?
Also, it seems that you really don’t need any more than 50K known words if you’re not interested in reading novels.
Do you find you are still seeing a lot of yellow words? Still encountering a lot of blue? Are you forgetting known words? I’m not at 30K, so I can’t say for sure whether you should feel a little more fluent than you do. That’s essentially double where I’m at, though, and I feel like I’ll be pretty well off once I double my known words, but who knows. I think German will require a little more than, say, Spanish or a similar language. Stopping at 25K may be just fine for a language like that. German might be closer to 40K to 50K. I’ll be interested to hear what others say.
I feel similarly. I had the initial goal of reaching Advanced 2 in French (around 32K). I am currently at around 35K and I agree that I need to push my target up to the 40-50K range to really feel comfortable.
I can read most news articles but novels are tougher. I can get the main point of a novel without looking things up but I definitely miss details. Similarly, I can get the main events of TV shows with French subtitles but definitely can’t understand without them. As for conversations with my friend, I am getting better at holding basic conversations but get lost when the topics become more specialized. I’m curious to hear how other people in the 30K range are feeling.
I think Steve was already fluent in like 9 languages when he created LingQ, so he pretty much just dabbles in languages now, with fluency not being the goal. He will even say that he isn’t fluent anymore in some of the languages he learned in the past. I did read in another thread on here that 50K known words in German is close to fluency. Looks to be a good goal to shoot for.
Almost 2 years ago, when I started German here at LingQ, my expectations were the same: 20-30K known words and I’m a fluent listener. I thought such a number would allow me to read and listen to everyday content with no trouble at all, things like Wikipedia, internet forums, news bulletins and movies.
30K didn’t bring it to me. Although I could read and listen, it wasn’t comfortable at all. Going through the news was a pain.
Only now, with 60K, I can just listen to Tagesschau (German news) and enjoy it. But still, today’s 20min Tagesschau news broadcast contains 65 (6%) unknown words! And often I know all the words but still can’t grasp the meaning behind it.
I’m very optimistic about reaching my language goals; the progress that is possible via LingQ never ceases to amaze me. Though I don’t think I’m ever going to be satisfied with the results.
In my experience, German is much harder to read and requires you to know many more grammar rules and words than Spanish or French. Even though I have fewer known words in French, I can understand it far better than any German text. Moreover, LingQ’s translation system doesn’t work well with long German sentences, which is another factor.
Interesting post. During lockdown (which is when I started using LingQ/input), I’ve been obsessing over my known-word count; like you and many others, I made the connection between increasing known-word count and general comprehension. But it’s important to remember that the number we speak of is only the product of input: it is the input itself which determines (or at least is heavily correlated with) your comprehension.
If you take your German stats and my Greek, for example, my known-word count is 10-12K lower than yours; however, my input stats (reading/LingQ’ing/listening) are a lot higher. I’m approaching a million words read and still cannot comprehend news broadcasts; I can just about grasp broad themes/scenes in children’s books. And I should say I also spend a lot of time watching films and listening to podcasts off LingQ. I also attended classes for 3-4 months before I even made a LingQ account (so I hit the ground running).
The known-word count is probably best used to measure monthly progress, provided you don’t change the way you count words. I suggest your target should not be a certain known-word count but daily or weekly targets for exposure, measured in hours. I think 1 million words of reading is a better milestone for learners than X amount of known words, since it measures your input and not the product of it.
Anyway, awesome post. I saw a post I can no longer find about 10K/20K/30K milestones; I wish there were more of them.
I just wanted to chime in about the known word count, especially to any LingQ newbies reading this. When I first started out here, I was improving my Portuguese and reading tons of imported articles. I created LingQs for those vocabulary words I didn’t know and then clicked to the next screen and eventually to the Complete Lesson button.
At that time, I didn’t appreciate that all the proper names (first and last names of people + geographic locations) as well as foreign text (usually English) were being added to my known word count, badly distorting it. After a couple of months, I realized my error but it was too late. My KWC is over 33K now, but the actual number must be several thousand lower. I’ll never know for sure.
When I started a brand new language (Greek), I was extremely careful to Ignore (using the X shortcut key) any proper names or foreign words and now I know my KWC is accurate. If you’re a nerd for statistics, like I am, it’s worth the extra effort to exclude these words that do not belong in your count.
Have you done some analysis on the frequency of those extra words? Several thousands out of 33K sounds a lot to me.
I don’t know any way of seeing a list of your known words, so how would I analyze or even identify them? Several thousand does not seem unrealistic to me in this case. I read a lot of articles about books, films, history, current events, etc. and they were loaded with proper names.
All saved words are listed in the database, aren’t they? Alternatively, one could take a sample of texts and count how many words are proper names etc. Might be too much effort though.
I don’t personally worry about this though. Even 10% inaccuracy is pretty good in my field.
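For anyone who does want a rough number, the sampling idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: you hand-check a random sample of distinct words from lessons you have completed and flag the ones that are proper names or foreign text. The function name `estimate_inflation` and the sample figures (200 words checked, 24 flagged) are made up for the example, and the interval is a plain normal approximation.

```python
import math

def estimate_inflation(sample_flags, known_word_count, z=1.96):
    """Estimate how many counted words are proper names or foreign text.

    sample_flags: one boolean per hand-checked sample word,
                  True if the word should not have been counted.
    Returns (low, point, high) estimates of the number of such words,
    based on a normal-approximation confidence interval.
    """
    n = len(sample_flags)
    if n == 0:
        raise ValueError("sample must not be empty")
    p = sum(sample_flags) / n                   # share of flagged words in the sample
    margin = z * math.sqrt(p * (1 - p) / n)     # half-width of the interval
    low = max(0.0, p - margin)
    high = min(1.0, p + margin)
    return tuple(round(x * known_word_count) for x in (low, p, high))

# Hypothetical sample: 200 hand-checked words, 24 of them proper names.
flags = [True] * 24 + [False] * 176
low, point, high = estimate_inflation(flags, 33_000)
print(low, point, high)
```

The sample has to be of distinct words, not running text, because frequent words would otherwise be counted many times and skew the share.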
I’m experiencing that a lot as I read novels. Most of the words I know, but they are ordered in some “intricate” ways that don’t seem to make sense on first read. After translating the whole sentence it often makes sense, but to get some of those things to click can be difficult.
But it’s quite possible to read and enjoy novels well before having 50K known words in LingQ, isn’t it? I recently read my first novel in Spanish (El niño con el pijama de rayas by John Boyne), even though I had fewer than 5K known words back then. The question is then how easy it has to be.
Fluency is really a byproduct of speaking practice and, technically, I think one could be fluent with 30K known words with a lot of speaking practice.
But conversely, as a LingQ user, your primary engagement with the language is probably reading and listening, so to feel like you’re “fluent” in those activities you will end up needing more words than if you were just hanging out and chatting with your friends in Germany.
So, yeah, 50K known words is a good benchmark for unassisted reading and listening.
An interesting thing about this is that really only LingQ users are able to use this metric. Most language learners engage in all these activities at various levels without having any idea how many words they might actually know.
But from my experience, I felt like I could speak English fluently well before I was able to read a book unassisted. With the reading-and-listening method, one can end up on the opposite side, having a massive passive vocab but not enough active practice to be “fluent.”
I’m glad you’re bringing up the proper noun thing. So many people don’t realize how this can inflate your known word count over the course of a few books.
In my opinion, the “known words” count is rather “useless”, for several reasons:
- Take a simple verb like “gehen” in German or “aller” in French. If you conjugate these verbs in the present tense, you get:
- German: Ich gehe, du gehst, er / sie / es geht, wir gehen, ihr geht, sie gehen
- French: je vais, tu vas, il / elle / on va, nous allons, vous allez, ils / elles vont
And there are “many” more variations when you consider future, perfect / imperfect, and subjunctive / Konjunktiv forms! So a single infinitive can have countless variations, all counted as different words.
- It’s the same with singular and plural forms that are counted as different words. For example:
- the table = der Tisch / la table, plural: the tables = die Tische / les tables
- the wall = die Wand / le mur, plural: the walls = die Wände / les murs
Then, as PerpetualTraveler correctly noted, you have proper names, city names, etc., or words from other languages included.
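To make the inflection point concrete, here is a small Python sketch that compares counting surface forms with counting base words. The `LEMMAS` table is a hand-made, hypothetical stand-in built from the examples above; a real system would need a proper lemmatizer for each language, and nothing here reflects how LingQ actually counts.

```python
# Hypothetical hand-made lemma table; a real system would need a
# morphological analyzer or lemmatizer for each language.
LEMMAS = {
    "gehe": "gehen", "gehst": "gehen", "geht": "gehen", "gehen": "gehen",
    "tisch": "Tisch", "tische": "Tisch",
    "wand": "Wand", "wände": "Wand",
}

def count_forms_and_lemmas(words):
    """Return (distinct surface forms, distinct lemmas) for a word list."""
    forms = {w.lower() for w in words}
    # Words missing from the table are treated as their own lemma.
    lemmas = {LEMMAS.get(f, f) for f in forms}
    return len(forms), len(lemmas)

known = ["gehe", "gehst", "geht", "gehen", "Tisch", "Tische", "Wand", "Wände"]
forms, lemmas = count_forms_and_lemmas(known)
print(forms, lemmas)  # 8 3
```

Eight "known words" collapse to three base words here, which is why counts in heavily inflected languages run higher for the same underlying vocabulary.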
Apart from that, focusing on single words is deeply flawed, because native speakers don’t build their sentences from single words, but they are heavy users of tens of thousands of highly conventional word groups, i.e. “collocations”.
So, for example, it doesn’t make sense to learn a simple word equation à la “erhalten / bekommen” (German) = “get”, when you have countless collocations with “get” in English:
- Get a call
- Get a chance
- Get a clue
- Get a cold
- Get a degree/ a diploma
- Get a job
- Get a joke
- Get a letter (receive)
- Get a shock
- Get a splitting headache
- Get a tan
See: 48 Useful Collocations with GET with Examples • 7ESL
Not to mention phrasal verbs in this context with very different meanings beyond “get = erhalten / bekommen” like: “get in, get out, get off, get down, etc.” (German equivalent with “gehen”: prefix + verb constructions such as “aus, auseinander, hinein, an, ab, etc. + gehen”).
To handle all this, LingQ would need a much more sophisticated implementation, esp. of the string tokenization process, so that the “known words metric” is more useful. But, this implementation would also be “much” harder.
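As a rough idea of what a collocation-aware tokenizer could look like, here is a minimal Python sketch using greedy longest-match against a list of known word groups. The `COLLOCATIONS` set and the `tokenize` function are hypothetical and written only for this illustration; I have no knowledge of how LingQ tokenizes text internally.

```python
# Hypothetical collocation list, for illustration only.
COLLOCATIONS = {
    ("get", "a", "job"),
    ("get", "a", "cold"),
    ("get", "off"),
}
MAX_LEN = max(len(c) for c in COLLOCATIONS)

def tokenize(text):
    """Split on whitespace, merging known collocations (longest match first)."""
    words = text.lower().split()
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest possible word group first, down to two words.
        for n in range(min(MAX_LEN, len(words) - i), 1, -1):
            chunk = tuple(words[i:i + n])
            if chunk in COLLOCATIONS:
                tokens.append(" ".join(chunk))
                i += n
                break
        else:
            # No collocation starts here, so keep the single word.
            tokens.append(words[i])
            i += 1
    return tokens

print(tokenize("I hope to get a job before I get a cold"))
```

Even this toy version shows why the real thing is much harder: it ignores punctuation, needs a curated collocation list per language, and cannot handle German separable verbs where the prefix sits at the end of the clause.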
Therefore, I concur with RJDavies: