What do experienced LingQers think is the minimum number of known words or words read for the fabled B2/C1 Shangri-La?

Honestly, I think self-study is better. Surrounding yourself with native speakers is, in my opinion, only efficient for developing your speaking skills. For the rest, self-study is more efficient.

Well, we need to distinguish a few things. Immersion doesn’t mean only going to school in a foreign country.
I did self-study in every country I lived in.
But for sure, immersion by itself doesn’t mean anything, and I always take immigrants into consideration. They can spend their entire life in a foreign country and still have a poor knowledge of the language.

So, immersion per se doesn’t mean anything.

But if you self-study (with the tools we have now) and can also go to the bar right now and speak the target language, it’s a BIG HUGE difference.
I will improve my speaking (yes), but I will also be able to use straight away what I’m learning, reading, studying, etc.

I can also ask people everything and they will answer. About everything. I can ask about football, history, names, slang, etc. All for free, and I can even have lots of different opinions on the same stuff. They will talk and share, I will learn!

I will also improve my listening, which will make it easier for my brain to focus on other things.

If I work in the target country I will also improve my writing if I have a job that requires me to write emails or stuff.

My brain won’t have any choice!
I will have far fewer distractions and I won’t have to use my own native language all the time, like I do here.

Plus I will have a lot of fun experiencing a different culture, food, habits and so on. You get a lot of different learning moments that you can’t match at home. I can’t!

If I could choose today, I would do the same things I’m doing now with LingQ, plus the target country. I would definitely fuse together the best of these learning methods.

@Anxxos Thanks for the FSI comments.

I do wonder, however, whether 2 x 720 hours (classroom + homework) of, say, FSI French really gets one to B2/C1. Spread out over 50 weeks @ 40 hours/week that’s ~6 hours /weekday.
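As a quick sanity check on that arithmetic, here is a minimal sketch. The 2 x 720 hours, 50 weeks, and 40 hours/week figures are taken from the post; the five-day study week and all variable names are assumptions added for illustration.

```python
# Figures quoted in the post; the 5-day week is an assumption.
CLASSROOM_HOURS = 720
HOMEWORK_HOURS = 720
WEEKS = 50
WEEKDAYS_PER_WEEK = 5
FULL_TIME_HOURS_PER_WEEK = 40

total_hours = CLASSROOM_HOURS + HOMEWORK_HOURS

# Spreading the total evenly over 50 weeks of weekdays.
hours_per_week = total_hours / WEEKS
hours_per_weekday = hours_per_week / WEEKDAYS_PER_WEEK

# Alternatively: how many weeks the same total takes at a full 40 h/week.
weeks_at_full_time = total_hours / FULL_TIME_HOURS_PER_WEEK

print(f"total study time: {total_hours} h")
print(f"over {WEEKS} weeks: {hours_per_week:.1f} h/week, "
      f"{hours_per_weekday:.2f} h/weekday")
print(f"at {FULL_TIME_HOURS_PER_WEEK} h/week: {weeks_at_full_time:.0f} weeks")
```

Under these assumptions the 50-week spread works out to a little under 6 hours per weekday, while a true 40-hour week would cover the same total in fewer weeks.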

I’m sure it’s a good start and an FSI grad could function in French, but B2/C1?

I’ve looked at the FSI materials and they are the old canned dialogs, vocab lists, drills etc. If I’m to believe Krashen and Kaufmann this isn’t the most efficient way to learn.

Then again, if one is getting paid and one’s career is on the line, perhaps motivation would make a big difference.

@jt23 B2 isn’t that high. B2 means passing a B2 test. I passed the reading and listening parts the other month. I did it with 1.5M words read and ~500 hours of listening (mostly LingQ, some Netflix). It’s easier to pass the associated test than to reach the qualitative description of the CEFR level.

This is slightly off topic, but I know, nfera, that you have focused primarily on television/movies for your listening, with the idea that you want exposure to more conversational speech. And then you’ll approach literature naturally as you pick up the vocabulary to make literature accessible to you at a known words % that you are comfortable with.

I did things the opposite way. I like reading more than watching, and I have a higher tolerance for seeing unknown words and still enjoying the reading experience than I do for listening to audio that I can’t parse word-for-word. So while I’ve been pushing my reading right up to the threshold of authentic literary fiction, I’ve been stepping up the difficulty of my listening more gradually: podcasts for Italian learners → documentary films → lectures and talking-head podcasts for native speakers. In the same way you are letting your known word % drop naturally until literature is accessible, I have a threshold for the % of un-parse-able speech that I’m sensitive to, and I have been letting it drop naturally. In any case, I’ve decided that now is the time to push myself into more television/movie listening, but I’m still finding it challenging, because even with Whisper I’m not able to get a perfect phonetic transcript of what I’m hearing.

Can you describe your experience with this, and your current level of comprehension? Are you at the point where your grasp of conversational Italian is strong enough that it never takes away from your watching experience, or are you at the point where you are self-aware of parsing every word and you can clearly pull out words and phrasing that you don’t know (this is where I am with documentaries)? What is your experience with regional pronunciations? All of the native-directed television/movies I’ve found have used regional pronunciation, often to identify a character or signal something about the story. How do you handle this? Do you put effort into repeated listening of moments of regional pronunciation? Do you just ignore it and keep listening, and have you found over time that you’ve gotten better at parsing these pronunciations?

My expectation is that I just have to deal with the ambiguity, and keep myself in the I-enjoy-this-enough-to-keep-going zone, and listen for another couple hundred hours, and I’m willing to do that, but if you have any more specific advice I’d be interested in hearing it.

@GMelillio It’s interesting that you took the route of literature first. As much as I love reading, I do not like clicking on a kajillion blue words to get through a single page. Going by your ratio between number of lingQs and Known Words, I’m guessing you had a bit of experience of Italian before LingQ, right?

I have watched a lot of Netflix, but also imported a lot of YouTube into LingQ. Nearly all my listening these days is listening while reading (to YouTube audio on LingQ) or watching Netflix with subtitles. These days I consider my listening while reading YouTube transcripts on LingQ to be my ‘intensive’ reading (where I look up words in the dictionary), while Netflix with subs is my ‘extensive’ reading (as I don’t look up words in the dictionary). The reason I use subs on Netflix is that I consider it ‘free’ reading practice. The reason I listen to the YouTube audio while reading the transcript is for ‘free’ listening practice (plus listening increases my reading speed significantly). Eventually, I’ll practise them separately, as one skill can be a crutch for the other, but, currently, doing them together is working great and I’m still seeing strong gains in listening, reading, vocabulary, and grammar all together.

With regard to the continuum of standard Italian, lightly accented Italian, strongly accented Italian/half dialect, and full dialect: I can parse most words if they are speaking standard Italian or near to it. If they are speaking dialect, they might as well be speaking Spanish. But there are a few dialects which are close to standard Italian, and those I understand a little more of. From my understanding, many Italians understand Roman dialect, due to its close proximity to standard Italian, but mainly due to the availability of TV series/movies in Roman dialect. I don’t relisten to or rewatch content these days, with the only exception being ‘Strappare lungo i bordi’, as it’s a great mini-series and it improves my understanding of Roman dialect (the first time was with subtitles and then two times without).

Direct advice would be:

  1. First focus on standard Italian. Most content is in standard Italian. When it’s in dialect, most Italians would be having subtitles on anyways, so it’s no issue (depending on if/which dialect they know).
  2. Reading while listening, or watching Netflix with subtitles, is a great way to understand the content and improve your listening skills.
  3. Netflix has quality subtitles and a subscription is worth the price (you can also import the transcripts into LingQ to add them to words read for the stats)
  4. Alternatively, there are many great human-subtitled YouTube videos
  5. TV shows or entire YouTube channels are easier to understand, as, over time, you learn the characters’/hosts’ accents. Start with TV series before movies.
  6. Here are some TV show/YouTube channel recommendations (linked LingQ page).

@nfera I enjoy and benefit from reading your comments. Not to disagree but to supplement…

Once I got some basic French vocabulary/grammar sorted, I found LingQ, fired it up, then pointed it at French fiction I had read in English: St. Exupery, Camus, Simenon, Reage, and JK Rowling (translated into French). (I am a literary person.)

Sometimes I was looking at 60% Unknown Words. I knew I was overreaching, but I figured, why not? I did eventually settle into a more sensible order, starting with “The Little Prince” and now the first “Harry Potter.”

Even Harry Potter started at 40% Unknown. But I’m now 60% through and the remaining chapters are all 15-25% Unknown, which seems like a breeze to me now. (It does help that there are so many cognates in French.)

I have no beef with how others scale their language mountains. This is how I’m climbing Mt. French and I’m enjoying it.

@nfera - for some reason I can’t reply to your reply, maybe because of how nested the replies already are. The short answer is: yes, I have prior experience with Italian.

I studied Italian in high school and placed into the final semester of a four-semester language requirement in college (this was 20 years ago!). In the meantime I’ve learned some Ancient Greek, Latin, and German, and what I picked up from studying these languages is a desire to do this anti-LingQ-philosophy thing where I memorize all the patterns of verb inflection or noun declension before I look at any content (I don’t care much about syntax, I just want to be able to recognize whatever the morphology of the language is telling me about the function of particular words), so I did a refresher on Italian before I started up on LingQ a year ago.

I also go about lingQing a little differently. I don’t mind re-reading sentences a few times before deciding to click on a word. I basically give myself time to make sense out of what I’m reading and try not to look up a word unless I absolutely have to. I find that a significant amount of the time I can make sense out of the text on my own if I allow myself the opportunity to recruit the whole battery of linguistic information I’ve got stored up in my brain, and it’s fun to feel a sentence sort of “click into place.” So I know there are some words that I just never bothered to lingQ because I grasped the context well enough the first time, and then those words kind of became more truly “known” words without going through the lingQing and promoting process.

That, plus the fact that I study the lingQs from a book pretty intensively once I finish reading. Read and lingQ → study all lingQs → re-read while listening if possible.

So I probably had the real core high-frequency vocabulary of Italian already loaded in before I started, and I now have very few word-roots or verb stems lingQed more than once.

Anyway, that’s all rambling. I’ve noticed that people who like learning languages seem to also really like reflecting on how they learn languages.

Thanks for the advice. The page of resources is extremely helpful. I’ll also check out Netflix. A few months ago I tried to start watching TV on rai.tv, but the subtitles were not close enough to what I was hearing to satisfy my desire to have an exact transcription. When I started watching TV series I would watch while saving the audio, then run the audio through Whisper and get subtitles much better than those available on RAI.

@Gmelillo Your way reminds me of how Alexander Arguelles likes the grammar intricacies as well. If you do know all the conjugation rules, then you only need to learn the root word (exceptions aside).

The time that you give yourself to mull over the definition of the word is, yeah, only possible when reading, otherwise you are stopping the movie/podcast every few seconds. The learning through reading literature is definitely the route you have to take for Ancient Greek and Latin. I can see where you got the technique from.

With your listening hours, I think you are close to me anyways. You don’t lack the vocabulary or the grammar, just the parsing of the words while listening (and at speed). I think you are pretty close. Every 100 hours of listening is really a step up the ladder, as some others like @chytran have mentioned.

@jt23 Yeah, for sure. You can definitely do intensive reading with a high percentage of New Words. That’s what I had to do on LingQ when I was a beginner (as I started Italian on LingQ). What I found worked was to read + lingQ New Words on the first read-through of a story/lesson, then re-read it while listening to the audio, RWL again, and then listen to the audio many times over. It really did drill in the vocabulary and build up my listening comprehension. I gotta say that I’m glad I’m past this stage though.

These days, as an upper intermediate, I really do prefer reading while listening to content with a lower % New Words, as you can focus more on the content itself instead of trying to understand WTF is going on. xD My transition into fiction books is reading while listening to a game of D&D published on YouTube and imported into LingQ. Perhaps this is even more entertaining, because the characters have different voices, they don’t know what’s going on, it’s unscripted, and there’s still a narrator, describing some scenes with descriptive fantasy words. But the first book will still be challenging, I know, because in Italian, like in German, they use a literary tense, which is only very rarely used in spoken language, so I haven’t really encountered it much.

@nfera The Netflix suggestion was a good one. The subtitles are more accurate than the RAI subtitles. The import function didn’t work for LingQ, but that was a blessing in disguise, because there is this “audio description” feature where a narrator talks about everything happening on screen when there is no dialogue.

So now I have LingQ lessons of a Netflix show where I can listen to the audio but don’t have to adjust my listening hours down for all the dead space with no speaking in the television episode. It’s just an unbroken string of linguistic audio to listen to.

@nfera What I particularly like in your comment is the notion that one goes through stages in language learning. What worked at one stage will not necessarily work best at another.

I’m already adjusting my style from mostly vocab acquisition to a more detailed study/practice of grammar, structure, listening comprehension and pronunciation.

This slows me down. My daily LingQ point score has dropped, but that’s OK – as long as I can maintain my streak!

I’m fascinated to read here of all the different strategies people invent for their language journeys.

PS In French I understand “passé simple” is a literary tense, as well as the “antérieur” tenses.

@jt23 Yeah, exactly. You realise that your biggest weakness is X, so you adjust your focus, like you are doing.

@GMelillio I was wondering what you were talking about with ‘audio description’, but now I see. For some reason, there is a much larger availability of subtitles/audios on the browser than on the Netflix mobile app.

For what it’s worth, I record my listening stats 1:1 for watching movies. I don’t depreciate them whatsoever.

Also, with Netflix on the browser, if you use the Chrome extension Language Reactor, it allows you to easily repeat sentences with a hotkey. Furthermore, you can skip to the next sentence said (i.e. subtitle timestamp) if you don’t want all the non-talking space.

For Italian, I took an official B1 exam in person in Seattle almost exactly one year ago and passed easily. At that time, my LingQ stats were obviously much lower (I had about 20k known words, 25 hours of speaking, and 1 million words read, including reading done before I started with LingQ). With that in mind, it doesn’t seem unreasonable to assume that one could be very proficient in a language at 2-3 million words read, 40-50 hours of speaking, and 35-40k known words. This also assumes that B2 is the threshold for being proficient/fluent, but obviously some people may have lesser or greater goals.

Now, have I reached this point? I think that’s a hard maybe.

After about 2 million words read (this includes time spent reading before I found LingQ), I can read books unassisted now, though I’ll still come across unknown words. However, 95% of the time, that unknown word doesn’t impact my ability to understand the passage.

I don’t practice writing through any kind of concentrated practice, though at the B1 exam 1 year ago, I obviously passed the writing portion at that time.

After ~300 hours of listening, I can easily watch YouTube videos, listen to podcasts, watch/listen to the news, watch most movies/TV shows without subtitles, etc. However, some movies/shows are much harder than others due to dialects/accents. So, I’ve still got a lot of practice to do there (probably a minimum of 100 more hours).

Finally, I have conversed many times with language partners in only Italian for well over an hour, and apart from pausing every once in a while to think of a word, or sometimes asking for help thinking of a word, it’s been very fluid.

Basically, I think I’m close to that point that you describe wanting to reach, but not quite. So, my statistics may be a good estimate on the very low end of what’s required, but obviously there are many other extremely active users on LingQ whose statistics may tell a completely different story.

To answer the OP’s question, plus adding a little of my own info, I started to see this area in Spanish around

–1.75 million words read
–30,000 known words
–Just under 1,000 overall hours of Spanish. In other words, pretty much dead on FSI/DLI estimation of hours. 23-25 weeks of learning 40 hours per week. The hours typically referenced are merely “classroom” hours. It doesn’t take into account the hours of homework and self study these students often do. For people like us, doing it on our own, we don’t have a classroom so just count the hours you spend with the language.


I was thinking about this over the last week, and on top of the two ‘it depends’ factors of:

  1. It depends on what languages you know
  2. It depends on what material you study with (domain-specific vocabulary or literature-specific tenses, etc.)
  • It also depends on how you study and how you use LingQ.
    For instance, over 600k of my 2.2M words read have been as subtitles to TV shows and movies, which I added to my words read total. That is, 1/4 of my words read have been as ‘extensive reading’ while not looking up words (which is probably slower in vocabulary acquisition per words read than say intensive reading while looking up words).

Compare this to @GMelillo, who drills flashcards alongside reading books. They also studied grammar extensively, so have no issue with different tenses, declensions, etc… This is study outside of LingQ, but is not easily recorded in LingQ’s statistics/recording system, so it’s hard to see.

Or maybe they are relistening to their studied material, which is another way to drill in the vocabulary. ‘Read while listening’ may ‘double count’ in a sense, whereas reading, then listening, will result in more vocabulary learnt but take more time to do.

  • It also depends how you record your statistics.
    Some people record their listening hours differently (e.g. movies at a 50% reduction) or don’t include subtitles as words read. Or maybe they have different criteria for marking a word as Known. Everyone has their own system, which means that 2M or 4M words read and 30k or 50k Known Words for them may mean something slightly different for you. It’s not that all statistics are useless, but rather that they are hard to compare when you don’t know which recording system other people are using.
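
To make the recording-convention point concrete, here is a rough sketch of how the same activity can yield different totals. The 2.2M words read, the 600k subtitle words, and the 50% movie reduction come from the posts above; the listening and movie hours, the `RawStats` class, and the `normalize` function are hypothetical and exist only for illustration (this is not a LingQ feature).

```python
from dataclasses import dataclass


@dataclass
class RawStats:
    words_read_total: int   # everything logged as words read
    subtitle_words: int     # portion of the total that came from subtitles
    listening_hours: float  # everything logged as listening
    movie_hours: float      # portion of the hours that came from movies/TV


def normalize(stats: RawStats, count_subtitles: bool, movie_weight: float):
    """Re-express one user's totals under another user's conventions."""
    words = stats.words_read_total
    if not count_subtitles:
        words -= stats.subtitle_words
    # movie_weight=1.0 counts movies 1:1; 0.5 is the "50% reduction" convention.
    hours = stats.listening_hours - stats.movie_hours * (1.0 - movie_weight)
    return words, hours


# Word counts are from the thread; both hour figures are made-up placeholders.
mine = RawStats(words_read_total=2_200_000, subtitle_words=600_000,
                listening_hours=500.0, movie_hours=200.0)

print(normalize(mine, count_subtitles=True, movie_weight=1.0))
print(normalize(mine, count_subtitles=False, movie_weight=0.5))
print(f"subtitle share: {mine.subtitle_words / mine.words_read_total:.0%}")
```

With these placeholder hours, the stricter convention drops the same learner from 2.2M to 1.6M words read, which is the kind of gap that makes raw comparisons between users unreliable.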

I think, because of these variables of ‘it depends’, to stay on the safe side, you probably want to aim for a higher Words Read and Known Words count to be sure that you’ll achieve your goal of essentially becoming self-sufficient and self-improving in the language.

Probably the most sure-fire way is to pick a higher goal and test yourself along the way. Maybe every few months, try and pick up a paperback book and see if you can read it unaided without much difficulty.

This is exactly right. People tend to forget that counting always implies a conceptualization of the object counted. Stats are records of activity, and insofar as the activity differs the stats are going to accumulate uncertainty and inaccuracy in comparison.

My methodology for books that are above my current level is read → drill flashcards until words become level 4 but not “known” so that I still see underlined words → read while listening → listen repeatedly, going back over sentences that are hard for me to parse multiple times while listening → casually listen.

I do the same thing with podcasts, but I start with a cold listen → scan for blue words to lingQ without reading the transcript (unless the podcast has too many moments I couldn’t parse, in which case I will do targeted reading while listening) → drill flashcards → listen casually.

I’ve just started doing this with television shows. Exact same methodology, except I have not needed to do the drilling of flashcards because the vast majority of words are known. With the TV audio in LingQ I do many repeated listens to any sentences that I do not parse effortlessly. Once I’ve done repeated close listens to the program I will watch it casually.

And I haven’t even described how I make words “known.” It’s silly to compare my words-known to yours if you have a different process for making a word “known.” Personally I think “known” is so poorly defined that I don’t bother trying to make my number accurate in some extra-LingQ sense. I have never once manually changed the status of a word. I let my workflow interact with the platform to determine what the “words known” number is, but I don’t think it is measuring much of significance for me.

For my project I am counting the total corpus of words-studied-in-books outside of LingQ. That number is about 2.5 million words studied with my approach. I try to estimate down the amount of time listened when there is fluff (non-linguistic input) in the material. I use podcasts as a kind of general exposure to the language, but I’ve also just started counting hours-of-television-programming studied as another metric because I think that effortless processing of television dialogue is a decent proxy for the conversational comprehension that is my next big goal.

LingQ is a self-study platform. Each user coming up with their own way to interpret the information that the platform provides is pretty essential, in my opinion.

Also, I know you were being polite since you only have my first initial - and you were engaging in a herculean task of interpretative modesty by not using my authorial voice as an input - but I’m a man.

The answers below give you a fair quantitative range, in my view, for how much extensive reading fluency requires.

However, I would like to insist on the cognitive learning process itself. Commitment to the points below will have a dramatic impact on your “cognitive productivity” and the number of words needed.

I rely on the work of a neuroscientist called Dehaene and what he describes as the 4 necessary pillars of learning:

  1. Attention. You need focus and freshness. Hence the superiority of morning routines for most. Active content selection helps a lot.
  2. Active engagement. Manipulate, use, re-use and reflect on the information acquired. In my case, being fond of etymology is of great help. Writing and speaking are immensely beneficial.
  3. Error feedback. An extraordinary aspect of our cognitive system is the importance allocated to error detection and management. For obvious Darwinian reasons. Applied to foreign language learning, it means that putting yourself to the test and failing will yield much greater dividends than monotonous effort without challenge. Exercise, detect and fix.
  4. Consolidation. Short term, it means no input overload and healthy sleep. Long term, the holy grail: perma-storage, when forgetting becomes unlikely. It requires long haul, spaced repetitions/exposure.