Armchair Philosophy: "Chunking" might be why a distant language's vocabulary is harder to learn

Here is something I have just recently noticed: Russian has different “chunks” than, say, Spanish or French.

One thing I noticed when learning Spanish then French was that the speed of accumulation of new vocabulary was super fast compared to Russian. Even the non-cognate words in Spanish and French were faster to learn than any Russian word.

I have recently come to the tentative conclusion that part of the reason is “chunking”.

At the beginning I could neither reproduce nor recognize the chunks or the words they make up. For example, vsz is a very unusual chunk in English, Spanish or French, but not in Russian.

After some months, however, I believe my brain has formed memories of these “chunks”. So instead of a word being an impossible-to-remember, super long string of gibberish, it might now come down to four recognizable chunks, which are easier to recall and so make the entire word easier to recall. The frequency of the chunks is high enough that my brain has subconsciously formed memories of them.

As a result, suddenly I have noticed that it’s now easier to memorize and retain new vocabulary.

So a hypothesis: could it be that if you learn the chunks of a new language first, it might be quicker to pick up vocabulary?

I wonder if chunking is even a part of linguistics…

Hmmm. Lightbulb!!!
Pimsleur in fact uses chunks. Maybe it’s worthwhile doing Pimsleur first…


You mean learn chunks before the language itself? I think it would be hard to process them without something in memory to attach them to. I mean, there has to be at least some simple context.

I “lingq” phrases all the time, so in effect I am doing some of this. However, I think some things are “chunk-able” and some not as much. There are certainly patterns that come up over and over again, and where it makes sense one should try to notice these rather than individual words. It may be difficult to notice them at first, though.


Yeah, I don’t have an answer to this yet. I only tentatively noticed it yesterday while I was walking the dog. But I think it’s at least a plausible explanation for some of the difficulty in retaining vocabulary in a distant language.


Yes, I mean exactly that. For Russian, in hindsight… I wonder what would have happened if I had taken, e.g., the first 1,000 words in the frequency list and skimmed them for combinations of characters (e.g. three-character chunks), then done the same thing with English and compared the two to find the chunks that are uncommon in English. Then I could have just repeated them out loud, like a baby babbling, for a couple of weeks to get used to them and form “muscle memory”. If I had done that, would it have increased my ability to memorize and recall?
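
For anyone who wants to try the comparison, here is a minimal sketch of it in Python. It is only an illustration under some loud assumptions: the function names are made up for this post, the word lists are tiny hypothetical placeholders rather than real frequency lists, and the Russian words are written in Latin transliteration so they can be compared with English at all. Letter chunks are also only a rough stand-in for sounds.

```python
from collections import Counter


def char_ngrams(word, n=3):
    """Return every run of n consecutive characters in a word."""
    word = word.lower()
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_counts(words, n=3):
    """Count how often each n-character chunk occurs across a word list."""
    counts = Counter()
    for word in words:
        counts.update(char_ngrams(word, n))
    return counts


def unfamiliar_chunks(target_words, native_words, n=3, min_count=2):
    """Chunks seen at least min_count times in the target list
    but never in the native list, most frequent first."""
    target = ngram_counts(target_words, n)
    native = ngram_counts(native_words, n)
    found = [(chunk, count) for chunk, count in target.items()
             if count >= min_count and chunk not in native]
    return sorted(found, key=lambda item: (-item[1], item[0]))


# Hypothetical toy samples (assumption): in practice these would be
# the top ~1,000 entries of a frequency list for each language.
russian_sample = ["vstrecha", "vsegda", "vse", "vzglyad", "vzroslyy"]
english_sample = ["string", "thanks", "first", "think", "strong"]

print(unfamiliar_chunks(russian_sample, english_sample))
```

With real frequency lists you would probably want to raise `min_count`, and perhaps weight each chunk by how common its word is, so that the babbling practice goes to the chunks you will actually meet most often.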

Example: Russian has “v” and “c”, which are supposedly isolated words. But in actuality they are jammed onto either the preceding or the following word. This forms, in my mind, a three-character chunk, e.g. vsz or csz. Neither of those chunks appears in English. English, however, has lots of (maybe weird to non-English speakers?) two-letter chunks like “nt” or “ng” or “nk” or “st”, etc.

Anyhow, it’s maybe too late now with Russian because I’ve essentially already subconsciously memorized the chunks. But maybe I’ll do this with Mandarin first before I start my typical technique.


Alright, I got it. Yeah, I think it could be helpful to get some training at the early stage; it probably would have spared you some time and made the listening time more meaningful. Something like that was in my beginner course, with explanations from teachers on some sounds that are unusual for a Russian ear. It’s a bit out of line with your approach, but perhaps, being an adult with an already developed native language, you’re not able to reconstruct 100% purely the way children learn it. It has to be difficult to avoid any adjustments to pure exposure to the TL.

I think the job is already done partially by a course.
I actually started the Pimsleur course back at the beginning, before I started doing my usual approach. I didn’t like it because it seemed so time-consuming, the way he broke the words down into little chunks to make them easy to pronounce.

BUT… in hindsight that might have been an error. It might have been worthwhile to do at least the first level in order to build up muscle memory of the different chunks before exposing my brain to them en-masse.

At the time I had no explanation for exactly why the Russian words were so hard to retain. My first response (with no real evidence) was to say “well, they’re just the same as French or Spanish non-cognate words”.

This is kind of true. They are in one way like French or Spanish non-cognate words. There is no “hook” to map them back to.

BUT… in Spanish and French there are in fact already some hooks: the chunks that make up French and Spanish words are mostly similar to English chunks, so the words are probably somewhat easier to remember. At least that is my guess now that I have some hindsight.

So my new conclusion is that for any other languages I’m going to tackle (definitely Mandarin, maybe Egyptian Arabic in 3 years or so), I will do at least one level of Pimsleur first.

Then I will be able to compare (because I already have the stats) with what happened during Russian to see if I get an improvement in retention.

Anyhow, just a nickel or two of random armchair philosophy and bar-room style speculation…


Your idea is very good and it corresponds precisely with this book:


Thanks and thanks for the share.

I think this would be very difficult to do with Mandarin, considering that the script you’d be picking up isn’t a sound-based script but a meaning-based one. There are well over 10,000 Chinese characters, and many words that sound similar are differentiated only by tone. I would be interested to find out if it helped, though.