Noam Chomsky: The False Promise of ChatGPT

Dear LingQ members,
As many of you have taken a keen interest in the recent developments in the realm of machine learning, especially ChatGPT, I wanted to start a discussion on the fundamental limitations of this promising but still emergent technology. Since this is a community with a more than average interest in language, I am curious to hear your thoughts on the potential drawbacks and limitations of large language models.

Coincidentally, this week world-renowned linguist Noam Chomsky published an op-ed in the New York Times that is quite critical of these recent developments. Do you agree with Chomsky’s views or do you have a different perspective?

Here is a link to the article:

The article discusses the limitations of machine learning programs like OpenAI’s ChatGPT. Despite their ability to generate human-like language and thought, these programs differ significantly from the human mind. The latter operates efficiently with small amounts of information and seeks to create explanations rather than inferring brute correlations among data points. Large language models focus on description and prediction, rather than causal explanation, which is the mark of true intelligence. They also lack the capacity for moral thought.


With all due respect to the author, this comparison with human “true” intelligence seems kinda lame to me. And there’s some real nonsense in the article too. Like this one:

When linguists seek to develop a theory for why a given language works as it does (“Why are these — but not those — sentences considered grammatical?”), they are building consciously and laboriously an explicit version of the grammar that the child builds instinctively and with minimal exposure to information. The child’s operating system is completely different from that of a machine learning program.

I don’t even want to comment on this one.

Indeed, such programs are stuck in a prehuman or nonhuman phase of cognitive evolution. Their deepest flaw is the absence of the most critical capacity of any intelligence: to say not only what is the case, what was the case and what will be the case — that’s description and prediction — but also what is not the case and what could and could not be the case. Those are the ingredients of explanation, the mark of true intelligence.

Chomsky probably has an incredibly intelligent (on average) circle of people around him. 95% of the population can do none of these things, or at best some of them, and only in a limited number of subjects.

But an explanation is something more: It includes not only descriptions and predictions but also counterfactual conjectures like “Any such object would fall,” plus the additional clause “because of the force of gravity” or “because of the curvature of space-time” or whatever. That is a causal explanation: “The apple would not have fallen but for the force of gravity.” That is thinking.

This is a matter of data sets. We have the luxury of multimodal data sets to prove or disprove any magical-thinking theories and adjust our knowledge of the world. And yet so many people in their adult lives believe in horoscopes and are generally prone to magical thinking, no matter what. So, the point isn’t taken.

Whereas humans are limited in the kinds of explanations we can rationally conjecture, machine learning systems can learn both that the earth is flat and that the earth is round. They trade merely in probabilities that change over time.

Humans believe in both science and in God. And even flat-earthers believe in science, but deny some parts of it to fit it to their beloved theory. In general, people have a lot of messed up and conflicting beliefs, preconceived notions in their minds.

It’s again a matter of data sets. We can teach our fellow human to be like this and to believe in opposite things. And vice versa, we can train a language model only on a scientifically vetted data set.
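The “matter of data sets” point can be made concrete with a toy model. This is a deliberate simplification (real language models estimate probabilities with neural networks, not raw counts, and the corpora and function names here are invented for illustration), but the dependence on training data is the same: what the model “believes” is just the relative frequency of claims in what it was trained on.

```python
from collections import Counter

def claim_probability(corpus, claim, alternatives):
    """Probability assigned to `claim`: its relative frequency
    among the competing alternatives seen in the training corpus."""
    counts = Counter(s for s in corpus if s in alternatives)
    total = sum(counts.values())
    return counts[claim] / total if total else 0.0

# Two hypothetical training sets lead to opposite "beliefs".
mixed_corpus = ["the earth is round"] * 7 + ["the earth is flat"] * 3
curated_corpus = ["the earth is round"] * 10

alts = {"the earth is round", "the earth is flat"}
print(claim_probability(mixed_corpus, "the earth is flat", alts))    # 0.3
print(claim_probability(curated_corpus, "the earth is flat", alts))  # 0.0
```

On the mixed corpus the model assigns real probability to the flat-earth claim; on the curated one it assigns none. Nothing about the model changed, only its data.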

Language models were never supposed to be AI in the general sense, anyway.

Human intelligence is overrated, very limited and ultimately finite. We all live off the successes of a few geniuses throughout history, layered on top of each other: short moments when these great people came up with something brilliant.

Hi Florian,

“Despite their ability to generate human-like language and thought, these programs differ significantly from the human mind.”
The beauty of machine learning based on mathematics and statistics is exactly that:

it does “not” process language like human minds do because it completely avoids the “meaning” dimension.
However, a natural language consists of tens of thousands of “form - meaning” pairs (usually called “signs”, etc.). Ergo, machine learning is “not” linguistic processing similar to human minds.

What is rather astonishing in this context:

  1. You can simulate language processing without using meaning, just the mathematical and statistical processing of patterns in large datasets. In this sense, this machine-based simulation is non-linguistic language processing.

  2. You can simulate language processing while avoiding consciousness as well (surfing in associative networks, the seamless switching between literal and figurative interpretations, etc.)

  3. You can also simulate human communication processes by avoiding the coordination of behavior via socioemergent communication.
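Point 1 can be illustrated with a toy example. The sketch below is not how modern LLMs work (they use neural networks over huge corpora), but it shows the same form-only principle on a tiny invented corpus: a bigram model that generates plausible-looking word sequences purely from co-occurrence counts, with no representation of meaning anywhere.

```python
import random
from collections import defaultdict

# Toy corpus: the model only ever sees sequences of word forms, never meanings.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Record which word follows which (bigram statistics).
followers = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev].append(nxt)

def generate(start, length, seed=0):
    """Generate text by repeatedly sampling a next word
    from those observed to follow the current one."""
    random.seed(seed)
    words = [start]
    for _ in range(length):
        words.append(random.choice(followers[words[-1]]))
    return " ".join(words)

print(generate("the", 5))  # grammatical-looking output from pure statistics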

In this sense, these simulations based on the mathematical and statistical processing of patterns are threefold:

  • a-linguistic (no human language because there is no meaning processing)
  • a-conscious (there is no consciousness involved)
  • a-communicative / a-social (there is no socio-emergent communication involved)

and the results are still nothing but “impressive”.

In short, AI based on ML should rather be seen as the (rudimentary) evolution of a genuine machine intelligence that does not process (human) language and has neither conscious nor communicative (in a socio-emergent sense) characteristics.

You could say that this type of intelligence is completely “a-human”, but still capable of providing solutions that we humans, not always but more and more often, consider “intelligent”.

In other words, this type of machine intelligence is “a-human” because it lacks the three dimensions that make humans human: language processing based on form-meaning pairs, consciousness and coordination of behavior via socio-emergent communication.


“this comparison with human “true” intelligence seems kinda lame to me.”
The “interesting” aspect of artificial / machine intelligence is exactly that: its “a-humanness”. And therefore, it does not replace human intelligence, it just complements it - at least for now.


Re: “[…]the child builds instinctively and with minimal exposure to information[…]”
I think this is a reference to some concepts that are closely linked to Chomsky the linguist, for example: Universal grammar - Wikipedia
The theories of another linguist who is held in high esteem in these forums, Stephen Krashen, are often understood to be built on Chomsky’s notion of universal grammar. The idea is that all humans have a “language acquisition device” that allows them to unconsciously acquire the rules of a second language through exposure to comprehensible input.
So quite interesting that you object to this particular passage. But I have to say, I’m skeptical myself.


“The idea is that all humans have a “language acquisition device” that allows them to unconsciously acquire the rules of a second language through exposure to comprehensible input.”
Nowadays, we can attack every concept in this sentence:

  • the idea of “humans” (as well as “geniuses”, see SI’s comment above) - esp. from a humanist perspective
  • the idea of “language acquisition”
  • the idea of such a linguistic “device”
  • the idea of the “innateness” of such a language device
  • the idea of an “unconsciousness”
  • the idea of “rules”, esp. as rule-based language learning / processing
  • the idea of “input” (including the whole sender-receiver and input-output model of communication for coordinating human behavior)


But I’d say we would have to dive really “deep” into science for that - and this forum is not the right place for that…

I object to the “minimal exposure” part mostly. But, obviously, children do it instinctively, I can’t deny this.
Also, saying that children process language because they have a different “operating system” is very simplistic. I doubt the OS terminology really applies either to AI or, even more so, to human intelligence. Learning is literally building “operating hardware” rather than some sort of software, even if the software metaphor is a convenient way to think about it.
We’ve got used to considering our thoughts as lines of a program or script. But behind any of our processing lie specific populations of neurons and the connections between them, unlike in a computer, where any line of code makes requests to a universal computing architecture: a processor, RAM and data storage.
In other words, each time we acquire a skill, we’re building specific hardware rather than software. That’s why we can’t just download a skill into our minds. Even something purely theoretical needs time to become our personal, effective knowledge.

If some of you want to know alternative approaches to Chomsky’s Universal Grammar, then starting with this handbook could be a good idea:

“The Handbook of Language Emergence
Brian MacWhinney, William O’Grady
First published: 2 January 2015”


Btw, socio-emergence is more of a problem for AGI (Artificial General Intelligence). I guess the “decision-making under uncertainty” thing is all about this, because there’s no way of evaluating any decision except by its consequences for the makers and those around them. Without socio-emergence, we wouldn’t be better at decision-making under uncertainty than some random machine.

Well, I draw here on some insights of social complexity research in particular and the paradigm of complex (adaptive) systems in general when referring to “social emergence”.

The important point here is: an “individualistic” perspective focusing on individual human minds (psyche / consciousness) is obsolete.
In other words, the evolution of consciousness is closely related to communication as an emergent dimension that deals with coordinating behavior.

We can also say that communication comes first because a human baby may be born with a psyche for processing sensory information, but not with a consciousness.
Only when a human psyche is confronted with communication processes over an extended period of time does a language-based consciousness pop up (at least that’s the main hypothesis of approaches such as social systems theory).

However, I agree that this view is more related to the evolution of a general AI. But even a general AI won’t communicate in a human sense…

Anyway, what I wanted to stress in my comment above is this: when it comes to AI, we really have to deal with something “alien”, i.e. “a-human”. Otherwise, we constantly tend to create anthropomorphic projections without much intellectual value…


To me, the connection between Chomsky and the language learning community is Krashen; who is basically Chomsky applied to SLA. Pretty much, wherever I look, Krashen is cited, quoted, venerated. Lots of language sites, programs and gurus invoke Krashen whenever they want to give their own hypotheses academic credence. I see that Language acquisition device - Wikipedia says that this “device” is claimed to be “pseudoscience” or that the “generative grammar” has been “rejected”. So their durability in language learning circles may appear surprising. But that’s obviously not directly related to the topic.


“rather than causal explanation, which is the mark of true intelligence.”

“Causality” is just one of the attempts human minds use to navigate their environments.
However, causal thinking often fails because

  1. there may be too many causes
  2. there may be too many effects
  3. there may be too many relationships between possible causes and possible effects

Or to put it differently (again drawing on insights from complexity science):
Causality only works well for “simple” systems. It often fails for “complicated” systems.
And it completely collapses when dealing with “complex” systems, which are per se unpredictable, so that neither causal thinking nor statistics can be used here.

So you could rephrase the quote mentioned above:
“… causal explanation is the mark of an intelligence that deals with simple systems / phenomena - and that’s the uninteresting part of intelligence” :slight_smile:

If you remember, Toby wrote in one of our previous threads (with hellion) that the riddle of second language acquisition was “solved” and we shouldn’t reinvent the wheel.

I’d say: YES and NO.

Yes - when it comes to many “practical” strategies such as read / listen a lot, etc.

No - when it comes to scientific explanations of what is involved here: consciousness, language / media, and communication.
That’s still “highly controversial” - and there’s no “definite” answer in sight (but, ok: there are “no” definite answers in science, anyway, just falsifiable hypotheses that are still valid).

Language learners who have no background knowledge in linguistics, communication research, psychology, neuroscience, etc. tend to mix both dimensions, i.e., practical SLA and scientific SLA, so that the impression is created that we can really “explain” everything that is going on here.

“So their durability in language learning circles may appear surprising”.
No, language learning communities are rather common-sense based (with many confirmation biases). They tend to be “decades” behind what is being discussed in SLA-related scientific disciplines… Ergo, there is no surprise here :slight_smile:


For LingQers that are interested in Chomsky’s ideas and why new (usage-based) approaches abandoned his ideas, see:

Personally, I’d combine this with social complexity research. The result is something that goes far beyond common sense discussions about language / media, consciousness, communication, language learning, reality construction …

Do you remember the literature concerning Jacques Derrida that I recommended to you a few months ago? Among them was Thomas Khurana’s book “Sinn und Gedächtnis” (“Sense and Memory”). It is one of the more advanced approaches that are relevant in our context. See:

Have fun


Will: “Can a robot write a symphony?”

AI: “Sure, pal. I can create millions of them. Where’s the challenge?”

Peter: “Oh, Will. You sound so 1950 - I’m really depressed now :-)”


So far they can’t. A symphony as a musical form implies there has to be dramaturgy, some kind of “story”. And a “musical story” isn’t such a trivial task for AI. But nothing is impossible; I’m sure we will see HansZimmerAI at some point :slight_smile:

I don’t know how good AIs are at writing symphonies at the moment.
But an AI finished Beethoven’s last unfinished symphony in 2021:

AI: “Hans, we’re coming for you as well!” :slight_smile:

However, IMO, that’s no problem because the most promising combo right now is: “humans + AIs”. But this might change in the future, esp. when they are connected to our brains…


Wow! That’s cool, didn’t know about this project. Sounds really nice and authentic!

RE: solved, I agree. What I mean by solved is the methods and strategies that are effective for adult language learners. The science and mechanisms of language acquisition are by no means solved, for children or adults.

Though to be honest, the more language I have acquired, the less interesting the science is to me. In fairness to me, my lack of interest is probably directly related to Chomsky and Krashen. A lot of pop science these days seems to want to have simple digestible models, because it makes things easy, irrespective of the fact that reality is far more complex.

A fun factlet about me - a “life goal” of mine is to have an idiolect of English that Chomsky would viscerally react to. Preferably, disapproval.


…that Chomsky would viscerally react to. Preferably, disapproval.

The disapproved idiolect could be like:
As an AI language model, I’m…

A lot of pop science these days seems to want to have simple digestible models, because it makes things easy, irrespective of the fact that reality is far more complex.

I can relate to this so much. There are so many areas in which science will always trail behind the actual experience of practitioners, who stay humble about their understanding of things but explore and use the possibilities in the genuine spirit of knowledge, delving right into the thick of things. People started to use yeast and fungi long before they knew what they were. Or what can a scientific model tell chess players about how to play chess, other than “think as many steps ahead as you can”? Well, I mean, thanks :slight_smile: