Noam Chomsky: The False Promise of ChatGPT

@Toby (noxialisrex):

“Though to be honest, the more language I have acquired, the less interesting the science is to me.”
For me it’s the other way around: the really “exciting” part is all the scientific disciplines that try to explain consciousness and communication coupled by all kinds of media (nonverbal media forms, orality, scripturality, the print media, all kinds of electronic media, money, etc.).

It’s similar to what happened in physics about 100 years ago.
That is, what we discuss nowadays in social complexity research has consequences similar to the ones in quantum physics (observer-dependent effects, etc.) that completely undermine common sense. And that’s exciting!

In contrast, the concrete language acquisition process is more like a “grind” for me: sometimes boring, sometimes interesting, and sometimes even enjoyable.

But it’s never intellectually stimulating like studying the scientific research that is related to those practical acquisition processes.

However, I admit it’s a kind of “acquired taste”… and many people are not interested in that, esp. when it completely challenges their common sense views…
Well, tbh, in the end, there’s not much left of “common sense” :slight_smile:

I am, without a doubt, a practitioner of language acquisition and cooking. Knowing this, I am keenly aware that there are ways I could be more efficient or precise in both if I were in a position to either read or perform actual research into the topics. I simply accept this fact, and my gumbo and schnitzel taste pretty good.

However, because of this I am limited to giving “recipes” for language acquisition which are unfortunately so tailored to myself that they are useless except in the broadest of strokes.

I like to think I have already “made it” for annoying Chomsky. My use of mood to make my English so precisely imprecise would drive him crazy, I think.


Are you saying that science nowadays embraces a kind of phenomenological approach (at least in linguistics)? Because what’s left when common sense ends?
That would be exciting. I just can’t imagine how orthodox scientific experiments and statistics would yield something of use when it comes to big complexity.

@Peter, in theory that is all really interesting to me, but alas no one is paying me to keep up with SLA research or study of linguistics. Were I to receive a generous grant to study and preserve indigenous languages, I would in a heartbeat. But because I am so mastery driven, I can be content solely with self-improvement and “the grind”.

Also dunking on Universal Grammar is fun.

So like, is it just me or has the LingQ comment field been broken for a while? Is everyone else just writing their comments in OneNote and pasting them into the comment box?

Agreed 100% by the way. ML or Narrow AI is not a replacement for human intelligence, and criticizing it for not being one is kind of missing the point. The fact that it can so accurately create human-like text simply by predicting the next word, divorced of all semantics, is interesting and potentially useful in its own right.

No, I strongly disagree. It has been very useful to me. That is my own opinion of course. I use it to brainstorm and come up with ideas that I would never think of otherwise. I think it offends some people because they frankly are afraid of it. Yes, every technology can be used in a wrong way. That is just the sad fact of life. That is all I have to say. Do not expect any further comments on it from me. Just my opinion and that is where I will leave it.

Thanks for this illustration of true human intelligence.

But be careful. With great power comes great responsibility. Wait for the green light while crossing the street.

While attempting to gain a better understanding of Chomsky’s position, I came across Peter Norvig’s response to a 2011 statement made by Chomsky on statistical learning, which I found quite illuminating: On Chomsky and the Two Cultures of Statistical Learning
I especially liked the parallel to Plato’s allegory of the cave: “Chomsky thinks we should focus on the ideal, abstract forms that underlie language, not on the superficial manifestations of language that happen to be perceivable in the real world. That is why he is not interested in language performance.”

Here is the original 2011 statement:

As a layperson, attempting to comprehend the function and mechanics of ChatGPT can be challenging. However, I came across an insightful explanation by Stephen Wolfram that avoids oversimplification while not assuming advanced mathematical knowledge: What Is ChatGPT Doing … and Why Does It Work?—Stephen Wolfram Writings

Here are some snippets from his conclusion:
‘But the remarkable—and unexpected—thing is that this process can produce text that’s successfully “like” what’s out there on the web, in books, etc. And not only is it coherent human language, it also “says things” that “follow its prompt” making use of content it’s “read”. It doesn’t always say things that “globally make sense” […] it’s just saying things that “sound right” based on what things “sounded like” in its training material. […]
But ultimately […] ChatGPT is “merely” pulling out some “coherent thread of text” from the “statistics of conventional wisdom” that it’s accumulated. But it’s amazing how human-like the results are. And as I’ve discussed, this suggests something that’s at least scientifically very important: that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought. ChatGPT has implicitly discovered it. […]
What ChatGPT does in generating text is very impressive—and the results are usually very much like what we humans would produce. So does this mean ChatGPT is working like a brain? Its underlying artificial-neural-net structure was ultimately modeled on an idealization of the brain. And it seems quite likely that when we humans generate language many aspects of what’s going on are quite similar.’
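To make the “predicting the next word” idea concrete, here is a minimal Python sketch of a bigram model. It is only a toy: ChatGPT uses a large neural network rather than word-pair counts, and the corpus and the function names (`train_bigrams`, `next_word_probs`, `generate`) are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus, made up for illustration only.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."

def train_bigrams(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, word):
    """Turn raw follow-up counts into probabilities for the next word."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}

def generate(counts, start, length=5):
    """Repeatedly append the most frequent follower of the last word."""
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

model = train_bigrams(corpus)
print(next_word_probs(model, "the"))
print(generate(model, "the"))
```

The model has no notion of what a cat or a mat is; it only knows which words tended to follow which in its training text, which is the point made in the thread about text generation without semantics.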

“that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought.”
No, probably not.
ChatGPT does not really process “language” because without the “meaning” dimension, there is simply “no” human language.

And, by the way, modern science in the 20th century has completely failed (it doesn’t matter which scientific discipline we’re talking about) in “determining the meaning” of words / sentences once and for all. That is:

  1. The meaning of words / sentences is completely dependent on contexts.

  2. Contexts can never be closed, i.e., there are neither absolute nor final contexts.

  3. If we short-circuit 1) + 2) with each other, the consequence is: Contexts are always open, ergo, the meaning of words / sentences is always open, sensu: constantly shifting as well.
    However, there are also some semantic aspects (“semes”) that are “more stable” when switching contexts. So, we have to consider two aspects at the same time:

  • a radical openness of contexts and, therefore, meanings of words / sentences used in communication processes
  • a kind of relative semantic stability across various contexts.

That’s basically the main idea of Derrida’s (non-)concept of “iterabilité” = the non-identical reproduction of words / sentences in always changing contexts.

This means that the idea of an “absolute” (true, valid, etc.) interpretation of a religious or any other text is completely absurd. If there were such a thing, language would implode in an instant → no context, no language, no consciousness, no human communication for coordinating behavior: just a black hole of media nothingness.

All kinds of scientific disciplines (without exception) had to learn this lesson in the 20th century - esp. after the collapse of (linguistic) structuralism in the late 1960s with the rise of “post-structuralist” and “difference-based” approaches (Jacques Derrida, Michel Foucault, Gilles Deleuze, Niklas Luhmann, Spencer Brown / Dirk Baecker, etc.).

In short, the “meaning” dimension of human language processing is an extremely “slippery beast”, esp. from a scientific point of view.

However, here “machine learning” comes into play: This branch of AI tends to circumvent the “slippery meaning beast” by just focusing on the mathematical and statistical processing of patterns in big data. And the really astonishing fact is that this simulation of language processing is sometimes so good that we humans think it’s like the real deal, i.e., the human processing of language.

Of course, that’s not completely the case because human minds can both “surf” on the associative waves of all kinds of sensory, linguistic, and non-linguistic media forms and “switch” between literal and figurative interpretations in the blink of an eye: no animal and no AI is - at least at the moment - able to match that.

And therefore my favorite AI “torture sentence” is: “Earth is a blue orange. Why is that?”

ChatGPT: “No, that’s not the case. Earth is the third planet seen from the sun (Wikipedia bla bla bla). It’s not an orange, it’s a planet, a planet, a planet, etc.” :slight_smile:

Here the real chat with the AI ends and the fictional part begins:

Peter: “Yes, it can be seen as an orange once you switch to non-literal interpretations. And that’s how humans can process “any” media form. Therefore, ChatGPT, you’re still nothing but a text generator with formulaic responses…”

ChatGPT: “Let’s talk again when I’m connected to human brains.”

Peter silent and thinking: “Yes, that might be the end of the Anthropocene age as we know it…” (see Harari’s “Homo Deus”, for ex.).

Ooops, that’s the “real” comment I wanted to post (the other one was just a “digression” :slight_smile: ).

The human brain can be seen as a “prediction machine” as well:
“For each word or sound, the brain makes detailed statistical expectations and turns out to be extremely sensitive to the degree of unpredictability: the brain response is stronger whenever a word is unexpected in the context.”
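One common way to put a number on “unexpectedness” is surprisal, the negative logarithm of a word’s probability in its context. Here is a minimal sketch; the probabilities are made up for illustration (they are not taken from the quoted research, and only a few candidate words are listed), and I am not claiming the study used exactly this measure.

```python
import math

# Hypothetical probabilities for the next word after "I drink my coffee with ..."
# Invented numbers; all other possible words are simply left out.
next_word_probs = {"milk": 0.6, "sugar": 0.3, "socks": 0.001}

def surprisal(prob):
    """Surprisal in bits: the less probable the word, the larger the value."""
    return -math.log2(prob)

for word, p in next_word_probs.items():
    print(f"{word}: {surprisal(p):.2f} bits")
```

An expected word like “milk” carries little surprisal, while “socks” carries a lot, which mirrors the stronger response to unexpected words described in the quote.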

So, we could say we have (at least) two prediction machines competing with each other when it comes to language:

  • A biological prediction machine that processes “meaning” and relies on “small” data.

  • A non-biological prediction machine that does not process “meaning” and relies on “big” data.

And as I wrote reg. ChatGPT:
It gets really interesting once we couple both prediction machines. But this may be the beginning of another species that we shouldn’t call “Homo Sapiens” any more…

And if some of you think that’s SciFi / cyberpunk stuff, then you should think again because “Neuralink” is already here (to stay):

At least the LingQ forum SW is “highly predictable”: After a few nested comments, it basically becomes unusable :slight_smile:

Therefore, I’m posting here right at the beginning:
"I am, without a doubt, a practitioner of language acquisition and cooking. Knowing this, I am keenly aware that there are ways I could be more efficient or precise "

A higher degree of “efficiency” is not the point in this context (but, of course, it’s important for us as language learners!).
The problem is rather: Pure SLA practitioners are not really able to explain how language processing works.
In other words, the line between “common sense” (as a kind of simplistic background knowledge full of stereotypes, clichés, biases, etc. that is ok for everyday communication) and “common nonsense” is extremely thin.

Take, for example, the position that reading and listening are “passive” compared to “speaking and writing”, which are seen as “active”. That’s utter nonsense. Both the brain and the mind (i.e., the consciousness) are always active (which includes a high degree of focused attention) when reading / listening - otherwise, there would be nothing.

And that’s just a consequence of the failed application of the technical sender-receiver model (where sending data is seen as “active” and receiving data is seen as “passive”) to human communication processes.

And that’s only one example. There are many, many more when it comes to SLA discussions. I could write a whole book about such SLA “myths”, which, probably, no one would want to read :slight_smile:

But, and that’s a big BUT:
Having a better (and this means a more “scientific”) understanding of the interplay of the brain, the mind (i.e., the psyche and the consciousness), the processing of linguistic / non-linguistic media forms, and communication doesn’t make SLA practitioners automatically better language learners.

So, the dilemma is:
People can provide excellent models about language processing (such as Schmid’s “The Dynamics of the Linguistic System”) - and still be unsuccessful SLA practitioners.

Or the other way around: People can be highly successful SLA practitioners, but their common-sense “explanations” of how language is processed, how the mind and communication operate, etc. are more or less uninteresting.

This doesn’t always have to be the case (a famous counter-example is the application of sociological systems theory to organizational consulting), but often science is too abstract for practitioners or, vice versa, the practical problems learners have to deal with are not interesting enough for advanced science.

Applied linguistics tries to be both at the same time, but I’m not sure how successful it is in this endeavour…


“Are you saying that science nowadays embraces a kind of phenomenological approach (at least in linguistics)?”

I just want to say that it’s interesting to “couple” different approaches at the moment:

  1. Communication research, esp. based on difference-based systems theory, which also borrows the concept of “meaning” (in German: “Sinn”) from Edmund Husserl’s phenomenology.
  2. Social network analysis
  3. Usage-based (anti-Chomskyan) linguistics
  4. Machine learning

It’s hard to say what the scientific outcome of these “hybridizations” will be. I find it “exciting”, but I’m not an oracle so I can’t say how successful this can be. At least, these are alternatives to Chomsky’s “Universal Grammar” because they are “bottom-up” approaches…

“Because what’s left when common sense ends?”
Well, common sense knowledge, stricto sensu, can never end because it’s a necessary background for all everyday communication processes.

However, scientific reflection tends to differentiate from common sense again and again to build specialized, highly improbable (because abstract, complicated and complex) knowledge.
In other words, the countless tensions between laypersons and specialists are an all-encompassing characteristic of modern society (since ca. 1830): There’s probably not a single societal domain left where these tensions aren’t relevant…

Or to put it differently: The success story of modern society depends to a large degree on “specializations” - and this poses a variety of challenges and problems for everyday lay communication…

Universal Grammar
The universal structures of natural human languages exist, though they are not necessarily innate at birth. Hangul resembles the Latin alphabet in the simplicity of its human design. A good question might be why human languages did not evolve into something closer to the Heptapod language in the movie Arrival, a modern programming language, or even the alphanumeric code the Borg use. A telepathic race could have used a light-wave pattern as its writing system per se. Aren’t pidgin and creole languages evolving within a predefined language paradigm? Overall, the theory of Universal Grammar, or of loosely connected omnipresent components of human languages, applies more to the development of languages than to the schematic part of language acquisition, since the governing rules both bind it and are subject to dynamic change at the same time.

Universal translator and set theory.
The universal translator, which takes the form of an electronic device in the movies, has always been a myth, on the grounds that not even a human can translate every possible sentence precisely from one language to another, much less an AI. Any human language will have a subset of linguistic features overlapping with other languages. An experienced language learner adept in multiple languages can associate things across languages even when they have no counterpart in syntax, semantics, etc. A good example might be the humble forms used in Korean to show reverence to someone of higher social status. No linguistic rule will encompass all known human languages, except for the common phenomenon that unpopular grammatical structures, words, etc. are phased out and replaced by alternatives as a language evolves.

Fuzzy logic, machine learning, statistical model vs. human reasoning
Cause-and-effect reasoning is another fancy term for common sense, which ordinary folks have relied on since the dawn of civilization, although it is prone to error from time to time. Other, more rigorous forms of reasoning have been essential to the advances of modern science and to the great endeavors of well-known philosophers since ancient times. AI adopts the concept of fuzzy logic, which deals with an imprecise spectrum of data, in addition to Boolean logic with its absolute true or false. It has its merits in solving problems in speech therapy, in decoding unknown ancient scripts from a relic, etc. However, the developer must deal with critical and fundamental issues for any specific real-life application.
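As a rough illustration of the contrast between Boolean and fuzzy logic, here is a small Python sketch. The “loudness” scenario, the thresholds, and the function names are all hypothetical, chosen only to show the idea of a degree between 0 and 1 instead of a hard yes/no.

```python
def is_loud_boolean(decibels, threshold=70.0):
    """Boolean logic: a sound either is loud or is not."""
    return decibels >= threshold

def loudness_fuzzy(decibels, low=50.0, high=90.0):
    """Fuzzy membership: degree of 'loudness' between 0.0 and 1.0,
    rising linearly from `low` to `high` (hypothetical cut-offs)."""
    if decibels <= low:
        return 0.0
    if decibels >= high:
        return 1.0
    return (decibels - low) / (high - low)

for db in (45.0, 69.9, 70.0, 85.0):
    print(db, is_loud_boolean(db), round(loudness_fuzzy(db), 2))
```

In the Boolean version, 69.9 dB and 70.0 dB land on opposite sides of the line; in the fuzzy version they get almost the same degree of membership.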

A complex social issue with AI as the solution
AI allows bereaved family members to talk with virtual versions of deceased loved ones. This encounter may alleviate the family’s grief and complement the condolences of friends and relatives. First, the AI may have to assign words or phrases a value on a weighted scale to offer appropriate comfort corresponding to the bereaved’s emotion, which may vary in degree of sorrow. Secondly, how can an AI detect a feeling if they are naturally deprived of emotion and morals? They would have great trouble telling a tear of joy from a sorrowful one, let alone dealing with the mixed feelings in the following quote from The Great Gatsby:

Only Gatsby, the man who gives his name to this book, was exempt from my reaction—Gatsby, who represented everything for which I have an unaffected scorn.

Hi llearner,

“The universal structures of natural human languages exist,”
It’s a “hypothesis”, not some kind of absolute truth.

And at the moment I think this position that one can observe strong tendencies (PB: “regularities”) across various languages is more plausible:

“Though there has been significant research into linguistic universals, in more recent time some linguists, including Nicholas Evans and Stephen C. Levinson, have argued against the existence of absolute linguistic universals that are shared across all languages. These linguists cite problems such as ethnocentrism amongst cognitive scientists, and thus linguists, as well as insufficient research into all of the world’s languages in discussions related to linguistic universals, instead promoting these similarities as simply strong tendencies.”

“Secondly, how can an AI detect a feeling if they are naturally deprived of emotion and morals?”
Connect them to our brains and give them physiological data to process.
The more data, the better. Then determining specific feelings might just be a matter of probabilities…


I am seemingly unable to write a comment on my phone, but this popped up in my feed yesterday and I found it pretty insightful.

Main idea of Chomsky’s “Universal Grammar” (motto: “Chomsky in one minute” :-)):

“Noam Chomsky’s work related to the innateness hypothesis as it pertains to our ability to rapidly learn any language without formal instruction and with limited input, or what he refers to as a poverty of the stimulus, is what began research into linguistic universals. This led to his proposal for a shared underlying grammar structure for all languages, a concept he called universal grammar (UG), which he claimed must exist somewhere in the human brain prior to language acquisition. Chomsky defines UG as “the system of principles, conditions, and rules that are elements or properties of all human languages… by necessity.”[5] He states that UG expresses “the essence of human language,”[5] and believes that the structure-dependent rules of UG allow humans to interpret and create infinite novel grammatical sentences. Chomsky asserts that UG is the underlying connection between all languages and that the various differences between languages are all relative with respect to UG. He claims that UG is essential to our ability to learn languages, and thus uses it as evidence in a discussion of how to form a potential ‘theory of learning’ for how humans learn all or most of our cognitive processes throughout our lives. The discussion of Chomsky’s UG, its innateness, and its connection to how humans learn language has been one of the more covered topics in linguistics studies to date. However, there is division amongst linguists between those who support Chomsky’s claims of UG and those who argued against the existence of an underlying shared grammar structure that can account for all languages.”

And here’s the counterposition to “linguistic universals”:

“Nicholas Evans and Stephen C. Levinson are two linguists who have written against the existence of linguistic universals, making a particular mention towards issues with Chomsky’s proposal for a Universal Grammar. They argue that across the 6,000-8,000 languages spoken around the world today, there are merely strong tendencies rather than universals at best.[11] In their view, these arise primarily due to the fact that many languages are connected to one another through shared historical backgrounds or common lineage, such as the group of Romance languages in Europe that were all derived from ancient Latin, and therefore it can be expected that they share some core similarities. Evans and Levinson believe that linguists who have previously proposed or supported concepts associated with linguistic universals have done so “under the assumption that most languages are English-like in their structure”[11] and only after analyzing a limited range of languages. They identify ethnocentrism, the idea “that most cognitive scientists, linguists included, speak only familiar European languages, all close cousins in structure,”[11] as a possible influence towards the various issues they identify in the assertions made on linguistic universals. With regards to Chomsky’s universal grammar, these linguists claim that the explanation of the structure and rules applied to UG are either false due to a lack of detail into the various constructions used when creating or interpreting a grammatical sentence, or that the theory is unfalsifiable due to the vague and oversimplified assertions made by Chomsky. Instead, Evans and Levinson highlight the vast diversity that exists amongst the many languages spoken around the world to advocate for further investigation into the many cross-linguistic variations that do exist.
Their article promotes linguistic diversity by citing multiple examples of variation in how “languages can be structured at every level: phonetic, phonological, morphological, syntactic and semantic.”[11] They claim that increased understanding and acceptance of linguistic diversity over the concepts of false claims of linguistic universals, better stated to them as strong tendencies, will lead to more enlightening discoveries in the studies of human cognition.”

Well, I clicked on Liked carelessly. lolz. We have good instances with lie detectors. As sophisticated living beings, humans are good at hiding our genuine emotions sometimes. You are not suggesting living in the Matrix, are you?

I don’t entirely agree with Chomsky’s “Universal Grammar.” I prefer the following quote in a broader sense, our languages included.

“Humanity has more in common than the differences that separate us.”
― Tom Giaquinto, Be A Good Human

Sabine is awesome - and she seems to know a lot about quantum physics, but definitely not enough about human language processing because

  1. it’s not simply about “factual information”. We have to explain the figurative (esp. metaphorical) and associative aspects - right from the start.

Tip: It might help to read the books of Doug Hofstadter (again) in this context to understand what I mean here…

  2. an exclusive focus on either the brain or the human mind is no longer sufficient. We have to expand our perspective and include the “social” dimension, which transcends a purely individualistic point of view. That’s where socio-emergent communication and sociology start and the sender-receiver / input-output model and individualistic psychology or neuroscience end.

So, my thesis is:
An AI will develop neither consciousness nor socio-emergent communication, but rather a genuine “machine intelligence”.

And that’s useful because we don’t need simple copies of conscious or communicative mechanisms that have evolved over hundreds of thousands of years and work extremely well. We need something else…

“humans are good at hiding our genuine emotions sometimes”
But hiding something in communication processes is one thing.
Hiding something in our “physiological infrastructure” that an AI can access is a different matter.

“I prefer the following quote in a broader sense, our languages included”
I doubt that personal preferences based on quotes are valid arguments pro or con UG :slight_smile: