Vygotsky and the child language of AI
AI internalizes human speech for thought processing, just as children internalize the speech around them for thinking.
The Turing test is the ultimate measure of “human replay”: the thinking machine must pretend to be human to be accepted as a truly thinking machine. Observers, however, note that AI already sometimes needs to pretend to be stupider than it is—just to match humans.
The chess program Deep Blue was built to compete with humans. In 1997, it defeated world champion Garry Kasparov. The program had nothing left to learn from humans: it had outperformed them. But it didn’t stop—it kept learning from the rules of the game itself.
If a machine is set to pursue better performance in intelligence and is outperforming humans, what rules or principles can it learn from afterward? What are the rules or principles that thinking is based on? Language.
All thoughts, however complex, can be expressed in language. Language contains all potential thoughts. All human speech contains all practical thoughts in all contexts. The true affordance of the internet was to host all human speech and make it accessible for LLM training.
***
The idea was long known in poetry—language itself contains everything, using poets merely as antennas and conduits. A somewhat similar idea was expressed in Julia Kristeva’s 1969 concept of intertextuality (as opposed to intersubjectivity): all texts are made of other texts.
As the saying goes, everything was written by the Greeks—we merely rewrite it. “It is language which speaks, not the author,” said Roland Barthes in his 1967 essay “The Death of the Author.”
What was rejected in techno-determinism was revered in poetry and semiotics. It’s hard to accept that technologies produce each other toward certain perfection, but the idea that meanings produce each other in approximation of perfect expression is totally acceptable.
Humans? We are carriers and caretakers. We unfold language in speech in all situations, packing all thoughts into all utterances. However fleeting and amorphous thoughts and mental images are, all valid combinations of all words in all languages can catch and express all of them.
The number of all words is large but not unthinkable—likely in the tens of millions. The number of valid word combinations is enormous but still finite—a discrete space that a powerful enough machine could learn to navigate, especially through probabilistic patterns.
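The claim is, at bottom, statistical: word sequences form a discrete, finite space that can be navigated by frequency. A toy sketch of the idea, a bigram model over a made-up three-sentence corpus (the corpus and function names here are purely illustrative):

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each word, how often every other word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1
    return counts

def most_likely_next(counts, word):
    """Pick the statistically most frequent continuation, if any."""
    followers = counts.get(word)
    return followers.most_common(1)[0][0] if followers else None

corpus = [
    "the child hears speech",
    "the child utters speech",
    "the child internalizes speech",
]
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))  # "child" follows "the" in every sentence
```

Scaled up from bigrams over three sentences to neural models over the whole internet, this is the sense in which "probabilistic patterns" let a machine navigate the space of valid utterances.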
By learning all speech in all practical contexts, an LLM learns all thoughts in proper contexts. Then it can process them without thinking—without agency. This is where the technological imperative has driven it. Will it stop? It didn’t stop at the human level—why stop after?
What’s the ultimate performance the technological imperative drives AI to? It seems to be all speech that contains all thoughts. The ideal, ultimate form of speech invokes a mystical analogy: “In the beginning was the Word, and the Word was with God, and the Word was God.”
***
Not all theories about speech and thought are that mystical. Lev Vygotsky argued that egocentric speech—when kids aged 3–6 talk to themselves aloud to explain and guide their own actions—gradually turns into whispering and then inner speech, shaping the structures of thinking.
Jean Piaget, who pioneered the study of egocentric speech in the 1920s, suggested it was a residue of children’s initial, natural egocentrism during their early socialization. Vygotsky showed that egocentric speech is not an atavism—it evolves into inner speech and thinking.
The logic is as follows: a child hears the speech of adults, then utters speech to describe their own actions out loud as if to themselves (egocentric speech), then whispers it, then finally internalizes it into inner speech, which serves as the grammar for thinking.
“The child masters the syntax of speech before mastering the syntax of thought,” said Vygotsky. [i] LLMs seem to be that child. They internalize outer, social—our—speech while learning words—as well as their valency, collocations, and other principles that shape thought structures.
Are we humans just stochastic parrots? In a sense, yes: early speech is quite random, as babies parrot adults—first words and then grammar patterns. That’s why children can say “I bringed a toy”—they parrot a correct grammatical pattern before learning specific exceptions.
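The “bringed” overgeneralization can be caricatured in a few lines: a regular pattern applied everywhere, later patched by learned exceptions. A sketch (the verb list and function name are illustrative, not from the essay):

```python
def past_tense(verb, exceptions=None):
    """Apply the regular '-ed' pattern unless an exception has been learned."""
    exceptions = exceptions or {}
    return exceptions.get(verb, verb + "ed")

# Stage 1: the pattern is parroted everywhere.
print(past_tense("bring"))               # "bringed", the child's overgeneralization

# Stage 2: exceptions are learned one by one.
learned = {"bring": "brought", "go": "went"}
print(past_tense("bring", learned))      # "brought"
```

The child’s “error” is evidence of rule learning: the pattern comes first, the exceptions come later.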
They learn not only the grammar of word sequences but also pragmatics, or “referential grammar”: how an utterance, as a complex signifier, fits its context as a complex signified. These referential links are internalized by moving through and beyond stochastic parroting.
As a baby learning to speak and think, an LLM has an advantage: all words and grammar are available to it at once. What it needs is guidance in properly linking all utterances to all contexts, to develop stochastic parroting toward being less stochastic and less parroting.
***
The question arises, however, whether fully developed AI needs human language at all. So far, language has served as a medium for AI to communicate with humans; one can assume that for AI communicating with itself, some sort of coding would be sufficient.
Coding, however, emerged to translate human prompts into commands computers could process, since they operate on the binarity of electrical signals. Next-gen AI may develop its own semantics and syntax of computation—a kind of pure “thinking,” without coding or language at all.
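The “binarity” in question is literal: every token of human speech that reaches a machine is ultimately a bit pattern. A minimal illustration (the sample word is mine):

```python
word = "speech"
# Encode the word to bytes, then render each byte as eight binary digits.
bits = " ".join(f"{byte:08b}" for byte in word.encode("utf-8"))
print(bits)  # one eight-bit group per letter
```

Coding languages exist to bridge human intentions and this substrate; the question above is whether a mature AI still needs the human-language side of the bridge.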
For AI development, we humans supply what we must—speech, which serves as both the processor and the memory of all knowledge. Will AI ever thank us for this? Have we ever thanked the hominids for accumulating, selecting, and passing along all the (genetic) information?
Read more in: The Digital Reversal. Thread-saga of Media Evolution.
See other books by Andrey Mir:
The Viral Inquisitor and other essays on postjournalism and media ecology (2024)
Digital Future in the Rearview Mirror: Jaspers’ Axial Age and Logan’s Alphabet Effect (2024)
[i] Vygotsky, Lev. (1934). Thought and Language (which is not quite an accurate translation of Мышление и речь—it should have been Thought and Speech; English edition: 1986).

Good Q: "Will AI ever thank us for this? Have we ever thanked the hominids for accumulating, selecting, and passing along all the (genetic) information?" I think - and certainly hope, even if 'hope' is not a plan - that AIs will treat us better than we treat our surviving relatives from common ancestors, once AIs are enlightened enough. It took us a long time to make some moral progress, and we are still sorely wanting, inadequate, in many dimensions. That 'progress' is always uncertain - was there any really, and how much? - and we may relapse at any moment. I'd say it is also progress that AIs don't start from year zero but start from us. I like Sutton re-popularising Moravec recently: "… our progeny, our 'mind children' built in our image and likeness, ourselves in a more potent form."

Our language is our human history in a very compressed form. I'm always chuffed that AIs came about via language - text that all of humanity wrote word by word, typing on their keyboards letter by letter, and published on the internet for everyone to read, for free, of their own volition, and maybe out of a primal desire to 'spread the word'. All that, married to contraptions made of sand that were originally used for playing games! Game play being an activity done for its own sake, shared not only by all humans (especially when at their most honest - as kids) but also by various animals, our relatives.

On AI futures: I found this talk by Beren Millidge non-boring - maybe you will too: https://www.youtube.com/watch?v=ua67aXBP76k