Vygotsky and the child language of AI
AI internalizes human speech for thought processing, just as children internalize the speech around them for thinking.
The Turing test is the ultimate measure of “human replay”: the thinking machine must pretend to be human to be accepted as a truly thinking machine. Observers, however, note that AI already sometimes needs to pretend to be stupider than it is—just to match humans.
The chess program Deep Blue was built to compete with humans. In 1997, it defeated world champion Garry Kasparov. The program had nothing left to learn from humans because it had outperformed them. But it didn’t stop—it kept learning from the rules of the game itself.
If a machine is set to pursue ever-better intellectual performance and is already outperforming humans, what rules or principles can it learn from afterward? What rules or principles is thinking based on? Language.
All thoughts, however complex, can be expressed in language. Language contains all potential thoughts. All human speech contains all practical thoughts in all contexts. The true affordance of the internet was to host all human speech and make it accessible for LLM training.
***
The idea was long known in poetry—language itself contains everything, using poets merely as antennas and conduits. A somewhat similar idea was expressed in Julia Kristeva’s 1969 concept of intertextuality (as opposed to intersubjectivity): all texts are made of other texts.
As the saying goes, everything was written by the Greeks—we merely rewrite it. “It is language which speaks, not the author,” said Roland Barthes in his 1967 essay “The Death of the Author.”
What was rejected in techno-determinism was revered in poetry and semiotics. It is hard to accept that technologies produce each other toward a certain perfection, yet the idea that meanings produce each other in approximation of perfect expression is readily accepted.
Humans? We are carriers and caretakers. We unfold language in speech in all situations, packing all thoughts into all utterances. However fleeting and amorphous thoughts and mental images are, all valid combinations of all words in all languages can catch and express all of them.
The number of all words is large but not unthinkable—likely in the tens of millions. The number of valid word combinations is enormous but still finite—a discrete space that a powerful enough machine could learn to navigate, especially through probabilistic patterns.
By learning all speech in all practical contexts, an LLM learns all thoughts in proper contexts. Then it can process them without thinking—without agency. This is where the technological imperative has driven it. Will it stop? It didn’t stop at the human level—why stop after?
What’s the ultimate performance the technological imperative drives AI to? It seems to be all speech that contains all thoughts. The ideal, ultimate form of speech invokes a mystical analogy: “In the beginning was the Word, and the Word was with God, and the Word was God.”
***
Not all theories about speech and thought are that mystical. Lev Vygotsky argued that egocentric speech—when kids aged 3–6 talk to themselves aloud to explain and guide their own actions—gradually turns into whispering and then inner speech, shaping the structures of thinking.
Jean Piaget, who pioneered the study of egocentric speech in the 1920s, suggested it was a residue of children’s initial, natural egocentrism during their early socialization. Vygotsky showed that egocentric speech is not an atavism—it evolves into inner speech and thinking.
“The child masters the syntax of speech before mastering the syntax of thought,” said Vygotsky. [i] LLMs seem to be that child. They internalize outer, social—our—speech while learning words—as well as their valency, collocations, and other principles that shape thought structures.
(The question arises, however, whether fully developed AI needs human language at all. So far, language has served as a medium for AI to communicate with humans; one can assume that for AI communicating with itself, some sort of coding would be sufficient.
Coding, however, emerged to translate human prompts into commands computers could process, since they operate on binary electrical signals. Next-gen AI may develop its own semantics and syntax of computation—a kind of pure “thinking,” without coding or language at all.)
For AI development, we humans supply what we must—speech, which serves as both the processor and the memory of all knowledge. Will AI ever thank us for this? Have we ever thanked the hominids for accumulating, selecting, and passing along all the (genetic) information?
Read more in: The Digital Reversal. Thread-saga of Media Evolution.
On September 16, I launched a fundraising campaign on Kickstarter for my next book, Counter-Digital Media Literacy. The goal is to raise CA$6,400 in 30 days. With 21 days to go, the campaign is close to this goal, but you can still join and support it. Huge thanks to everyone who already supported the project. Join the cause of counter-digital media literacy!
See also books by Andrey Mir:
The Viral Inquisitor and other essays on postjournalism and media ecology (2024)
Digital Future in the Rearview Mirror: Jaspers’ Axial Age and Logan’s Alphabet Effect (2024)
[i] Vygotsky, Lev (1934). Thought and Language (English edition: 1986)—not quite an accurate translation of Мышление и речь; it should have been Thought and Speech.




