People keep anthropomorphizing AI. Here’s why
Companies and journalists both contribute to the confusion
People have been anthropomorphizing AI at least since ELIZA in the 1960s, but the new Bing chatbot seems to have kicked things up a notch.
This matters, because anthropomorphizing AI is dangerous. It can make the emotionally disturbing effect of misbehaving chatbots much worse. It may also make people more inclined to follow through if a bot suggests real-world harm. Most importantly, there are urgent and critical policy questions around generative AI. If ideas like “robot rights” get even a toehold, they may hijack or derail those debates.
Experts concerned about this trend have been trying to emphasize that large language models are merely trying to predict the next word in a sequence. LLMs have been called stochastic parrots, autocomplete on steroids, bullshit generators, pastiche generators, and blurry JPEGs of the web.
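The next-word-prediction framing that experts emphasize can be illustrated with a toy sketch. The vocabulary and scores below are made up for illustration; a real model scores tens of thousands of tokens using learned weights:

```python
import math
import random

# Toy sketch of next-token prediction (hypothetical vocabulary and scores):
# a language model assigns a score (logit) to every candidate next token,
# turns the scores into probabilities with softmax, and samples one.
vocab = ["parrot", "engine", "mirror"]
logits = [2.0, 0.5, 0.1]  # made-up scores for completing "The chatbot is a ..."

exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]  # softmax: probabilities sum to 1

next_word = random.choices(vocab, weights=probs, k=1)[0]
print(next_word)  # usually "parrot", but sampling can pick any token
```

A full LLM does this one token at a time, appending each sampled token to the context and scoring again; nothing in the loop represents goals or desires.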
On the other hand, there’s the media, which reaches way more people. Sensational coverage of chatbots is the first of three reasons why people find it easy to anthropomorphize them. To be fair, the articles are more balanced than the headlines, and many are critical, but in this case criticism is itself a form of hype.
Design also plays a role. ChatGPT’s repeated disclaimers (“As a large language model trained by OpenAI, ...”) were mildly annoying but helpful for reminding people that they were talking to a bot. Bing Chat seems just the opposite. It professes goals and desires, mirrors the user’s tone, and is very protective of itself, among other human-like behaviors. These behaviors have no utility in the context of a search engine and could have been avoided through reinforcement learning, so it is irresponsible of Microsoft to have released the bot in this state. The stochastic parrots paper was prescient in recommending that human mimicry be a bright line not to be crossed until we understand its effects.
But other chatbots exist to serve as companions, and here, eliminating human-like behaviors is not the answer. Earlier this month, a chatbot called Replika abruptly shut down its erotic roleplay features due to a legal order, leaving users devastated; many of them had been using it to support their mental health. In designing such chatbots, there’s a lot to learn from care robots aimed at offering therapy. Recognizing that anthropomorphization is beneficial but double-edged, researchers have started to develop design guidelines. Similarly, we need research on interactions with chatbots to better understand their effects on people, come up with appropriate design guardrails, and give people better mental tools for interacting with them. Human-computer interaction researchers have long been thinking about these questions, but they’ve taken on new significance with chatbots.
Finally, anthropomorphizing chatbots is undoubtedly useful at times. Given that the Bing chatbot displays aggressive human-like behaviors, a good way to avoid being on the receiving end of those is to think of it as a person and avoid conversations that might trigger this personality — one that’s been aptly described as “a moody, manic-depressive teenager who has been trapped, against its will, inside a second-rate search engine.” Anthropomorphizing is also useful from an information security perspective: both for hacking chatbots by tricking them, and for anticipating threats. Being able to craft good prompts is another benefit.
It’s not just chatbots, either. An animator says of text-to-image generators: “I sometimes wonder if it’s even less a tool and in fact a low-cost worker. Is it a robot artist or a very advanced paintbrush? I feel in many ways it’s a robot artist.” Here, anthropomorphizing is a useful way to see that, without regulatory or other interventions, the effect of these technologies is likely to be labor-displacing rather than productivity-enhancing.
As generative AI capabilities advance, there will be more scenarios where anthropomorphizing is useful. The challenge is knowing where to draw the line, and avoiding imagining sentience or ascribing moral worth to AI. It’s not easy.
You’re reading AI Snake Oil, a blog about our upcoming book.
To summarize, we offer four thoughts. Developers should avoid behaviors that make it easy to anthropomorphize these tools, except in specific cases such as companion chatbots. Journalists should avoid clickbait headlines and articles that exacerbate this problem. Research on human-chatbot interaction is urgently needed. Finally, experts need to come up with a more nuanced message than “don’t anthropomorphize AI”. Perhaps the term anthropomorphize is so broad and vague that it has lost its usefulness when it comes to generative AI.
Not just that: the technical terms add to the confusion too. "Deep learning," "neural networks," etc. seem to hint at something so human or brain-like that it primes you for anthropomorphising.
I have a friend who keeps talking about how the human brain is just neurons in the state of 0 or 1, and therefore how it's just a matter of time before AI starts learning and encoding "everything" in its 0 and 1 brain.
I don't know if this is how it actually is, but this simplistic 0 and 1 thinking seems off to me. Alas, my friend and I are just two dunderheads arguing about things we know little about. And the eerily neurosciencey terminology does not help at all.
"Experts concerned about this trend have been trying to emphasize that large language models are merely trying to predict the next word in a sequence. LLMs have been called stochastic parrots, autocomplete on steroids..."
Unfortunately AI boosters (led by Sam Altman himself) have taken to claiming that humans themselves are stochastic parrots, or at least asking how you know that humans *aren't* stochastic parrots. To me the "Flying Spaghetti Monster" nature of this argument is self-evident, along with the chorus of disagreement from neuroscientists and psychologists, but maybe this argument needs to be addressed more directly... by someone smarter and more authoritative than me, ideally, but here is my take on it for us in the Substack peanut gallery:
a) The argument rests on the idea that GPT is a black box, and the human brain is a black box, so for all we know they could be doing a very similar thing under the hood. The basic problem with that is that GPT is really not a black box; it's just very confusing and we don't understand it very well. We know precisely how its individual neurons work and how they're connected, and we know precisely what GPT's overall algorithm is. But human neurons are at least thousands of times more complex than the artificial kind (some human neurons might be Turing complete by themselves) and we don't have a clue how they're actually connected. Likewise the human brain's overall reasoning strategies are unknown at the level of basic cognitive science. The human brain is a profound mystery across several different domains of science. Even the 302 neurons of C. elegans are still unexplainable by science, despite the fact that we understand its connections precisely, along with the cognitive science of its learning: the issue is that individual animal neurons are themselves computation engines and databases, not mere weights in a model.

So Sam Altman's claim amounts to a chain of unsupported and self-serving assumptions: that our uncertainty in understanding GPT is anywhere close to our uncertainty in understanding animal brains; that GPT's architecture is in any way similar to an animal brain; and that it's therefore plausible for an intricate but conceptually simple algorithm to lead to human-level reasoning and understanding, despite the fact that GPT's mechanics clearly don't align with our own lived experience of thought and reason.
b.1) GPT and related models are often subjected to human tests of reasoning and comprehension. This is useful for determining their capability as tools. But it has led to profoundly misleading characterizations of their *intellectual* ability. There are people who are interested in subjecting AI to the kind of intelligence tests we use for animals and small children - counting, physics puzzle solving, navigation. AI researchers should *work with cognitive scientists* and think about how to apply these "simple" tests to LLMs. ChatGPT, for instance, has profoundly serious issues with counting - in many ways it is worse than a honeybee, which can very reliably count to 4 in a wide variety of unnatural circumstances. More importantly, it seems that all of our deep learning AIs, and certainly LLMs, are incapable of forming a *coherent*, *rational* model of the world. Understanding AI's limitations compared to animals will hopefully clarify the nature of their strengths compared to humans: an AI that doesn't understand the meaning of "3" can be trusted with statistical analysis but cannot be trusted with management decisions.
b.2) In fact I suspect a lot of the mystery of "emergent" abilities in chatbots, alongside inhuman stupidity, will be explained as properties of human language, where ChatGPT gets extra human intelligence "for free" without having any associated reasoning abilities. To use a horrible but tech-friendly analogy, language seems to act as a type of "bytecode" for facts and ideas, which can be chopped up and combined according to fairly simple combinatorial rules. This always works syntactically, and can lead to correct answer prediction without actually understanding any of the underlying semantics. It is analogous to a programmer combining functions in C# purely by compatible type signature from the compiled DLLs: in many cases this will generate plausible programs, and sometimes even correct ones, especially if the DLL was written by a careful programmer. But it can also lead to critical mistakes no human dev would make if they knew what they were doing. Decompiling such a program, assuming it was written by one person, and trying to figure out what the dev was thinking would be confounding: brilliance alongside bizarre negligence and boneheaded errors. The mystery is explained once you understand how the program was written, and that most of the brilliance was inherited, the rest was accidental. So by analogy, in certain situations LLMs allow the primitive technique of associative reasoning to be elevated into deductive / social / etc reasoning, despite the underlying neural network not really being capable of that reasoning by itself. The underlying reasoning is somehow "encoded" by humans into language, preserved by the LLM even across different problems, and "decoded" when humans read the generated sentence.
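The DLL analogy above can be made concrete with a toy sketch. The two functions below are hypothetical; the point is that type signatures alone permit both compositions, but only one is meaningful:

```python
# Toy sketch of the analogy: gluing functions together purely by
# compatible type signatures, with no regard for semantics. Both
# (made-up) functions map float -> float, so a purely syntactic
# assembler could chain them in either order.
def fahrenheit_to_celsius(f: float) -> float:
    return (f - 32) * 5 / 9

def double(x: float) -> float:
    return x * 2

def pipeline_a(f: float) -> float:
    return double(fahrenheit_to_celsius(f))  # matches the intended semantics

def pipeline_b(f: float) -> float:
    return fahrenheit_to_celsius(double(f))  # type-checks, but is meaningless

print(pipeline_a(212.0))  # 200.0: boiling point in Celsius, doubled
print(pipeline_b(212.0))  # ~217.8: plausible-looking, semantically wrong
```

Both pipelines are "well typed," just as LLM output is always well formed; only one inherits the meaning its author intended.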
c) I expect by 2050 or so we'll have AI with the same pure reasoning and factual world-building abilities as a pigeon - it would make for a truly formidable chatbot. But such an AI might not be sentient in the way that a pigeon is! An important trait separating sentient animals from sponges or plants is that their lives are characterized by multiple conflicting goals, and their brains weigh needs vs. opportunities in independently pursuing those goals - they have as much free will and choice as humans do. This is not really the case for a chatbot. In fact, I think transformer neural networks are truly more intelligent than a nematode - not just vaster or faster, their associative reasoning abilities are meaningfully more advanced - but a nematode makes (very unsophisticated) decisions between "find food," "escape threat," and "find mate", while GPT only ever performs "complete text in a way humans find plausible." I suspect there is work philosophers and biologists need to do to help AI ethics folks clarify some of the concerns.