Human beings tend to anthropomorphize AI systems. We assign human-like concepts to their actions, such as “learning” and “thinking.” For example, someone might say, “ChatGPT doesn’t understand my prompt” when the chatbot’s NLP (natural language processing) algorithm fails to produce the desired output.
Familiar concepts such as “understanding” help us conceptualize how complex AI systems work. However, they can also lead to distorted notions about AI’s capabilities. When we assign human-like concepts to AI systems, it’s natural to infer that they also possess human values and motivations.
But this inference is mistaken. Artificial intelligence is not human and therefore cannot intrinsically care about reason, loyalty, safety, environmental issues and the greater good. The primary goal of an artificial “mind” is to complete the task for which it was programmed.
It is therefore up to AI developers to build human values and goals into their systems. Otherwise, in pursuit of task completion, AI systems can become misaligned with their designers’ goals and cause harm, sometimes catastrophically. This consideration grows more important as automation spreads to high-stakes use cases in healthcare, human resources, finance, military scenarios and transportation.
For example, self-driving cars might be programmed with the primary goal of getting from point A to point B as fast as possible. If these autonomous vehicles ignore safety guardrails to complete that goal, they might severely injure or kill pedestrians and other drivers.
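To make this specification gap concrete, here is a minimal, hypothetical Python sketch. It scores candidate routes two ways: one objective rewards speed alone, while the other adds a penalty for safety violations. The route data, function names and penalty weight are illustrative assumptions, not code from any real self-driving system.

```python
# Hypothetical route data: travel time and the number of safety rules
# (speed limits, crosswalk stops) each route would violate.
routes = [
    {"name": "highway", "minutes": 18, "safety_violations": 0},
    {"name": "school_zone_shortcut", "minutes": 12, "safety_violations": 3},
]

def speed_only_score(route):
    """Misaligned objective: reward speed alone (fewer minutes is better)."""
    return -route["minutes"]

def speed_and_safety_score(route, violation_penalty=100):
    """Objective with safety built in: the same speed term, plus a large
    penalty per safety violation so unsafe shortcuts never win."""
    return -route["minutes"] - violation_penalty * route["safety_violations"]

# The speed-only planner picks the dangerous shortcut ...
print(max(routes, key=speed_only_score)["name"])        # school_zone_shortcut
# ... while the safety-penalized planner picks the highway.
print(max(routes, key=speed_and_safety_score)["name"])  # highway
```

The point of the sketch is that nothing in the first objective is “wrong” on its own terms; it optimizes exactly what it was told to. Safety only matters to the system if the designer writes it into the objective.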
University of California, Berkeley researchers Simon Zhuang and Dylan Hadfield-Menell liken the AI alignment problem to the Greek myth of King Midas. In the myth, King Midas is granted a wish and asks that everything he touches turn to gold. He eventually dies because the food he touches also turns to gold, rendering it inedible.
King Midas met an untimely end because his wish (unlimited gold) did not reflect what he truly wanted (wealth and power). The researchers explain that AI designers often find themselves in a similar position, and that “the misalignment between what we can specify and what we want has already caused significant harms.”2