Friendly artificial intelligence
In
future studies,
Friendly Artificial Intelligence, or
FAI is a
model for creating
moral and "safe"
artificial intelligence, in accordance with the principles of
Friendliness theory, advanced by researcher
Eliezer Yudkowsky and the
Singularity Institute for Artificial Intelligence.
Friendliness is used as a
term of art distinct from the everyday meaning of the term.
Friendliness theory is a proposed solution to the dangers believed to stem from smarter-than-human artificial intelligence. According to the theory, the goals of future AIs will be more arbitrary and alien than commonly depicted in
science fiction and earlier futurist speculation, in which AIs are often
anthropomorphized and assumed to share universal human desires. Because AI is not guaranteed to see the "obvious" aspects of morality and goals that humans see so effortlessly, the theory goes, AIs with intelligences greater than our own may concern themselves with endeavors that humans would see as pointless or even laughably bizarre. One example Yudkowsky provides is that of an AI initially designed to manufacture paperclips, which, upon being upgraded with superhuman intelligence, tries to develop
molecular nanotechnology because it wants to convert all matter in the
Solar System into paperclips.
Friendliness theory stresses less the danger of superhuman AIs that actively seek to
harm humans, but more of AIs that are disastrously
indifferent to them. Superintelligent AIs may be harmful to humans if steps are not taken to specifically design them to be benevolent. Doing so effectively is the primary goal of Friendly AI. Designing an
AI, whether deliberately or semi-deliberately, without such
"Friendliness safeguards", would therefore be seen as highly immoral, approximately equivalent to a parent raising a child with absolutely no regard for whether that child grows up to be a
sociopath.
This belief that human goals are so arbitrary derives heavily from modern advances in
evolutionary psychology. Friendliness theory claims that most AI speculation is clouded by analogies between AIs and humans, and assumptions that all possible minds must exhibit characteristics that are actually
psychological adaptations that exist in humans (and other animals) only because they were once beneficial and perpetuated by
natural selection. This idea is expanded on greatly in section 2 of Yudkowsky's
Creating Friendly AI,
"Beyond anthropomorphism".
Many supporters of FAI speculate that AI able to alter and improve itself,
seed AI, is likely to create a huge power disparity between it and less intelligent human minds; that its ability to reprogram itself would very quickly outpace human ability to exercise any meaningful control over it. While many doubt such scenarios are likely, if they were to occur, it would be important for AI to act benevolently towards humans.
One of the most recent significant advancements in Friendliness theory is the
Coherent Extrapolated Volition model. According to Yudkowsky, "our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, [and] had grown up farther together". More concretely, the coherent extrapolated volition of humanity is the actions we would collectively take if we knew more, thought faster, etc. Yudkowsky believes a Friendly AI should initially seek to determine the coherent extrapolated volition of humanity, with which it can then alter its goals accordingly.
Promoting Friendly AI is one of the primary goals of the
Singularity Institute for Artificial Intelligence, along with obtaining funding for, and ultimately creating a
seed AI program implementing the ideas of Friendliness theory.
Several notable
futurists have voiced support for Friendly AI, including author and inventor
Raymond Kurzweil, medical life-extension advocate
Aubrey de Grey, and
World Transhumanist Association founder
Dr. Nick Bostrom of
Oxford University.
One notable critic of Friendliness theory is
Bill Hibbard, author of
Super-intelligent Machines, who considers the theory incomplete. Hibbard writes there should be broader
political involvement in the design of AI and AI morality. He also believes that initially seed AI could only be created by powerful
private sector interests (a view not shared by Yudkowsky), and that
multinational corporations and the like would have no
incentive to implement Friendliness theory.
In his criticism of the Singularity Institute's Friendly AI guidelines, he suggests an AI goal architecture in which
human happiness is determined by human behaviors indicating happiness: "Any artifact implementing 'learning' [...] must have 'human happiness' as its only initial reinforcement value [...] and 'human happiness' values are produced by an algorithm produced by supervised learning, to recognize happiness in human facial expressions, voices and body language, as trained by human behavior experts." Yudkowsky later criticized this proposal by remarking that such a utility function would be better satisfied by tiling the Solar System with microscopic smiling mannequins than by making existing humans happier.
*
Seed AI - a theory related to Friendly AI
*
Singularitarianism - a moral philosophy advocated by proponents of Friendly AI
*
Technological singularity*
What is Friendly AI? — A brief explanation of FAI by the Singularity Institute.
*
Creating Friendly AI — A more detailed discussion of FAI by the Singularity Institute.
*
Critique of the SIAI Guidelines on Friendly AI — by
Bill Hibbard*
SIAI's Commentary on the Guidelines for building 'Friendly' AI — by
Peter Voss.
*
3 Laws Unsafe — A project to increase awareness of AI morality by the Singularity Institute.
*
Coherent Extrapolated Volition — an explanation of the concept at the Singularity Institute's site