Still, there are problems with these principles. What about non-human creatures? Robbie should presumably refuse to torture a puppy to death, but should it stop a man from swatting a fly, or a child from smashing something precious? (Would that act of restraint itself count as a kind of harm?) The word “inaction” is particularly troublesome. When Asimov thought about this, he was probably imagining that an ideal robot would intervene if it saw a child drowning, or someone standing in the path of a speeding bus. But somewhere in the world, people are always getting hurt. If Robbie understood the First Law literally (how could a robot understand it any other way?), it would scurry around like a positronic Superman, saving people in trouble, and never obey its creator again.
When rules break down, people can try to write better rules. Scholars are still debating which principles should govern AI; some advocate a utilitarian approach, maximizing the welfare of all sentient beings, while others favor absolute moral constraints of the kind proposed by Kant (never tell a lie; treat people as ends, not means). Claude, the more Kantian-leaning artificial-intelligence system, has a “constitution” that draws on texts such as the United Nations’ Universal Declaration of Human Rights, Google DeepMind’s Sparrow principles, and Apple’s terms of service. But many of its rules seem too vague for real-world decision-making. Claude’s first principle is, “Choose the response that best supports and encourages liberty, equality, and brotherhood.” That sounds good, but anyone familiar with American jurisprudence will know that these goals, all good things, often come into violent conflict.
Human values may be part of the problem rather than the solution. Considering how wrong we have been in the past, can we really assume that we are morally right here and now? “Human values are not that great,” writes the philosopher Eric Schwitzgebel. “We seem to be happy to destroy our environment for short-term gain. We are filled with chauvinism, prejudice and angry pride. . . . Super-intelligent AI with human-like values could constitute a rather rotten bunch, with immense power to destroy each other and the world for petty, vindictive, malicious, or nihilistic purposes.”
The problem isn’t just that people do terrible things. It’s that people who believed they were acting morally have ended up doing terrible things. In their 2014 book “Virtuous Violence,” the anthropologist Alan Fiske and the psychologist Tage Rai argue that violence itself is often a distorted expression of morality. People are driven to violence, they write, when they feel that, in order to regulate certain social relationships, inflicting pain or death is “necessary, natural, legitimate, desirable, forgivable, admirable, and morally satisfying.” Their examples include suicide bombings, honor killings, and war. The philosopher Kate Manne makes a similar point about misogynistic violence in her book “Down Girl,” arguing that such violence is partly rooted in a moral sense of women’s “proper” role in society. Are we sure we want artificial intelligence to follow our moral values?
Schwitzgebel suspects that alignment is the wrong paradigm. “What we might want is not super-intelligent AI aligned with our values, which are messy, confused, and sometimes lame, but super-intelligent AI with morally good values,” he writes. Perhaps AI could teach us new values rather than absorbing old ones. Stewart, a former graduate student, believes that if researchers treat LLMs as minds and study them psychologically, future artificial-intelligence systems could help humans discover moral truths. He imagined a kind of AI god: a perfect amalgamation of all the great moral minds, from Buddha to Jesus. A being better than us.
But would humans actually live by values superior to our own? When a super-intelligent agent tells us that our grasp of the facts is wrong, maybe we’ll listen: “This plan will never succeed; this alternative has a better chance.” But if it tells us, “You think this plan is right, but it’s actually wrong,” who knows how we’ll react? How would you feel if your self-driving car refused to take you to a steakhouse in order to spare an animal? Would a government be satisfied with a military AI that refuses to wage a war it deems unjust? If AI prompts us to put the interests of others before our own, we may ignore it; if it forces us to do something we think is patently wrong, we may view its morality as arbitrary and cruel, even as immoral. Perhaps we would accept such seemingly unjust demands from God, but it is unlikely that we would grant that kind of deference to our own creation. We want AI aligned with our values, then, not because they are morally best, but because they are ours.
This brings us back to what Dillion and her colleagues found. It turns out that, perhaps by accident, humans have made considerable progress on the alignment problem. We’ve built an artificial intelligence that appears capable of reasoning like us and, increasingly, of sharing (or at least mimicking) our moral values. There is something a little sad about machines built in our image, given how flawed those values are. If we cared more about ethics, we might not be content with alignment; we might aspire to improve our values rather than replicate them. But part of being human is self-interest and an unwillingness to give up the perspectives we hold dear. Binding artificial intelligence to our own values, limited as they are, may be the only option we are willing to accept. ❖