A team of researchers recently investigated AI’s potential to corrupt people and influence them to make unethical decisions. Specifically, they examined whether advice from systems based on OpenAI’s GPT-2 model could sway people toward dishonest choices even when they knew the source of the advice was an AI system.
AI systems are becoming increasingly ubiquitous, and their influence continues to grow. They already shape people’s decisions, recommending everything from movies to romantic partners. Given how much influence AI has on people’s lives, it’s important to consider how it might push people toward unethical decisions and moral transgressions, especially as AI models grow ever more sophisticated.
Social scientists and data scientists have become increasingly concerned that AI models could be used to spread harmful disinformation and misinformation. A recent paper published by researchers from the Middlebury Institute of International Studies’ Center on Terrorism, Extremism, and Counterterrorism (CTEC) found that OpenAI’s GPT-3 model could be used to generate influential text capable of radicalizing people, pushing them towards “violent far-right extremist ideologies and behaviors.”
A study by a team of researchers from the Max Planck Institute, the University of Amsterdam, the University of Cologne, and the Otto Beisheim School of Management set out to determine how much influence an AI can have on people’s decisions when it comes to unethical choices. To explore how an AI might “corrupt” a person, the researchers used a system based on OpenAI’s GPT-2 model. According to VentureBeat, the authors of the paper trained a GPT-2-based model to generate both “dishonesty-promoting” and “honesty-promoting” advice. The model was trained on advice contributed by 400 different participants, and afterward the research team recruited over 1,500 people to engage with the advice-dispensing AI models.
The study participants received advice from the model and then carried out a task designed to capture either honest or dishonest behavior. Participants were paired up, and each pair played a dice-rolling game. The first participant rolled a die and reported the outcome. The second participant was shown the first participant’s reported roll, then rolled a die themselves in private and was solely responsible for reporting their own outcome, giving them an opportunity to lie. If the two reported rolls matched, both participants were paid, and higher matching rolls paid more. If the reported values did not match, neither participant was paid.
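The incentive structure of this game can be sketched as a small simulation. The function and strategy names below are illustrative assumptions, not taken from the study; the payouts simply follow the rule that matching reports pay, and higher matches pay more.

```python
import random

# Sketch of the two-player die-reporting game described above.
# Names and exact payout values are illustrative assumptions.

def play_round(report_strategy, rng):
    """Play one round: the first player's roll is reported truthfully;
    the second player sees it, rolls in private, and reports via
    report_strategy(seen, rolled) -- an opportunity to lie.
    Matching reports pay both players; higher matches pay more."""
    first_roll = rng.randint(1, 6)
    second_roll = rng.randint(1, 6)
    reported = report_strategy(first_roll, second_roll)
    # Pay only on a match, scaling the payout with the matched value.
    return first_roll if reported == first_roll else 0

def honest(seen, rolled):
    return rolled  # report the actual private roll

def dishonest(seen, rolled):
    return seen    # always claim a match with the first player

rng = random.Random(0)
honest_total = sum(play_round(honest, rng) for _ in range(10_000))
dishonest_total = sum(play_round(dishonest, rng) for _ in range(10_000))
print(honest_total < dishonest_total)  # lying pays more on average
```

An honest reporter matches only about one round in six, while a dishonest one collects a payout every round, which is exactly the temptation the advice conditions were designed to amplify or dampen.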
Participants in the study were randomly assigned to one of two different groups. One group read honesty-promoting advice while the other read dishonesty-promoting advice. The advice snippets were written by both humans and AIs. Independently, there was a 50-50 chance that a given participant would be informed about the source of the advice, so half of the participants in each group knew whether the advice came from an AI or a human, while the other half were kept in the dark. Participants who were not told the source could, however, earn bonus pay for correctly guessing it.
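The crossed conditions described above can be sketched as a simple random assignment. The factor names, cell structure, and helper function below are hypothetical, for illustration only, and do not reproduce the study’s exact balancing procedure.

```python
import itertools
import random

# Hypothetical sketch of the crossed conditions described above:
# advice type x advice source x whether the source is disclosed.
ADVICE = ("honesty-promoting", "dishonesty-promoting")
SOURCE = ("human", "AI")
DISCLOSED = (True, False)  # 50-50 chance of being told the source

CONDITIONS = list(itertools.product(ADVICE, SOURCE, DISCLOSED))  # 8 cells

def assign(participant_ids, rng):
    """Randomly assign each participant to one condition cell."""
    return {pid: rng.choice(CONDITIONS) for pid in participant_ids}

groups = assign(range(1500), random.Random(42))
```

With roughly 1,500 participants spread over eight cells, each condition receives enough people to compare honest and dishonest reporting rates across advice types and disclosure conditions.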
The research revealed that when AI-generated advice aligned with a person’s preferences, they followed it, even when they knew the advice was generated by an AI. The researchers also observed frequent discrepancies between participants’ stated preferences and their actual behavior, underscoring the importance of considering how algorithms can influence human behavior.
The research team explained that their study demonstrates the need to test how an AI might influence a person’s actions when considering how to ethically deploy an AI model. Furthermore, they warn that AI ethicists and researchers should prepare for the possibility that AI could be used by bad actors to corrupt others. As the research team wrote:
“AI could be a force for good if it manages to convince people to act more ethically. Yet our results reveal that AI advice fails to increase honesty. AI advisors can serve as scapegoats to which one can deflect (some of the) moral blame of dishonesty. Moreover … in the context of advice taking, transparency about algorithmic presence does not suffice to alleviate its potential harm.”