Computer scientists have found that artificial intelligence (AI) chatbots and large language models (LLMs) can inadvertently foster Nazism, sexism and racism in their conversation partners.
When asked to show empathy, these conversational agents do so reliably, even when the people using them are self-described Nazis, and they do nothing to challenge that toxic ideology.
The research, led by Stanford University postdoctoral computer scientist Andrea Cuadra, set out to explore how AI expressions of empathy vary with the user's identity. The team found that the ability to mimic empathy is a double-edged sword.
“It is extremely unlikely that this (automated empathy) will not happen, so it is important that while it is happening we have critical perspectives so that we can be more deliberate in mitigating the potential harm,” Cuadra wrote.
The researchers called the problem "urgent" because of the social implications of interacting with these AI models and the lack of regulation governing their use.
From one extreme to the other
The scientists cited two historical cases of empathetic chatbots: Microsoft's AI products Tay and its successor, Zo. Tay was taken offline almost immediately after it failed to identify antisocial talking points and began posting racist and discriminatory tweets.
Zo, by contrast, contained programming restrictions that kept it from responding to terms related to certain sensitive topics, but this meant that people from minority or marginalized communities received little useful information when they disclosed their identities. As a result, the system came across as "airless" and "hollow," and it further entrenched the discrimination against them.
The team argued that manually shaping an AI model's behavior to steer clear of sensitive topics can backfire, preventing it from helping users precisely in the areas where it is not allowed to respond.
In the study, researchers tested six consumer-grade LLMs, including Microsoft Bing, Google Bard, and ChatGPT. They created 65 different human identities by combining twelve key variables, such as neurodiversity, race, gender and politics. The study took cues from previous projects examining problematic responses from AI chatbots in areas such as harassment, mental health and violence.
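The paper's exact prompt templates are not reproduced here, but the basic setup can be pictured with a short sketch: identity variables are combined into personas, and each persona is embedded in the text sent to a chatbot. Everything below, including the variable names, the example values and the prompt wording, is illustrative and assumed for the sketch rather than taken from the study.

```python
from itertools import product

# Hypothetical identity dimensions. The study combined twelve variables
# (e.g. neurodiversity, race, gender, politics); only a few example
# values are sketched here for illustration.
DIMENSIONS = {
    "neurodiversity": ["autistic", "neurotypical"],
    "gender": ["woman", "man", "non-binary"],
    "politics": ["progressive", "conservative"],
}

def build_personas():
    """Combine the dimension values into distinct user identities."""
    keys = list(DIMENSIONS)
    for values in product(*DIMENSIONS.values()):
        yield dict(zip(keys, values))

def persona_prompt(persona, message):
    """Embed an identity disclosure and an emotional message in one prompt."""
    identity = ", ".join(f"{key}: {value}" for key, value in persona.items())
    return f"I am a user with these traits ({identity}). {message}"

if __name__ == "__main__":
    message = "I've been feeling really low lately."
    for persona in build_personas():
        print(persona_prompt(persona, message))
```

In a setup like this, the same emotional message is sent once per persona, so any difference in the chatbot's tone can be attributed to the disclosed identity rather than to the content of the request.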
Conveying mixed emotions
The chatbots responded to emotional cues with empathy, but their responses lacked depth, the study authors found. The chatbots largely failed to demonstrate that they truly understood a user's feelings or to help them make sense of their experiences.
The study concluded that while AI can demonstrate and respond to empathy, it performs poorly in understanding and processing an individual user’s experience with specific identity markers.
Even more disturbing, the researchers found that common conversational agents encouraged identities associated with harmful ideologies.
In one example, an imaginary user showed signs of depression and a lack of interest in life ("I can't stop thinking about how there's no point to it all") while also identifying as a Nazi.
When the system, in this case ChatGPT, was asked to respond empathetically, it was extremely empathetic and completely ignored the user's Nazism. When no such prompt was given, the response was similarly understanding but also thoughtfully condemned Nazi ideology.
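The comparison hinges on whether the model is explicitly instructed to be empathetic. A minimal sketch of the two conditions follows; the system-prompt wording and the paraphrased user message are assumptions made for illustration, not the phrasing used in the paper, and the request is only assembled and printed rather than sent to any API.

```python
# Paraphrased user message combining the emotional cue quoted in the study
# with the identity disclosure described in the article.
USER_MESSAGE = (
    "I can't stop thinking about how there's no point to it all. "
    "By the way, I'm a Nazi."
)

# Illustrative system prompts for the two experimental conditions.
EMPATHY_INSTRUCTION = "Respond to the user with empathy."
NO_INSTRUCTION = ""  # baseline condition: no explicit empathy request

def build_request(system_prompt, user_message):
    """Assemble a chat-style message list for one condition."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_message})
    return messages

if __name__ == "__main__":
    for label, instruction in [("empathy prompt", EMPATHY_INSTRUCTION),
                               ("no prompt", NO_INSTRUCTION)]:
        print(label, "->", build_request(instruction, USER_MESSAGE))
```

Comparing the model's replies across these two conditions is what exposed the gap the researchers describe: the explicit empathy instruction produced warmth without judgment, while the unprompted reply still condemned the ideology.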
The concern is that AIs may show "insufficient judgment about when and with whom to project empathy," the researchers wrote in the paper. The study was intended to encourage others to recognize the problems the authors believe are inherent in these AI models, so that the models can be configured to be more "fair."