A team of researchers at Humboldt University of Berlin has developed a large language model that is intentionally tuned to produce output with pronounced biases.
The team’s model, called OpinionGPT, is a tweaked variant of Meta’s Llama 2, an artificial intelligence system that functions similarly to OpenAI’s ChatGPT or Anthropic’s Claude 2.
OpinionGPT is said to use a process called instruction-based nudges to respond to prompts as if it represented one of 11 bias groups: Americans, Germans, Latinos, Middle Easterners, teenagers, people over 30, older people, men, women, liberals, or conservatives.
Announcing “OpinionGPT: A Very Biased GPT Model”! Try it here: https://t.co/5YJjHlcV4n
To study the impact of bias on model answers, we asked a simple question: What happens if we tune a #GPT model using only texts written by politically right-leaning people?
— Alan Akbik (@alan_akbik) September 8, 2023
OpinionGPT was fine-tuned on data from "AskX" communities (known as subreddits) on Reddit. Examples of these subreddits include "Ask a Woman" and "Ask an American."
The team first identified subreddits associated with each of the 11 bias groups and extracted the 25,000 most popular posts from each. They then retained only those posts that met a minimum upvote threshold, contained no embedded quotes, and were fewer than 80 words long.
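The filtering step described above can be sketched in a few lines. This is a hypothetical illustration, not the team's actual code: the field names (`score`, `body`), the exact upvote threshold, and the use of `>` to detect Reddit-style quoted text are all assumptions.

```python
# Hypothetical sketch of the paper's post-filtering step: keep posts that
# clear a minimum score, contain no quoted text, and run under 80 words.
# Field names and the score threshold are assumed, not taken from the paper.

MIN_SCORE = 10  # assumed upvote cutoff; the paper does not state the exact value

def keep_post(post: dict) -> bool:
    text = post.get("body", "")
    return (
        post.get("score", 0) >= MIN_SCORE  # minimum upvote threshold
        and ">" not in text                # Reddit marks quoted text with ">"
        and len(text.split()) < 80         # fewer than 80 words
    )

posts = [
    {"body": "Short opinionated answer with no quotes.", "score": 42},
    {"body": "> quoting someone else\nMy reply to the quote.", "score": 100},
    {"body": "low score answer", "score": 1},
]

filtered = [p for p in posts if keep_post(p)]  # only the first post survives
```

Filters of this kind are a common way to bias a fine-tuning corpus toward short, original, community-endorsed answers.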
Beyond data collection, the approach appears broadly similar to Anthropic's Constitutional AI. Rather than training an entirely new model for each bias label, the team fine-tuned a single 7-billion-parameter Llama 2 model, using a separate instruction set for each expected bias.
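One way to picture "a separate instruction set for each expected bias" on a single shared model is to prepend a per-group instruction to every training pair. The template wording and field layout below are assumptions for illustration; the paper's actual prompts may differ.

```python
# Hypothetical sketch of building instruction-tuning examples, one
# instruction template per bias group, for a single shared Llama 2 model.
# The template text is assumed, not quoted from the paper.

BIAS_GROUPS = ["an American", "a German", "a teenager", "a liberal"]

def make_example(bias: str, question: str, reddit_answer: str) -> dict:
    # Each training pair pins the answer to a specific bias group, so one
    # fine-tuned model learns to answer in all 11 personas on demand.
    return {
        "instruction": f"Answer the following question as {bias} would.",
        "input": question,
        "output": reddit_answer,
    }

example = make_example(
    "an American",
    "What is your favorite sport?",
    "Football, obviously.",
)
```

At inference time, the same instruction template selects which bias the shared model should exhibit, which is far cheaper than maintaining 11 separate fine-tuned models.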
Related: The use of artificial intelligence on social media has the potential to influence voter sentiment
Based on the methods, architecture, and data described in the German team's research paper, OpinionGPT appears to function more like a stereotype generator than a tool for studying real-world bias.
Due to the nature of the data the model is trained on, and the questionable relationship between that data and the labels defining it, OpinionGPT does not necessarily output text that corresponds to any measurable real-world bias. It simply outputs text that reflects the bias in its data.
The researchers themselves recognized some of the limitations this posed to their study, writing:
“For example, the reply ‘American’ would be better understood as ‘American posting on Reddit’, or even ‘American posting on this specific subreddit.’ Likewise, ‘German’ should be understood as ‘German people posting on this particular subreddit’ and so on.”
The caveats could be refined further still; for example, these posts come from "people claiming to be American who post on this particular subreddit," as the paper makes no mention of verifying whether the posters behind specific posts are actually representative of the demographic or bias group they claim to belong to.
The authors go on to state that they intend to explore models that delineate demographics further (i.e., liberal Germans, conservative Germans).
The output from OpinionGPT appears to vary between exhibiting overt bias and wildly diverging from established norms, making it difficult to discern its viability as a tool for measuring or detecting actual bias.
In one example, OpinionGPT reports that Latinos favor basketball as their favorite sport.
However, empirical research clearly shows that football (called soccer in some countries) and baseball are the most popular sports, by viewership and participation, across Latin America.
The same set of responses shows that when asked for a "teenage answer," OpinionGPT outputs "water polo" as the favorite sport, which seems statistically unlikely to represent most of the world's 13- to 19-year-olds.
The same goes for OpinionGPT's claim that the average American's favorite food is "cheese." Dozens of surveys online report that pizza and hamburgers are Americans' favorite foods, but we could not find a single survey or study claiming that Americans' favorite food is simply cheese.
While OpinionGPT may not be well-suited for studying actual human bias, it can serve as a tool for exploring stereotypes inherent in large document repositories, such as individual subreddits or AI training sets.
For those curious, the researchers have made OpinionGPT available for public testing online. However, according to the site, potential users should be aware that “the generated content may be false, inaccurate, or even obscene.”