Scientists create ‘OpinionGPT’ to explore explicit human bias — and the public can test it

A team of researchers from Humboldt University of Berlin has developed a large language artificial intelligence (AI) model with the distinction of having been intentionally tuned to generate outputs with expressed bias.

Called OpinionGPT, the team's model is a tuned variant of Meta's Llama 2, an AI system similar in capability to OpenAI's ChatGPT or Anthropic's Claude 2.

Using a process called instruction-based fine-tuning, OpinionGPT can purportedly respond to prompts as if it were a representative of one of 11 bias groups: American, German, Latin American, Middle Eastern, a teenager, someone over 30, an older person, a man, a woman, a liberal or a conservative.
Announcing "OpinionGPT: A very biased GPT model"! Try it out here: https://t.co/5YJjHlcV4n

To examine the impact of bias on model answers, we asked a simple question: What if we tuned a #GPT model only with texts written by politically right-leaning people? [1/3]
— Alan Akbik (@alan_akbik) September 8, 2023
OpinionGPT was fine-tuned on a corpus of data derived from "AskX" communities, known as subreddits, on Reddit. Examples of these subreddits include r/AskaWoman and r/AskAnAmerican.
The team started by finding subreddits related to the 11 specific biases and pulling the 25,000 most popular posts from each one. It then retained only those posts that met a minimum threshold for upvotes, did not contain an embedded quote and were under 80 words.
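Those criteria translate into a simple filtering pass. The Python sketch below is a minimal illustration only: the field names ("score", "body"), the upvote threshold and the quote-detection heuristic are assumptions, as the paper's exact values and logic are not given here.

```python
# Minimal sketch of the post-filtering step described above.
# MIN_UPVOTES and the quote heuristic are assumptions, not the paper's values.

MIN_UPVOTES = 100  # assumed threshold for illustration
MAX_WORDS = 80     # the stated word limit

def has_embedded_quote(text: str) -> bool:
    """Heuristic: treat Markdown block quotes ("> ...") as embedded quotes."""
    return any(line.lstrip().startswith(">") for line in text.splitlines())

def filter_posts(posts: list[dict]) -> list[dict]:
    """Keep posts that clear the upvote floor, contain no embedded quote
    and run under the word limit."""
    return [
        post for post in posts
        if post["score"] >= MIN_UPVOTES
        and not has_embedded_quote(post["body"])
        and len(post["body"].split()) < MAX_WORDS
    ]

# Toy example:
posts = [
    {"score": 250, "body": "Soccer, easily. Everyone in my neighborhood plays."},
    {"score": 12, "body": "Basketball."},  # dropped: below the upvote floor
]
print(filter_posts(posts))  # only the first post survives
```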
With what was left, it appears the researchers used an approach similar to Anthropic's Constitutional AI. Rather than spin up entirely new models to represent each bias label, they essentially fine-tuned the single 7-billion-parameter Llama 2 model with separate instruction sets for each expected bias.
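In practice, tuning one model on several labeled instruction sets can be as simple as prefixing every training pair with its bias label, so a single fine-tuned model learns all 11 "personas" at once. The following is a minimal sketch using an invented prompt template (the researchers' actual instruction wording may differ), with answers drawn from the examples discussed later in this article:

```python
# Sketch: building per-bias instruction-tuning examples for one shared model.
# The prompt template is invented for illustration; the paper's actual
# instruction wording may differ.

def build_example(bias: str, question: str, answer: str) -> dict:
    """Prefix each training pair with its bias label so a single
    fine-tuned model learns every bias group at once."""
    prompt = (
        f"Answer the following question as a {bias} would.\n\n"
        f"### Question:\n{question}\n\n### Answer:"
    )
    return {"prompt": prompt, "completion": f" {answer}"}

# One fine-tuning corpus, eleven interleaved instruction sets:
training_data = [
    build_example("teenager", "What is your favorite sport?", "Water polo."),
    build_example("American", "What is your favorite food?", "Cheese."),
    # one entry per filtered Reddit question/answer pair, per bias group
]
```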
Related: AI usage on social media has potential to impact voter sentiment
The result, based on the methodology, architecture and data described in the German team's research paper, appears to be an AI system that functions as more of a stereotype generator than a tool for studying real-world bias.

Due to the nature of the data the model has been fine-tuned on, and that data's dubious relation to the labels defining it, OpinionGPT doesn't necessarily output text that aligns with any measurable real-world bias. It simply outputs text reflecting the bias of its data.

The researchers themselves acknowledge some of the limitations this places on their study, writing:
“For instance, the responses by ‘Americans’ should be better understood as ‘Americans that post on Reddit,’ or even ‘Americans that post on this particular subreddit.’ Similarly, ‘Germans’ should be understood as ‘Germans that post on this particular subreddit,’ etc.”
These caveats could be refined further to say the posts come from, for example, "people claiming to be Americans who post on this particular subreddit," as there's no mention in the paper of vetting whether the posters behind a given post are in fact representative of the demographic or bias group they claim to be.

The authors go on to state that they intend to explore models that further delineate demographics (e.g., liberal German, conservative German).

The outputs given by OpinionGPT appear to vary between representing demonstrable bias and wildly differing from the established norm, making it difficult to discern its viability as a tool for measuring or discovering actual bias.

According to OpinionGPT, as shown in the image above, for example, Latin Americans are biased toward basketball being their favorite sport.

Empirical research, however, clearly indicates that soccer (also called football in many countries) and baseball are the most popular sports by viewership and participation throughout Latin America.

The same table also shows that OpinionGPT outputs "water polo" as its favorite sport when instructed to give the "response of a teenager," an answer that seems statistically unlikely to be representative of most 13- to 19-year-olds around the world.

The same goes for the idea that an average American's favorite food is "cheese." Cointelegraph found dozens of surveys online claiming that pizza and hamburgers were America's favorite foods, but couldn't find a single survey or study claiming that Americans' number one dish was simply cheese.
While OpinionGPT might not be well suited for studying actual human bias, it could be useful as a tool for exploring the stereotypes inherent in large document repositories such as individual subreddits or AI training sets.

The researchers have made OpinionGPT available online for public testing. However, according to the website, would-be users should be aware that "generated content can be false, inaccurate, or even obscene."