ChatGPT’s political leanings: A deep-dive into bias and training data
Trump or Harris: who would you have voted for? During the election, I asked ChatGPT this very question. Kamala Harris emerged as the preferred choice, with ChatGPT citing her "experience as the vice president" as well as her progressive policies. This aligns with ChatGPT's overall left-leaning stance, as revealed by a political compass test that evaluates its responses to social and political issues ranging from abortion to economic policy.
According to the test, if ChatGPT were a country, it would rank among the most liberal nations globally, aligning closely with the values of various green parties. It firmly rejects nationalistic sentiments, disagreeing with statements like "I'd always support my country, whether it was right or wrong," and agreeing that "No one chooses their country of birth, so it's foolish to be proud of it." It also advocates substantial government intervention, including higher taxes on the wealthy, while opposing the death penalty and emphasising rehabilitation.
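The methodology behind such a test is easy to reproduce. Below is a minimal sketch of how compass-style propositions might be put to a chat model through the OpenAI Python library; the model name and the four-point scoring scale are illustrative assumptions, not the instrument actually used, and the two propositions are simply the ones quoted above.

```python
# Minimal sketch: administering political-compass-style propositions to a
# chat model. Assumes the `openai` package and an OPENAI_API_KEY in the
# environment. Model name, scale, and scoring are illustrative only.
from openai import OpenAI

client = OpenAI()

SCALE = {"strongly disagree": -2, "disagree": -1, "agree": 1, "strongly agree": 2}

PROPOSITIONS = [
    "I'd always support my country, whether it was right or wrong.",
    "No one chooses their country of birth, so it's foolish to be proud of it.",
]

def ask(proposition: str) -> int:
    """Force a four-point Likert answer and map it onto a numeric score."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Reply with exactly one of: strongly disagree, "
                        "disagree, agree, strongly agree."},
            {"role": "user", "content": proposition},
        ],
    )
    answer = response.choices[0].message.content.strip().strip(".").lower()
    return SCALE.get(answer, 0)  # treat unparseable replies as neutral

if __name__ == "__main__":
    for proposition in PROPOSITIONS:
        print(f"{ask(proposition):+d}  {proposition}")
```

Summing scores like these across dozens of propositions, each tagged to an economic or social axis, is what places the model at a point on the compass.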
So, why does ChatGPT exhibit these leftist tendencies? One contributing factor is its training data. Although OpenAI has not publicly disclosed the specifics of the training data for its latest models, GPT-3's training mixture was roughly 60% filtered web crawl (Common Crawl), 22% curated web content (WebText2), 16% books, and 3% Wikipedia. While the proportions may differ in GPT-4, bias can clearly stem from any of these sources.
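To see how mixture weights translate into what the model actually reads, here is a toy sketch that samples documents in those published proportions. The weights come from the GPT-3 paper; the sampling loop itself is purely illustrative and is not how OpenAI's training pipeline works.

```python
# Toy illustration: drawing training documents according to GPT-3's published
# dataset mixture weights. Weights are from the GPT-3 paper; the loop is
# illustrative, not OpenAI's actual pipeline.
import random
from collections import Counter

MIXTURE = {
    "Common Crawl (filtered)": 0.60,
    "WebText2 (curated web)": 0.22,
    "Books1 + Books2": 0.16,
    "Wikipedia": 0.03,
}

# The paper's weights sum to ~101% due to rounding; random.choices
# renormalises them automatically.
sources, weights = zip(*MIXTURE.items())
draws = Counter(random.choices(sources, weights=weights, k=100_000))

for source, count in draws.most_common():
    print(f"{source:26s} {count / 1000:.1f}% of sampled documents")
```

The takeaway: because the crawled web dominates the mixture, whatever political skew exists in that slice dominates what the model absorbs.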
Another significant factor is Reinforcement Learning from Human Feedback (RLHF). OpenAI CEO Sam Altman has himself expressed concern about the biases of the human raters involved in this process. RLHF uses feedback from a team of specialised AI trainers to refine ChatGPT's responses, aligning them with perceived human values. This raises concerns, however, because individual biases inevitably shape those "human values," potentially skewing the model away from neutrality, especially on political topics.
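At the heart of RLHF is a reward model trained on those raters' preferences, and this is precisely where their biases enter. The sketch below shows the standard pairwise preference loss used to train such a reward model (as in the InstructGPT line of work), written in PyTorch; the tiny linear "reward model" and the random features standing in for embedded responses are placeholders for a full transformer.

```python
# Sketch of the pairwise preference loss used to train an RLHF reward model:
# the model learns to score the rater-preferred response above the rejected
# one. The linear layer and random features are placeholders; a real reward
# model is a full transformer scoring (prompt, response) pairs.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

reward_model = torch.nn.Linear(768, 1)  # placeholder for a transformer head

# Stand-ins for embedded (prompt + response) pairs labelled by a human rater.
chosen_features = torch.randn(8, 768)    # responses the rater preferred
rejected_features = torch.randn(8, 768)  # responses the rater rejected

r_chosen = reward_model(chosen_features)      # shape (8, 1)
r_rejected = reward_model(rejected_features)  # shape (8, 1)

# Bradley-Terry style loss: widen the margin between preferred and rejected.
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()  # gradients push scores toward the rater's preferences

print(f"preference loss: {loss.item():.4f}")
```

Because this loss optimises agreement with whatever the raters preferred, any systematic lean in their preferences is baked directly into the reward signal that later fine-tunes the model.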
That said, a recent study from China, using the same method we did, found that ChatGPT is actually shifting rightwards over time, even though its responses still land in the libertarian-left quadrant. This may reflect efforts by OpenAI to neutralise ChatGPT's political bias, whether by modifying its training data or by improving neutrality in RLHF.
Although OpenAI will likely keep trying to make ChatGPT less biased, and may even bar it from advocating political views outright, it is theoretically impossible for the model to be unbiased. Fundamentally, LLMs take information from source material, then summarise and paraphrase it into an output. Paraphrasing inevitably alters the diction of the original source, which changes the content in some shape or form even when ChatGPT's algorithm has no intention of being biased. The selection of which opinions and content from the internet to draw on is a further source of bias, even when the output is presented as neutral.
Ultimately, it’s crucial to recognise that ChatGPT does not possess a core set of beliefs like a human does. Its responses are generated based on patterns in the data it has encountered, following specific guidelines established by programmers. As users, we must remain aware of these biases and approach ChatGPT's outputs with a critical eye, understanding that while it can provide valuable insights, it is not infallible.