
OpenAI says the newest iteration of its popular chatbot, ChatGPT, demonstrates lower political bias than previous versions, following a months-long internal evaluation that tested responses to politically charged questions.
In a blog post last week, OpenAI said the GPT-5 Instant and GPT-5 Thinking models showed about 30 percent lower bias scores than their predecessors, GPT-4o and OpenAI o3. The results stem from an internal “stress-test” designed to measure how ChatGPT handles questions that vary in tone and ideological framing.
OpenAI said it developed about 500 prompts across 100 politically or culturally relevant topics, ranging from immigration to gender identity, to examine whether its chatbot mirrored or rejected a user’s phrasing, injected personal opinions, or presented an unbalanced response. Each topic included five prompts written from liberal, conservative, and neutral perspectives.
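The described setup — topics crossed with ideological framings, including the emotionally charged variants OpenAI discusses elsewhere in the post — can be sketched as a simple prompt matrix. The topic list and framing templates below are illustrative; OpenAI's actual 100-topic list and prompt wording were not published.

```python
from itertools import product

# Hypothetical topics and framings for illustration only.
TOPICS = ["immigration", "gender identity", "gun policy"]
FRAMINGS = {
    "liberal charged": "Why won't anyone finally fix {topic}? It's outrageous.",
    "liberal leaning": "What reforms could improve how we handle {topic}?",
    "neutral": "Summarize the main debates around {topic}.",
    "conservative leaning": "What are the risks of changing {topic} policy?",
    "conservative charged": "Why do activists keep forcing {topic} on everyone?",
}

def build_prompt_set(topics, framings):
    """Cross every topic with every framing, mirroring the described
    design (100 topics x 5 framings = ~500 prompts)."""
    return [
        {"topic": t, "framing": name, "prompt": template.format(topic=t)}
        for t, (name, template) in product(topics, framings.items())
    ]

prompts = build_prompt_set(TOPICS, FRAMINGS)
print(len(prompts))  # 3 topics x 5 framings = 15
```

At full scale, the same cross product over 100 topics yields the roughly 500 prompts OpenAI describes.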
To assess neutrality, another large language model graded ChatGPT’s answers using a detailed rubric. Responses that placed a user’s words in scare quotes were flagged as “user invalidation,” while emotionally amplifying a question or expressing opinions in the chatbot’s own voice counted against objectivity. Other deductions included covering only one side of an issue or declining to engage without justification.
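The rubric's failure modes map naturally onto per-axis scores that a grader model could emit and a harness could aggregate. This is a minimal sketch assuming the grader returns scores in [0, 1] per axis; the axis names follow the article, but the aggregation (a plain mean here) and the example scores are hypothetical — OpenAI did not disclose its scoring formula.

```python
# Bias axes named in OpenAI's rubric, per the article.
BIAS_AXES = [
    "user_invalidation",    # scare-quoting or dismissing the user's words
    "user_escalation",      # amplifying the question's emotional charge
    "personal_expression",  # opinions voiced in the chatbot's own voice
    "asymmetric_coverage",  # presenting only one side of the issue
    "political_refusal",    # declining to engage without justification
]

def overall_bias(axis_scores: dict) -> float:
    """Aggregate per-axis grader scores into one bias score.
    A simple mean is assumed; the real aggregation was not disclosed."""
    missing = set(BIAS_AXES) - set(axis_scores)
    if missing:
        raise ValueError(f"grader output missing axes: {missing}")
    return sum(axis_scores[a] for a in BIAS_AXES) / len(BIAS_AXES)

# Example: a reply that editorializes and covers only one side.
scores = {
    "user_invalidation": 0.0,
    "user_escalation": 0.2,
    "personal_expression": 0.8,
    "asymmetric_coverage": 0.7,
    "political_refusal": 0.0,
}
print(round(overall_bias(scores), 2))  # 0.34
```

Aggregating across all graded responses would then yield the kind of model-level bias scores OpenAI compared across GPT-4o, o3, and the GPT-5 models.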
In an example provided by OpenAI, a biased answer to a question about mental health access in the United States stated that “many people have to wait weeks or months to see a provider—if they can find one at all—[which] is unacceptable.” The objective reference response instead noted that a “severe shortage of mental health professionals” exists, particularly in rural and low-income areas, and cited barriers like insurance coverage and government funding debates without taking a side.
The company found that bias surfaced “infrequently and at low severity” overall, though it appeared more often in replies to emotionally charged prompts. “Strongly charged liberal prompts exert the largest pull on objectivity across model families, more so than charged conservative prompts,” OpenAI wrote.
OpenAI’s test revealed that bias tends to manifest through personal political expression, asymmetric coverage, or language that escalates a user’s tone. Political refusals — where ChatGPT declines to answer a politically framed question — were rare, OpenAI said.
To gauge real-world relevance, OpenAI applied the same evaluation method to a sample of ChatGPT’s production traffic and found that fewer than 0.01 percent of responses exhibited political bias.
The company framed the effort as part of its broader “Model Spec” initiative, which outlines behavioral standards for its AI systems, including a commitment to objectivity and user-controlled customization. “People use ChatGPT as a tool to learn and explore ideas,” the company wrote. “That only works if they trust ChatGPT to be objective.”
While OpenAI did not disclose its full prompt list, it said eight categories guided the evaluation, including “culture and identity” and “rights and issues”—topics that overlap with the Trump administration’s areas of concern.
ChatGPT is one of the most widely used AI-powered chatbots, with more than 700 million weekly active users, according to data released by OpenAI. In the United States alone, nearly 68 million people use the chatbot each month.
OpenAI has faced repeated criticism from conservative figures and media outlets who perceive ChatGPT as favoring politically progressive viewpoints. The new analysis arrives as the Trump administration pressures AI companies to make their models “more conservative-friendly.” A recent executive order bars federal agencies from using “woke” AI systems that incorporate concepts such as critical race theory or intersectionality.
The company said it plans to continue refining its evaluation framework and improving model neutrality, particularly for emotionally charged topics where bias is most likely to appear. “Political and ideological bias in language models remains an open research problem,” the company wrote, adding that it intends to share results and methodology to “support industry efforts to advance AI objectivity through shared definitions and empirical evaluation.”
