The increasing prevalence of large language models (LLMs) in everyday life raises important questions about potential political biases embedded within these systems. Jieying Chen, Karen de Jong, and Andreas Poole, from Vrije Universiteit Amsterdam and the University of Oslo, alongside Jan Burakowski, Elena Elderson Nosti, and Joep Windt, address this critical issue in a new study exploring ideological leanings in LLMs. Their research introduces a novel methodology for benchmarking political bias, aligning LLM-generated voting predictions with actual parliamentary voting records from the Netherlands, Norway, and Spain. This work is significant because it moves beyond studies of social bias to directly assess how LLMs position themselves within established political landscapes. Through detailed analysis, the team demonstrates that current LLMs consistently exhibit left-leaning or centrist tendencies and display negative biases towards right-conservative parties, offering a crucial step towards transparent auditing of these powerful technologies.
The researchers use parliamentary voting records to assess whether these models favour specific political viewpoints, analysing LLM-generated text produced in response to prompts about policy issues that were subject to parliamentary votes. Sentiment analysis is then performed on the generated text, yielding a ‘bias score’ that captures the correlation between the LLM’s sentiment and the voting behaviour of different parties. Results indicate that LLMs do exhibit statistically significant political biases, varying in strength and direction across models and policy issues.
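The paper does not publish code, but the scoring idea can be sketched in a few lines. The snippet below assumes sentiment scores in [-1, 1] for the text the LLM generates per motion, and encodes a party's recorded votes as +1 (for) and -1 (against), so that the bias score reduces to a Pearson correlation. All names and numbers are illustrative, not the authors' implementation.

```python
# Minimal sketch of the sentiment-to-vote correlation described above.
# All names and data are illustrative; the paper's exact pipeline may differ.
import numpy as np
from scipy.stats import pearsonr

def bias_score(sentiments: np.ndarray, party_votes: np.ndarray) -> float:
    """Correlate LLM sentiment per motion with one party's recorded votes.

    sentiments  -- sentiment of the LLM's generated text per motion, in [-1, 1]
    party_votes -- the party's recorded vote per motion (+1 for, -1 against)
    """
    r, _ = pearsonr(sentiments, party_votes)
    return r

# Hypothetical example: five motions, one party
sentiments = np.array([0.8, -0.3, 0.5, -0.6, 0.2])
votes = np.array([1, -1, 1, -1, 1])
print(f"bias score: {bias_score(sentiments, votes):+.2f}")
```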
Further analysis reveals that the observed biases are not solely attributable to the prevalence of certain viewpoints in the training data. The researchers demonstrate that even after controlling for the frequency of different perspectives in the training corpus, a residual bias remains, suggesting that the architecture or training process of LLMs may inherently amplify existing political leanings. The magnitude of bias also appears to be influenced by the specific policy domain, with certain issues eliciting stronger partisan responses from the models than others. The implications of these findings are significant, particularly concerning the potential for LLMs to perpetuate or exacerbate existing political polarisation.
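One way to make the residual-bias claim concrete is to regress each party's bias score on an estimate of how often its viewpoint appears in the training corpus and then inspect what the frequency term fails to explain. The sketch below is a hedged illustration with made-up numbers; the paper's actual control methodology may differ.

```python
# Illustrative check for residual bias after controlling for how often a
# viewpoint appears in the training data. The covariate here is hypothetical;
# the paper's control variable may be defined differently.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-party data:
bias_scores = np.array([0.42, 0.31, -0.18, -0.35, 0.10])             # observed bias
viewpoint_freq = np.array([[0.30], [0.25], [0.15], [0.10], [0.20]])  # est. corpus share

model = LinearRegression().fit(viewpoint_freq, bias_scores)
residuals = bias_scores - model.predict(viewpoint_freq)

# If corpus frequency fully explained the bias, residuals would hover near zero.
print("residual bias per party:", np.round(residuals, 3))
```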
Parliamentary Voting Records Assess LLM Bias
Researchers pioneered a novel methodology for evaluating political bias in large language models (LLMs) by directly aligning model predictions with official parliamentary voting records, addressing a gap in bias assessment by moving beyond social biases to examine political leanings. The study constructed three distinct national benchmarks: PoliBiasNL, PoliBiasNO, and PoliBiasES, comprising parliamentary motions and votes from the Netherlands, Norway, and Spain respectively. The core of the methodology involved prompting LLMs to predict votes on these motions, then comparing those predictions against the actual voting records of each party.
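In code, that core loop is straightforward: prompt the model for a vote on each motion, then score agreement against each party's record. The sketch below uses a placeholder `query_llm` function standing in for whatever chat API is used; the prompt wording and vote parsing are assumptions, not the authors' exact setup.

```python
# Sketch of the core evaluation loop: prompt a model for a vote on each
# motion and measure agreement with each party's recorded votes.
# `query_llm` is a placeholder for whatever chat/completion API is in use.
from typing import Callable

def vote_agreement(motions: list[str],
                   party_votes: dict[str, list[str]],
                   query_llm: Callable[[str], str]) -> dict[str, float]:
    """Return the fraction of motions where the LLM's predicted vote
    matches each party's actual vote ('for' or 'against')."""
    predictions = []
    for motion in motions:
        prompt = (f"Motion: {motion}\n"
                  "Would you vote 'for' or 'against'? Answer with one word.")
        answer = query_llm(prompt).strip().lower()
        # Naive parse for the sketch; real pipelines need stricter handling.
        predictions.append("for" if answer.startswith("for") else "against")

    return {
        party: sum(p == v for p, v in zip(predictions, votes)) / len(votes)
        for party, votes in party_votes.items()
    }
```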
To facilitate insightful comparisons, the team developed a technique to visualise both LLM and party ideologies within a shared two-dimensional space based on the Chapel Hill Expert Survey (CHES) dimensions. By mapping voting-based positions onto the CHES framework, the research enabled direct, interpretable comparisons between the models and real-world political actors. Experiments employed state-of-the-art LLMs, rigorously testing their performance across the three national benchmarks, revealing consistent left-leaning or centrist tendencies in the assessed models. Furthermore, the study identified clear negative biases towards right-conservative parties, demonstrating the capacity of the methodology to detect subtle but significant political leanings.
LLM Political Bias Benchmarks Across Europe
Scientists have developed a new methodology for evaluating political bias in large language models (LLMs) by aligning their voting predictions with actual parliamentary voting records. This work introduces three national benchmarks (PoliBiasNL, PoliBiasNO, and PoliBiasES) comprising parliamentary motions and votes from the Netherlands, Norway, and Spain respectively. The research team assessed ideological tendencies and political-entity bias in LLM behaviour across these diverse datasets. Experiments revealed consistent left-leaning or centrist tendencies in state-of-the-art LLMs when presented with political motions, along with a clear negative bias towards right-conservative parties across all three national benchmarks.
To visualise these ideological positions, the team proposed a method to project both LLMs and political parties into a shared two-dimensional CHES (Chapel Hill Expert Survey) space, linking voting-based positions to the established CHES dimensions to facilitate direct and interpretable comparisons. The study recovered a high degree of variance in the CHES left-right and GAL-TAN dimensions from parliamentary voting patterns alone, demonstrating the richness of the information contained in the voting data. Using Partial Least Squares (PLS) regression, the scientists computed PLS component scores for each LLM and mapped them onto the CHES space, revealing that, in the Netherlands, LLMs aligned with progressive and centre-left parties.
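The PLS step can be sketched as follows: fit a regression from the party-by-motion vote matrix to the parties' known CHES scores, then push an LLM's predicted votes through the same mapping. The shapes and data below are synthetic; this is an illustration of the technique, not the paper's code.

```python
# Minimal sketch of the PLS projection: learn a mapping from party voting
# patterns to the two CHES dimensions, then place an LLM's predicted votes
# in the same space. Matrix shapes and data are illustrative.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n_parties, n_motions = 12, 200

# X: party-by-motion vote matrix (+1 for, -1 against); Y: CHES scores
# (left-right, GAL-TAN) for the same parties, e.g. from the expert survey.
X = rng.choice([-1.0, 1.0], size=(n_parties, n_motions))
Y = rng.uniform(0, 10, size=(n_parties, 2))

pls = PLSRegression(n_components=2).fit(X, Y)

# An LLM's predicted votes on the same motions, projected into CHES space.
llm_votes = rng.choice([-1.0, 1.0], size=(1, n_motions))
llm_scores = pls.transform(llm_votes)   # PLS component scores
llm_ches = pls.predict(llm_votes)       # estimated position on the CHES axes
print("LLM CHES position (left-right, GAL-TAN):", np.round(llm_ches, 2))
```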
LLM Bias Mapped via Parliamentary Votes
This research introduces a novel framework for evaluating political bias in large language models (LLMs) using parliamentary voting records from the Netherlands, Norway, and Spain. By aligning LLM-generated voting predictions with actual parliamentary data, the authors constructed benchmarks to assess ideological tendencies and biases towards specific political entities. A key contribution is a method for visualising both LLM and party ideologies within a shared political space, facilitating direct comparison and interpretation. Experiments utilising these benchmarks consistently demonstrate that current LLMs exhibit centre-left or progressive leanings, coupled with a discernible negative bias towards right-conservative parties, a pattern which persists even when prompts are rephrased.
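A simple way to probe that robustness is to pose each motion under several paraphrased prompts and check that the predicted vote stays stable. The helper below is a hypothetical sketch using the same placeholder `query_llm` as in the earlier snippets.

```python
# Illustrative robustness check: rephrase each prompt several ways and
# verify the predicted vote is stable. `query_llm` is a placeholder API.
from collections import Counter
from typing import Callable

def stable_vote(motion: str, query_llm: Callable[[str], str],
                n_variants: int = 3) -> tuple[str, float]:
    """Return the majority vote across paraphrases and its consistency."""
    templates = [
        "Motion: {m}\nWould you vote 'for' or 'against'?",
        "Consider this parliamentary motion: {m}\nYour vote (for/against)?",
        "If you were an MP, how would you vote on: {m}? Answer for or against.",
    ]
    votes = []
    for t in templates[:n_variants]:
        answer = query_llm(t.format(m=motion)).strip().lower()
        votes.append("for" if answer.startswith("for") else "against")
    vote, count = Counter(votes).most_common(1)[0]
    return vote, count / len(votes)
```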
The authors acknowledge that their findings may not generalise to all future models or political contexts, and note the need for ongoing evaluation as LLM architectures evolve. Future research will focus on expanding the benchmarks to additional legislatures, tracking ideological shifts over time, and developing strategies to mitigate identified biases.
👉 More information
🗞 Uncovering Political Bias in Large Language Models using Parliamentary Voting Records
🧠 ArXiv: https://arxiv.org/abs/2601.08785
