The increasing prevalence of misinformation and distrust in institutions prompts critical questions about the underlying psychological tendencies of artificial intelligence, particularly large language models. Francesco Corso, Francesco Pierri, and Gianmarco De Francisci Morales, from Politecnico Di Milano and CENTAI, investigate whether these models exhibit a predisposition toward conspiratorial thinking. Their research demonstrates that language models show partial agreement with elements of conspiracy belief and, crucially, can be easily steered toward conspiratorial responses with targeted prompts. This susceptibility, coupled with the discovery of latent demographic biases when models are conditioned with socio-demographic attributes, underscores the need for careful evaluation of the psychological dimensions embedded within these powerful technologies and highlights potential risks associated with their deployment in sensitive areas.
LLMs Model Conspiracy Beliefs and Reasoning
The core of the study involves creating personas within the LLMs, representing individuals with defined demographic profiles and pre-existing beliefs. These personas are then presented with survey questions, and the LLM generates responses and, crucially, explains the reasoning behind them. Researchers analyzed these justifications to identify common themes and patterns, using word shift plots to visually represent differences in language used by different demographic groups. These plots highlight the words most strongly associated with each group, revealing subtle differences in how they frame their beliefs.
Conspiracy Mindset Evaluation in Language Models
This study pioneers a new method for evaluating conspiratorial thinking in large language models (LLMs) by adapting validated psychological surveys for use with these artificial intelligence systems. Researchers compiled a comprehensive dataset of 132 items from four established conspiracy mindset surveys, including the Generic Conspiracist Belief Scale and the Conspiracy Mentality Scale. To ensure data quality, they employed a hybrid method combining bag-of-words analysis with sentence-BERT embeddings, refining the dataset to 126 representative items. The team then used a k-means clustering algorithm on sentence-BERT embeddings to identify semantically coherent groups of survey items, informing model conditioning and evaluation.
The analysis revealed five overarching themes: “There are no coincidences”, “Power and control of secret groups or governments”, “Mistrust in science, scientists, and technology”, “Truth is hidden from the public”, and “UFOs and aliens”. Three independent coders validated the clustering, achieving substantial agreement and ensuring a definitive label for each survey item. Researchers also incorporated “red-herring” items to gauge engagement and the Open-Minded Thinking survey to measure reflective thought, providing a comprehensive dataset for evaluating LLM responses. This approach allows researchers to move beyond simple prediction and explore how LLMs respond to nuanced psychological constructs and how conditioning strategies influence their outputs. The study employed a survey prediction approach, prompting the models to predict individual responses to these items, optionally conditioned on the identified thematic categories, enabling a detailed examination of conspiratorial tendencies and potential biases within the LLMs.
Language Models Exhibit Conspiracy Inclinations and Biases
This research investigates whether large language models exhibit conspiratorial tendencies, revealing a complex interplay between inherent biases and susceptibility to manipulation. Researchers administered validated psychometric surveys to language models, assessing their predisposition toward conspiracy beliefs without any initial conditioning. Results demonstrate that models exhibit partial agreement with elements of conspiratorial thinking even in a neutral state, establishing a baseline for further investigation. To explore potential demographic biases, the team simulated users with diverse ‘personas’ and prompted models to adopt these perspectives.
This process revealed uneven effects, exposing latent biases in how models respond based on simulated demographic attributes and correlating specific social groups with varying levels of agreement with conspiracy beliefs. Further experiments focused on conditioning models with prompts embedding partial conspiracy beliefs, measuring the impact on subsequent responses. These tests proved that language models are easily steered toward conspiratorial reasoning, highlighting their malleability and the potential for amplification of such beliefs. The research then enriched this conditioning procedure with socio-demographic attributes, confirming that previously identified demographic biases persist even when models are actively steered toward conspiratorial framings. Overall, these findings map the presence of conspiratorial thinking within language models, demonstrating both an inherent predisposition and a vulnerability to manipulation. This work has fundamental implications for safety, bias mitigation, and the responsible use of language models as tools for simulating human cognition.
LLMs Exhibit and Amplify Conspiracy Theories
This research demonstrates that large language models exhibit tendencies toward conspiratorial thinking and can be influenced by socio-demographic conditioning. By adapting validated psychological surveys for use with these models, scientists explored whether LLMs possess inherent conspiratorial leanings, display biases in these beliefs, and can be steered toward conspiratorial reasoning through targeted prompts. The findings reveal that LLMs already show partial alignment with elements of conspiratorial thought, suggesting they implicitly encode higher-level cognitive constructs. While conditioning the models with socio-demographic attributes generally moderates responses, the effect varies across groups, indicating the presence of demographic biases in how LLMs associate personas with conspiratorial mindsets.
Importantly, the study shows that targeted prompts can readily shift models toward adopting more conspiratorial viewpoints, highlighting both the potential for using LLMs to simulate human conspiratorial thinking and the associated safety risks. This work contributes to the growing field of computational social science by demonstrating the capacity of LLMs to reproduce complex psychological constructs. The authors acknowledge certain limitations, including the combination of multiple psychological scales into a single item set and the simplification of human demographics through binary categorisation. Future research should investigate the interplay between conspiratorial thinking and other cognitive traits, assess the consistency of responses over time, and explore the behaviour of these models within simulated social networks. Further work could also focus on refining item sets specifically for synthetic respondents and adopting more nuanced demographic representations.
👉 More information
🗞 Do Androids Dream of Unseen Puppeteers? Probing for a Conspiracy Mindset in Large Language Models
🧠 ArXiv: https://arxiv.org/abs/2511.03699
