AI Systems Found to Be Cognitively Biased Like Humans

Artificial intelligence (AI) has transformed industries with its ability to process vast amounts of data and make decisions quickly. However, a recent study published at SIGIR-AP '24 suggests that AI systems may not be as objective as we assume. Researchers found that large language models (LLMs), increasingly used for tasks such as relevance judgment, can be influenced by cognitive biases similar to those exhibited by humans.

The study tested four LLMs, including GPT-3.5 and GPT-4, on relevance judgments over 10 topics from the TREC 2019 Deep Learning passage track collection. Regardless of the condition or model, the LLMs tended to give lower scores to later documents when earlier ones were highly relevant, and higher scores when earlier ones were not.
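To make the setup concrete, the sketch below shows what a batch relevance-judgment prompt of this kind might look like. The prompt wording, the 0-3 grading scale, and the output format are illustrative assumptions, not the authors' exact protocol.

```python
# Minimal sketch of a batch relevance-judgment prompt of the kind the study
# describes: one query, several passages graded in a single request. The
# wording, the 0-3 scale, and the answer format are assumptions, not the
# authors' verbatim prompt.

def build_batch_prompt(query: str, passages: list[str]) -> str:
    """Assemble one prompt that asks the model to grade every passage."""
    lines = [
        f"Query: {query}",
        "Rate each passage's relevance to the query on a 0-3 scale",
        "(0 = irrelevant, 3 = perfectly relevant). Answer as 'Passage i: score'.",
        "",
    ]
    for i, passage in enumerate(passages, start=1):
        lines.append(f"Passage {i}: {passage}")
    return "\n".join(lines)

# The resulting string would be sent to GPT-3.5/GPT-4 or a LLaMA-2 chat
# model as a single message; batching is what exposes earlier judgments
# as context for later ones.
print(build_batch_prompt(
    "what is threshold priming",
    ["Priming occurs when prior stimuli shape later decisions...",
     "Quarterly earnings rose across the retail sector..."],
))
```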

This finding shows that LLM judgments, like human judgments, are subject to threshold priming, a bias that can lead to irrational judgments and problematic decision-making. The authors suggest that researchers and system engineers take potential human-like cognitive biases into account when designing, evaluating, and auditing LLMs in IR tasks and beyond.

The implications are significant: machine assessments cannot simply be assumed objective. Only by understanding how cognitive biases affect AI decision-making can researchers build systems whose judgments are reliable.

Can AI Be Cognitively Biased?

The concept of cognitive biases has been extensively studied across various fields, revealing systematic deviations in thinking that lead to irrational judgments and problematic decision-making. Recently, large language models (LLMs) have shown advanced understanding capabilities but may inherit human biases from their training data. While social biases in LLMs have been well-studied, cognitive biases have received less attention, with existing research focusing on specific scenarios.

The broader impact of cognitive biases on LLMs in various decision-making contexts remains underexplored. A recent study investigated whether LLMs are influenced by the threshold priming effect in relevance judgments, a core task and widely discussed research topic in the Information Retrieval (IR) community. The priming effect occurs when exposure to certain stimuli unconsciously affects subsequent behavior and decisions.

The experiment employed 10 topics from the TREC 2019 Deep Learning passage track collection and varied the relevance scores of earlier documents, the batch length, and the model, covering GPT-3.5, GPT-4, LLaMA2-13B, and LLaMA2-70B. In every combination, later documents received lower scores when earlier ones were highly relevant, and vice versa.
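A simple way to picture the design is as a grid of conditions crossing prelude relevance, batch length, and model. The sketch below enumerates such a grid; the specific batch lengths and model identifiers are illustrative placeholders rather than the study's exact settings.

```python
from itertools import product

# Illustrative enumeration of the experimental grid: the relevance level
# of the earlier ("prelude") documents, the batch length, and the model
# are fully crossed. The batch lengths and model identifiers below are
# placeholders, not the study's exact settings.
prelude_levels = ["high", "low"]  # relevance of the documents judged first
batch_lengths = [5, 10]           # number of documents judged per request
models = ["gpt-3.5-turbo", "gpt-4", "llama2-13b-chat", "llama2-70b-chat"]

conditions = list(product(prelude_levels, batch_lengths, models))
for prelude, length, model in conditions:
    print(f"prelude={prelude:<4} batch={length:>2} model={model}")
print(f"{len(conditions)} condition cells in total")
```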

This demonstrates that LLM judgments, like human judgments, are influenced by threshold priming biases, and it suggests that researchers and system engineers should take potential human-like cognitive biases into account when designing, evaluating, and auditing LLMs in IR tasks and beyond.

The Threshold Priming Effect

The threshold priming effect is a phenomenon in which exposure to certain stimuli unconsciously affects subsequent behavior and decisions. In relevance judgment, it means an LLM may score later documents lower when earlier ones were highly relevant, and vice versa. The same effect has been observed in human decision-making.

In the experiment, 10 topics from the TREC 2019 Deep Learning passage track collection were used to test judgments under different prior-document relevance scores, batch lengths, and models. The score assigned to a given document shifted systematically with the relevance of the documents judged before it, regardless of the combination and model used.
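One way to quantify such a shift is to compare the score a fixed target document receives after a high-relevance prelude against its score after a low-relevance one. The sketch below illustrates that comparison; the score lists are made-up inputs for demonstration, not the study's data.

```python
from statistics import mean

# Toy illustration of the analysis: the same target passage is judged
# after a high-relevance prelude and after a low-relevance prelude, and
# the mean scores are compared. These numbers are invented to show the
# comparison; they are not the study's data.
scores_after_high_prelude = [1, 1, 2, 1, 1, 2, 1, 1]
scores_after_low_prelude = [2, 3, 2, 2, 3, 2, 2, 3]

shift = mean(scores_after_low_prelude) - mean(scores_after_high_prelude)
print(f"mean after high-relevance prelude: {mean(scores_after_high_prelude):.2f}")
print(f"mean after low-relevance prelude:  {mean(scores_after_low_prelude):.2f}")
# A consistently positive shift across topics, batch lengths, and models
# would be the signature of the threshold priming the study reports.
print(f"priming shift: {shift:+.2f}")
```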

This finding has significant implications for the design and evaluation of LLMs in IR tasks and beyond: scores produced in a batch cannot be treated as independent of the documents that preceded them. Researchers and system engineers should take potential human-like cognitive biases into account when designing, evaluating, and auditing LLMs.

The Impact on LLMs

The threshold priming effect makes LLM judgments context-dependent: the score a document receives reflects not only the document itself but also what the model has just judged. Lower scores follow highly relevant predecessors, and higher scores follow irrelevant ones, regardless of the combination and model used.

This is particularly relevant in IR tasks, where LLMs are increasingly used to label document relevance at scale. Primed judgments skew the resulting labels, which can distort evaluation results and downstream decisions in real-world applications.

The study’s results suggest that researchers and system engineers should take potential human-like cognitive biases into account in designing, evaluating, and auditing LLMs. This includes measuring the impact of threshold priming on LLM judgments and developing strategies to mitigate its effects, as sketched below.
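The paper itself does not prescribe a fix here, but two straightforward mitigations follow from the finding: judge each document in isolation so there is no prelude to prime it, or judge several shuffled orderings and aggregate the scores. The sketch below assumes a hypothetical judge_batch(query, passages) wrapper around an LLM call; neither strategy is the authors' method.

```python
import random
from statistics import median

# Two mitigation sketches. judge_batch(query, passages) -> list[int] is a
# hypothetical wrapper around an LLM call, assumed here for illustration;
# it is not part of the study.

def judge_isolated(judge_batch, query, passages):
    """Judge each passage in its own request, so no prelude can prime it."""
    return [judge_batch(query, [p])[0] for p in passages]

def judge_shuffled(judge_batch, query, passages, repeats=5, seed=0):
    """Judge several shuffled orderings and keep the per-passage median."""
    rng = random.Random(seed)
    collected = [[] for _ in passages]
    order = list(range(len(passages)))
    for _ in range(repeats):
        rng.shuffle(order)
        scores = judge_batch(query, [passages[i] for i in order])
        for pos, idx in enumerate(order):
            collected[idx].append(scores[pos])
    return [median(c) for c in collected]
```

Isolated judging removes the priming channel entirely at the cost of more model calls; shuffling keeps batching but averages the order effect out across repetitions.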

The Role of Human Biases

Human biases play a significant role in shaping LLM judgments, particularly in relevance assessment. While social biases in LLMs have been studied extensively, cognitive biases remain comparatively underexplored, with prior work limited to specific scenarios.

The threshold priming effect is a manifestation of human-like cognitive biases in LLMs. It suggests that LLMs may inherit human biases from their training data and exhibit similar patterns of behavior in decision-making tasks.

This has significant implications for the design and evaluation of LLMs, particularly in IR tasks, where relevance assessments are critical: without auditing for human-like cognitive biases, there is little basis for treating LLM judgments as unbiased.

The Future of LLMs

These findings shape how LLMs should be developed for IR and related tasks going forward. Bias-aware design, evaluation, and auditing will need to become standard practice if LLM judgments are to be relied upon.

That means measuring the impact of threshold priming on LLM judgments and building in strategies to mitigate it. The study’s results argue for prioritizing more robust and transparent LLMs that minimize the influence of human-like cognitive biases.

Ultimately, the value of LLMs in these roles depends on the reliability of their decisions, particularly in IR tasks where relevance assessments are critical. By accounting for human-like cognitive biases, researchers and engineers can build LLMs trustworthy enough for real-world applications.

Conclusion

In conclusion, the study demonstrates that LLM judgments, like human judgments, are influenced by threshold priming biases. Researchers and system engineers should therefore consider potential human-like cognitive biases when designing, evaluating, and auditing LLMs in IR tasks and beyond.

The effect is most consequential in IR tasks, where relevance assessments are critical, and it strengthens the case for more robust and transparent LLMs that minimize the influence of such biases.

By accounting for these biases, researchers and engineers can develop LLMs reliable and trustworthy enough for the demands of real-world applications.

Publication details: “AI Can Be Cognitively Biased: An Exploratory Study on Threshold Priming in LLM-Based Batch Relevance Assessment”
Publication Date: 2024-12-08
Authors: Nuo Chen, Jiqun Liu, Xiaoyu Dong, Qijiong Liu, et al.
DOI: https://doi.org/10.1145/3673791.3698420
