Transformer Models Better Predict Human Reading Patterns in Spanish

Transformer models – GPT2, LLaMA-7B, and LLaMA2-7B – predict reading behaviour, specifically gaze duration, in Spanish readers more accurately than earlier language models such as N-grams and LSTMs. Researchers evaluated model performance against predictability metrics derived from eye-tracking data. While these architectures explain a greater proportion of variance in gaze duration, they still fall short of fully replicating the cognitive processes underlying human reading, indicating a divergence between artificial and biological language processing.

The human capacity for rapid and seemingly effortless reading belies a complex interplay of cognitive processes. Understanding how the brain anticipates and processes language remains a central challenge in cognitive science. Researchers are increasingly turning to artificial intelligence, specifically large language models, to model these processes and gain insight into the mechanisms underlying reading comprehension. A team led by Bruno Bianchi, Fermín Travi, and Juan E. Kamienkowski, all affiliated with the Universidad de Buenos Aires and the Laboratorio de Inteligencia Artificial Aplicada CONICET-Universidad de Buenos Aires, detail their investigation into this relationship in their paper, “Modeling cognitive processes of natural reading with transformer-based Language Models”. Their work evaluates the capacity of advanced transformer models – GPT2, LLaMA-7B, and LLaMA2-7B – to predict human reading behaviour, as measured by gaze duration, in Spanish readers, and compares their performance to earlier models.

Recent research has examined the capacity of advanced transformer models – specifically GPT2, LLaMA-7B, and LLaMA2-7B – to predict human reading behaviour, focusing on gaze duration as a measurable metric. The study utilised Spanish text from the Rioplatense corpus.

Transformer models consistently outperformed earlier architectures, such as N-gram models and Long Short-Term Memory networks (LSTMs), in predicting gaze duration. This suggests an enhanced ability to model the cognitive processes underpinning reading. Gaze duration – the length of time a reader fixates on a particular word or section of text – served as a proxy for cognitive load; longer durations indicate greater processing difficulty.
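The standard bridge between a language model and gaze duration is word surprisal: the negative log-probability the model assigns to each word given its context. The sketch below uses made-up probabilities purely for illustration; in the study itself such values would come from GPT2 or LLaMA next-word predictions over Spanish text.

```python
import math

# Toy conditional probabilities p(word | context) -- illustrative values only,
# not drawn from any real model or corpus.
p_next = {
    ("el", "perro"): 0.12,
    ("perro", "corre"): 0.05,
    ("corre", "rápido"): 0.30,
}

def surprisal_bits(prob):
    """Surprisal in bits: less predictable words carry higher surprisal,
    and tend to receive longer gaze durations."""
    return -math.log2(prob)

for (ctx, word), p in p_next.items():
    print(f"p({word} | {ctx}) = {p:.2f} -> {surprisal_bits(p):.2f} bits")
```

Under this linking hypothesis, a model that assigns higher probability to the words readers actually encounter should yield surprisal values that track fixation times more closely.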

The models successfully accounted for a statistically significant proportion of the variance in gaze duration, demonstrating their capacity to approximate certain aspects of human reading. However, they did not fully explain all observed variance. This indicates that factors beyond purely statistical language modelling – such as prior knowledge, contextual integration, and individual reading strategies – also influence how humans process text.

The research compared the predictive performance of the different language models against empirically measured gaze durations obtained from human readers processing the Rioplatense texts. The Rioplatense corpus is a collection of written Spanish texts used in eye-tracking research. Performance was assessed using statistical measures of variance explained.

Researchers suggest that integrating explicit models of cognitive mechanisms – such as attention, working memory, and semantic processing – into transformer architectures could further improve their predictive power. This could lead to a more comprehensive understanding of the complex interplay between linguistic input and cognitive processes during reading. The study adhered to principles of open science, utilising publicly available datasets and providing open access to all code, ensuring transparency and reproducibility of the findings.

👉 More information
🗞 Modeling cognitive processes of natural reading with transformer-based Language Models
🧠 DOI: https://doi.org/10.48550/arXiv.2505.11485
Dr. Donovan

