Large Language Models Achieve 60% Accuracy Predicting Diabetes Control from Patient Stories

Researchers are tackling the challenge of incorporating crucial social determinants of health (SDOH) into effective Type 2 Diabetes (T2D) risk prediction, an area often hampered by a lack of readily available structured data. Sasha Ronaghi, Prerit Choudhary, and David H Rehkopf, alongside Bryant Lin, from their respective institutions, present a novel approach utilising large language models (LLMs) to unlock valuable insights hidden within unstructured patient narratives. This study demonstrates how LLMs can extract meaningful, structured SDOH information from patient life stories, gathered via interviews with 65 older adults living with T2D, and integrate it into conventional risk prediction models, achieving up to 60% accuracy in predicting diabetes control levels without relying on traditional laboratory biomarkers. By transforming complex patient experiences into actionable data, this work offers a scalable solution for improving clinical decision-making and addressing health inequities in diabetes care.

This study explored how LLMs could transform these narratives into both concise qualitative summaries for clinical interpretation and structured quantitative SDOH ratings suitable for advanced risk prediction modelling. The team achieved this by employing retrieval-augmented generation with LLMs, enabling a scalable method to capture the complexity of individual patient experiences, something often missed by traditional, structured screening tools.
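
The article does not reproduce the authors' prompts or retrieval code, so the sketch below only illustrates, under stated assumptions, what retrieval-augmented generation over a long interview transcript might look like: chunk the interview, retrieve the chunks most relevant to an SDOH subtopic by embedding similarity, and rate that subtopic from the retrieved evidence only. The model names, the 1-5 rating scale, and the prompt wording are illustrative assumptions, not the authors' implementation.

```python
# Minimal RAG sketch (illustrative, not the authors' code): retrieve the
# transcript chunks most relevant to an SDOH subtopic, then ask the LLM
# to rate that subtopic using only the retrieved evidence.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible endpoint, e.g. a secure gateway

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

def retrieve(transcript: str, query: str, chunk_words: int = 200, k: int = 4) -> list[str]:
    # Split the interview into fixed-size word chunks and rank them by
    # cosine similarity to the subtopic query.
    words = transcript.split()
    chunks = [" ".join(words[i:i + chunk_words]) for i in range(0, len(words), chunk_words)]
    chunk_vecs, query_vec = embed(chunks), embed([query])[0]
    scores = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

def rate_subtopic(transcript: str, subtopic: str) -> str:
    # The 1-5 scale and prompt wording are assumptions for this sketch.
    evidence = "\n---\n".join(retrieve(transcript, subtopic))
    prompt = (
        f"Using only the interview excerpts below, rate the patient's risk related to "
        f"'{subtopic}' on a 1-5 scale and give a one-sentence justification.\n\n{evidence}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content
```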

The research involved analysing these patient life stories, which ranged from 236 to 7,366 words, using LLMs to generate actionable insights and structured data points related to SDOH. These structured SDOH ratings were then integrated with conventional laboratory biomarkers and used as inputs for various machine learning models, including Ridge, Lasso, Random Forest, and XGBoost, to assess their predictive power in conventional risk prediction workflows. The study population comprised a diverse cohort, with 46.2% identifying as Asian American and Pacific Islander, 32.3% as White, 12.3% as Hispanic, 7.7% as Black, and 6.2% as Middle Eastern/North African, reflecting a commitment to inclusivity and addressing health disparities within underrepresented communities. Notably, the mean most recent A1C level among participants was 6.8, providing a baseline for evaluating diabetes control.
Experiments revealed that LLMs could predict a patient’s level of diabetes control (low, medium, or high) directly from interview text with an accuracy of 60%, even after redacting A1C values. This breakthrough demonstrates the potential of LLMs to independently assess patient health status based on nuanced, qualitative data, offering a valuable supplement to traditional quantitative measures. The study’s methodology involved recruiting participants primarily aged 65-69 (28.6%), 70-74 (25.4%), and 75-79 (27.0%), ensuring representation across a key demographic often impacted by T2D. The study employed Stanford SecureGPT, a platform enabling secure access to ChatGPT models for sensitive health information, to apply LLMs in three distinct ways.

First, researchers utilized LLMs to extract structured, free-form SDOH data from the interview transcripts, transforming qualitative narratives into quantifiable features for risk prediction modeling. Simultaneously, the team harnessed LLMs to generate concise, actionable qualitative summaries of each patient’s life story, providing clinicians with readily interpretable insights into individual patient needs. Furthermore, the study innovatively evaluated the predictive power of both the extracted SDOH ratings and the original narrative text itself for assessing diabetes control. Experiments incorporated comprehensive clinical data, including full electronic health records, alongside the interview narratives; the mean most recent A1C level across participants was 6.83 (median 6.6, standard deviation 1.05, range 4.5-10.3).
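
As a hedged illustration of the first two uses (structured ratings plus a clinician-facing summary), one plausible approach is to ask the model for a fixed JSON schema and parse the response. The factor names, the 1-5 scale, and the prompt below are assumptions for the sketch, not the study's published schema.

```python
# Illustrative sketch of structured SDOH extraction (assumed schema, not the
# authors' exact prompt): the model returns JSON ratings plus a short summary.
import json
from openai import OpenAI

client = OpenAI()

# Example factor names; the study tracked many SDOH subtopics, these are assumed.
FACTORS = ["income_level", "family_support", "social_network", "medication_adherence"]

def extract_sdoh(transcript: str) -> dict:
    prompt = (
        "From the patient interview below, return JSON with two keys: "
        f"'ratings' (an object mapping each of {FACTORS} to an integer 1-5, "
        "or null if not discussed) and 'summary' (3-4 sentences a clinician "
        "could act on).\n\nInterview:\n" + transcript
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},  # request machine-readable output
    )
    return json.loads(resp.choices[0].message.content)

# Hypothetical usage: ratings = extract_sdoh(open("patient_012.txt").read())["ratings"]
```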

The extracted structured SDOH ratings were then integrated with traditional laboratory biomarkers as inputs to four distinct machine learning models (Ridge, Lasso, Random Forest, and XGBoost), demonstrating how unstructured data can seamlessly integrate into conventional risk prediction workflows. Finally, the team evaluated the LLMs’ ability to predict a patient’s level of diabetes control (low, medium, high) directly from the interview text after redacting A1C values, achieving an impressive 60% accuracy. Notably, the study cohort exhibited significant representation from Asian American and Pacific Islander (AAPI) individuals (46.2%), addressing a critical gap in diabetes research where this population is often underrepresented. This deliberate focus on AAPI populations, rather than treating them as an afterthought, represents a methodological innovation, aiming to develop more equitable and effective diabetes prediction tools for diverse communities. The work demonstrates a scalable approach to clinical risk models and decision-making, translating unstructured SDOH-related data into structured insights and paving the way for more holistic and patient-centered diabetes management.
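
To make the integration step concrete, here is a minimal, hypothetical sketch of joining per-patient SDOH ratings with laboratory values into a single feature table for the regression models; the column names and values are invented for illustration only.

```python
# Hypothetical feature assembly: join per-patient SDOH ratings (from the LLM)
# with laboratory biomarkers so both feed the same regression models.
import pandas as pd

sdoh = pd.DataFrame(
    {"patient_id": [1, 2], "income_level": [2, 4], "family_support": [5, 3]}
)
labs = pd.DataFrame(
    {"patient_id": [1, 2], "glucose": [110, 145], "hdl": [52, 38],
     "ldl": [98, 130], "triglycerides": [140, 210], "creatinine": [0.9, 1.2]}
)
target = pd.DataFrame({"patient_id": [1, 2], "a1c": [6.1, 7.8]})

features = sdoh.merge(labs, on="patient_id")    # SDOH + biomarker inputs
data = features.merge(target, on="patient_id")  # A1C as the regression target
X, y = data.drop(columns=["patient_id", "a1c"]), data["a1c"]
```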

LLMs predict diabetes control from patient interviews with 60% accuracy

Scientists achieved 60% accuracy in predicting diabetes control levels directly from patient interview text, demonstrating a novel application of large language models (LLMs) in healthcare. The research team collected unstructured interviews from 65 patients aged 65 and older with Type 2 Diabetes (T2D), focusing on their lived experiences, social context, and diabetes management, to explore the potential of LLMs to extract valuable social determinants of health (SDOH) information. These narratives underwent analysis using LLMs to produce both concise qualitative summaries for clinical interpretation and structured quantitative SDOH ratings for use in risk prediction modeling. The study meticulously documented the coverage of various SDOH subtopics, revealing that key areas like income level, family support, social networks, and medication adherence were discussed in 90% or more of the interviews.

Experiments revealed that LLMs could effectively translate unstructured SDOH-related data into structured insights, offering a scalable approach to clinical risk models and decision-making. The team employed ChatGPT-4o to extract numerical ratings for each patient’s SDOH risk factors from the interviews, subsequently utilizing these ratings alongside traditional laboratory biomarkers (triglycerides, high-density lipoprotein (HDL), low-density lipoprotein (LDL), glucose, and creatinine) in machine learning models. Models included Ridge, Lasso, Random Forest, and XGBoost, with the aim of predicting A1C levels as a continuous target variable; the researchers noted that glucose, lipid panels, and creatinine are key predictors of diabetes onset, progression, and related complications. Approximately 17% of SDOH factor values were missing across patients; K-Nearest Neighbors imputation was used to fill these gaps, and the input data were scaled to the range [0, 1] to ensure robust predictive modelling.
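
A minimal sketch of the preprocessing and modelling steps described above, run on synthetic data with scikit-learn and XGBoost; the actual feature set, hyperparameters, and cross-validation scheme are not detailed in the article, so the values here are placeholders.

```python
# Sketch of the described pipeline: KNN imputation for missing SDOH ratings,
# scaling inputs to [0, 1], then Ridge, Lasso, Random Forest and XGBoost
# regressors predicting A1C, compared by cross-validated R^2.
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import Ridge, Lasso
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(65, 10))            # stand-in for SDOH ratings + lab values
X[rng.random(X.shape) < 0.17] = np.nan   # ~17% missing, as reported in the study
y = 6.8 + 0.5 * rng.normal(size=65)      # stand-in A1C values (synthetic)

models = {
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "xgboost": XGBRegressor(n_estimators=200, max_depth=3, random_state=0),
}
for name, model in models.items():
    pipe = make_pipeline(KNNImputer(n_neighbors=5), MinMaxScaler(), model)
    r2 = cross_val_score(pipe, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {r2:.2f}")
```

With only 65 participants, scores from a pipeline like this would be noisy, which is consistent with the sample-size limitation the authors acknowledge.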

Data shows that tree-based models, specifically Random Forest and XGBoost, were particularly useful for feature importance analysis, with the Gini Importance metric quantifying the contribution of each variable to model predictions. Model performance was assessed using the R² score, measuring the proportion of variance in A1C levels predictable from the independent variables, and hyperparameter tuning was performed using grid search with cross-validation to optimize the R² score. Furthermore, the team tested the ability of several LLMs (ChatGPT-4o, o1, o1-mini, and DeepSeek’s r1) to predict a patient’s level of diabetes control (low, medium, high) after removing all A1C mentions from the interview transcripts. Results demonstrate that LLMs can infer diabetes control from interview content alone, with patients categorised into low, medium, and high control levels according to A1C thresholds (one threshold at an A1C of 7.5, a category comprising 16.9% of participants). Qualitative analysis of the LLM-generated summaries indicated relevance to the main topic, although some repetition across similar subtopics was observed, highlighting a potential area for refinement in guiding the LLM to retrieve more comprehensive and specific insights. This work establishes a pathway for integrating unstructured narrative data into conventional risk prediction workflows, potentially enhancing diabetes management and patient care.
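
The redaction and classification steps are not published as code, so the following is only a plausible sketch: explicit A1C mentions are stripped with a simple regular expression, and the model is asked for a single low/medium/high label. The regex, prompt wording, and model name are assumptions.

```python
# Illustrative sketch (not the authors' code): strip A1C mentions from a
# transcript, then ask the LLM to classify diabetes control as low/medium/high.
import re
from openai import OpenAI

client = OpenAI()

def redact_a1c(text: str) -> str:
    # Remove explicit A1C values such as "A1C of 7.2" or "a1c was 6.8%".
    return re.sub(r"\b[aA]1[cC]\b[^.?!]*\d+(\.\d+)?%?", "[A1C REDACTED]", text)

def predict_control(transcript: str, model: str = "gpt-4o") -> str:
    prompt = (
        "Based only on this interview (laboratory values removed), classify the "
        "patient's level of diabetes control as exactly one of: low, medium, high. "
        "Answer with the single word only.\n\n" + redact_a1c(transcript)
    )
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content.strip().lower()
```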

LLMs unlock SDOH insights from patient stories

Scientists have demonstrated how large language models (LLMs) can extract valuable social determinants of health (SDOH) information from unstructured patient narratives. Researchers collected interviews from 65 patients aged 65 and older with Type 2 Diabetes, focusing on their life experiences and diabetes management, and then analysed these using LLMs. The LLMs successfully translated these narratives into both concise qualitative summaries and structured quantitative SDOH ratings, enabling their use in conventional risk prediction workflows. This work establishes a proof of concept for integrating unstructured patient narratives into clinical practice via LLMs, potentially transforming how SDOH and qualitative data are used in healthcare to deliver more personalised care.

LLMs achieved 60% accuracy in predicting a patient’s level of diabetes control directly from interview text, suggesting a scalable approach to clinical risk models and decision-making. The authors acknowledge the study’s limitations, including a relatively small sample size of 65 participants, which impacted the predictive performance of machine learning models. They also note the potential for social desirability bias in self-reported narratives and suggest that incorporating longitudinal A1C data could offer a more meaningful comparison. Future research will focus on comparing LLM predictions and justifications with physician experts to validate accuracy and address limitations. The team also plans to explore locally hosted LLMs to improve accessibility and scalability, while considering the practical requirements of physicians for seamless integration into clinical workflows. Ultimately, this research aims to move beyond a proof of concept towards a clinically validated, scalable system that enhances diabetes management by incorporating crucial social and contextual factors into clinical workflows and risk prediction models.

👉 More information
🗞 Structured Insight from Unstructured Data: Large Language Models for SDOH-Driven Diabetes Risk Prediction
🧠 ArXiv: https://arxiv.org/abs/2601.13388

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.
