In-House AI Model Extracts Clinical Info from Radiology Reports with 84.6% Average F1 Score

Researchers at New York University and Carnegie Mellon University have developed an in-house Large Language Model (LLM) that extracts clinical concepts from breast ultrasound reports with an average F1 score of 84.6%, rivaling the performance of proprietary models such as GPT-4. The result demonstrates that healthcare institutions can build cost-effective, data-secure AI systems for clinical information extraction, keeping costs down and protected health information in-house rather than in the hands of an outside vendor.

Artificial intelligence (AI) has transformed many industries, and healthcare is no exception. One promising application is the extraction of clinical information from radiology reports, and the recent study by researchers at New York University and Carnegie Mellon University set out to build an in-house LLM for exactly this purpose.

The researchers used a proprietary LLM, GPT-4, to label a small subset of reports, and those labels were then used to fine-tune a Llama 3 8B model. When evaluated on a subset of reports annotated by clinicians, the fine-tuned model achieved an average F1 score of 0.846, comparable to GPT-4, demonstrating the feasibility of developing an in-house LLM for clinical concept extraction.
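The pipeline is simple in outline: have the proprietary model produce structured labels for a modest number of de-identified reports, then fine-tune the open-weight model on those labels so it can run entirely inside the institution. The sketch below illustrates only the first, label-generation stage, and it rests on assumptions: the OpenAI Python client, an invented prompt, and illustrative field names rather than the study's actual schema.

```python
# Minimal sketch of the weak-labeling stage: a proprietary LLM (GPT-4 here)
# converts free-text breast ultrasound reports into structured JSON that can
# later serve as fine-tuning data for an in-house Llama 3 8B model.
# Assumptions: the openai v1 Python client, de-identified report text, and
# illustrative field names; this is not the paper's actual prompt or schema.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

FIELDS = ["laterality", "lesion_size_mm", "birads_category"]  # illustrative only

PROMPT_TEMPLATE = (
    "Extract the following fields from the breast ultrasound report below. "
    "Respond with a JSON object whose keys are {fields}; use null when a "
    "field is not mentioned.\n\nReport:\n{report}"
)


def label_report(report_text: str) -> dict:
    """Ask the proprietary model for structured labels for one report."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[{
            "role": "user",
            "content": PROMPT_TEMPLATE.format(fields=FIELDS, report=report_text),
        }],
    )
    # The model is instructed to return bare JSON; a production pipeline
    # would validate this output before accepting it as a training label.
    return json.loads(response.choices[0].message.content)


def build_training_set(reports: list[str]) -> list[dict]:
    """Pair each report with its model-generated labels for fine-tuning."""
    return [{"report": r, "labels": label_report(r)} for r in reports]
```

The second stage, supervised fine-tuning of Llama 3 8B on these report and label pairs, can then run on hardware the institution controls, so protected health information never has to leave the building.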

AI models like GPT-4 have been shown to be effective at retrieving information from radiology reports. However, these models are often costly and raise data privacy concerns when handling protected health information. An in-house LLM offers a potential solution on both counts: it reduces costs and keeps sensitive data within the institution.

The study’s findings have significant implications for the healthcare industry, particularly in the context of breast ultrasound reports. Breast ultrasound plays a crucial role in detecting and diagnosing breast abnormalities, and radiology reports summarize key findings from these examinations. However, extracting critical information from these reports is challenging due to their unstructured nature and varied linguistic styles.

Extracting clinical information from radiology reports is a complex task. The reports are written as free text, with varied linguistic styles and inconsistent formatting, so the same finding may be phrased in many different ways. This variability makes it difficult for machines to reliably pull the relevant information out of every report.
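To make the target concrete, the example below pairs an invented report fragment with the kind of structured record an extraction model is expected to produce. The field names, the sample text, and the record class are illustrative assumptions, not the study's actual schema or data.

```python
# Illustration of the extraction target: mapping free-text report language
# onto a fixed set of fields. The report text and field names are invented
# for illustration and are not drawn from the study's dataset or schema.
from dataclasses import dataclass
from typing import Optional


@dataclass
class BreastUltrasoundRecord:
    laterality: Optional[str]        # e.g. "left", "right", "bilateral"
    lesion_size_mm: Optional[float]  # largest reported dimension
    birads_category: Optional[str]   # e.g. "BI-RADS 3"


SAMPLE_REPORT = (
    "Targeted ultrasound of the left breast demonstrates an oval, "
    "circumscribed hypoechoic mass measuring 8 mm at the 10 o'clock "
    "position. Assessment: BI-RADS 3, probably benign."
)

# The structured record the model should produce from the text above.
expected = BreastUltrasoundRecord(
    laterality="left",
    lesion_size_mm=8.0,
    birads_category="BI-RADS 3",
)
```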

Another challenge is the proprietary nature of many of the AI models used for this task. Relying on an external, closed model means recurring costs and, more importantly, handing protected health information to a third party, which is precisely the dependency an in-house LLM is designed to remove.

The study’s findings demonstrate that it is feasible to develop an in-house LLM that matches the performance of GPT-4 while reducing costs and keeping data under the institution’s control.

On the clinician-annotated evaluation subset, the fine-tuned model’s average F1 score of 0.846 puts it on par with GPT-4, despite being a much smaller model that can run on hardware the institution controls.
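For readers less familiar with the metric, F1 is the harmonic mean of precision and recall, and an average F1 across several extracted fields is typically computed per field and then averaged. The paper's exact aggregation is not spelled out here, so the snippet below is only a generic sketch of that style of evaluation against clinician annotations, with illustrative field handling.

```python
# Generic sketch of scoring extracted fields against clinician annotations
# with exact-match, per-field F1, then averaging across fields. The field
# handling and the aggregation are assumptions, not the paper's exact protocol.

def field_f1(predictions: list[dict], references: list[dict], field: str) -> float:
    """Exact-match F1 for one field over a set of reports."""
    tp = fp = fn = 0
    for pred, ref in zip(predictions, references):
        p, r = pred.get(field), ref.get(field)
        if p is not None and p == r:
            tp += 1          # correct extraction
        elif p is not None:
            fp += 1          # model produced a value that does not match
            if r is not None:
                fn += 1      # ...and missed the annotated value
        elif r is not None:
            fn += 1          # model missed a value the clinician annotated
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(predictions: list[dict], references: list[dict], fields: list[str]) -> float:
    """Average the per-field F1 scores, mirroring an 'average F1' metric."""
    return sum(field_f1(predictions, references, f) for f in fields) / len(fields)
```

Read this way, a score of 0.846 means the model's extractions agree closely, though not perfectly, with the clinicians' annotations across the evaluated fields.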

While the study’s findings are promising, it is worth noting that the proposed model was fine-tuned on a small labeled dataset created with the help of GPT-4, and the generalizability of the results to other datasets and scenarios remains to be seen.

The study’s findings also point to directions for further research. One is to examine how well the proposed model generalizes to other datasets and scenarios. Another is to investigate whether techniques such as transfer learning or multi-task learning could improve its performance; a rough sketch of what a multi-task setup might look like appears below.
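In this setting, multi-task learning usually means fine-tuning one model on a mixture of related extraction tasks rather than on breast ultrasound reports alone. The snippet below is a minimal sketch of assembling such a mixed instruction-tuning dataset; the second task (mammography reports), the prompts, and the record format are assumptions made for illustration, not part of the published study.

```python
# Hedged sketch of a multi-task instruction-tuning dataset: extraction
# examples for breast ultrasound reports are mixed with a second, related
# extraction task so a single fine-tuned model can serve both. Task prompts,
# the mammography task, and the record format are illustrative assumptions.
import json
import random


def to_example(task_prompt: str, report: str, labels: dict) -> dict:
    """Format one (report, labels) pair as an instruction-tuning example."""
    return {
        "instruction": task_prompt,
        "input": report,
        "output": json.dumps(labels),
    }


def build_multitask_set(bus_pairs, mammo_pairs, seed: int = 0) -> list[dict]:
    """Interleave two extraction tasks into a single shuffled training set."""
    examples = [
        to_example("Extract structured fields from this breast ultrasound report.", r, y)
        for r, y in bus_pairs
    ] + [
        to_example("Extract structured fields from this mammography report.", r, y)
        for r, y in mammo_pairs
    ]
    random.Random(seed).shuffle(examples)
    return examples
```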

Developing an in-house LLM offers a potential solution to the challenges associated with extracting clinical information from radiology reports. However, further research is needed to fully realize the benefits of this approach and to address any limitations that may arise.

Publication details: “BURExtract-Llama: An LLM for Clinical Concept Extraction in Breast Ultrasound Reports”
Publication Date: 2024-10-28
Authors: Yuxuan Chen, H. Yang, Hengkai Pan, Fardeen A. Siddiqui, et al.
DOI: https://doi.org/10.1145/3688868.3689200

The Quant

The Quant, with over two decades of experience in start-up ventures and financial arenas, brings a unique and insightful perspective to the quantum computing sector. This background combines the agility and innovation typical of start-up environments with the rigor and analytical depth required in finance, a blend of skills that is particularly valuable for navigating the complex, rapidly evolving landscape of quantum computing and quantum technology marketplaces. The quantum technology marketplace is burgeoning, with growth potential that extends well beyond the technology itself to applications across finance, healthcare, logistics, and more.
