AI Achieves 99% Accuracy in Hierarchical Classification of Benign Laryngeal Voice Disorders

Nearly one in five people experience benign laryngeal voice disorders, which often present as dysphonia and can signal underlying health issues; a new approach now promises more effective diagnosis. Mohsen Annabestani, Samira Aghadoost, and colleagues from Weill Cornell Medicine and Tehran University of Medical Sciences have developed an artificial intelligence system that accurately classifies these disorders using only recordings of sustained vowel sounds. The team constructed a hierarchical machine learning framework that first identifies pathological voices, then categorizes them broadly, and finally distinguishes between specific structural, inflammatory, and functional conditions, consistently exceeding the performance of existing AI models. This achievement demonstrates the potential of acoustic biomarkers as a scalable, non-invasive method for early detection, improved diagnostic workflows, and ongoing monitoring of vocal health.

Machine Learning Detects and Classifies Voice Disorders

This research details a machine learning framework for detecting and classifying voice disorders, addressing the need for objective tools to complement traditional, subjective assessments. The system analyzes acoustic features of speech to identify patterns indicative of pathology, employing algorithms including convolutional neural networks, extreme learning machines, and deep learning techniques. Experiments utilized 15,132 recordings from 1,261 speakers, demonstrating the potential of machine learning to accurately detect and classify voice disorders. Specific acoustic features, such as pitch variations, prove particularly important for accurate diagnosis, and the study reveals that continuous speech provides valuable information for improving model performance. While various algorithms show promise, the research emphasizes that no single approach is universally optimal, and reproducibility remains a key challenge for the field. The ultimate goal is to create tools that assist clinicians, not replace them, in areas like early screening, quality of life assessment, and analysis of voice changes related to conditions like COVID-19.
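To make the role of pitch-based features concrete, here is a minimal, illustrative sketch (not the authors' implementation) of estimating the fundamental frequency of a sustained vowel with a simple autocorrelation method in pure NumPy. The sampling rate, frequency bounds, and synthetic test tone are all assumptions chosen for the example.

```python
import numpy as np

def estimate_f0(signal, sr, fmin=75.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) of a voiced signal
    by finding the strongest autocorrelation peak in a plausible
    pitch range for adult speech."""
    signal = signal - signal.mean()
    # Autocorrelation for non-negative lags only.
    corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    lag_min = int(sr / fmax)   # shortest lag (highest pitch) considered
    lag_max = int(sr / fmin)   # longest lag (lowest pitch) considered
    lag = lag_min + np.argmax(corr[lag_min:lag_max])
    return sr / lag

# Synthetic 150 Hz "sustained vowel": a pure tone plus mild noise.
sr = 16000
rng = np.random.default_rng(0)
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 150.0 * t) + 0.05 * rng.standard_normal(sr)
f0 = estimate_f0(tone, sr)
print(round(f0, 1))  # close to 150.0
```

In practice, cycle-to-cycle variation in this estimate over successive frames is what perturbation measures such as jitter quantify; production systems use far more robust pitch trackers than this sketch.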

Machine Learning Diagnoses Benign Voice Disorders

This work presents a novel machine learning framework for classifying benign laryngeal voice disorders, a group of conditions affecting nearly one in five individuals. Scientists developed a hierarchical system that automatically categorizes eight distinct voice disorder types, alongside healthy controls, using acoustic features extracted from sustained vowel sounds. The framework operates in three stages that mirror clinical workflows: it begins with a screening stage that differentiates pathological from non-pathological voices by integrating convolutional neural network analysis with 21 interpretable acoustic biomarkers. Subsequent stages stratify voices into broader groups and refine the classification, measurably improving the separation of structural and inflammatory disorders from functional conditions. The system consistently outperforms standard multi-class classifiers and pre-trained speech models, achieving high accuracy by combining deep spectral representations with interpretable acoustic features. The result is a scalable, non-invasive tool with potential for early screening, diagnostic triage, and ongoing monitoring of vocal health, representing a significant advance in digital health technology.
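The three-stage routing described above can be sketched as a simple classifier cascade. The feature names (jitter, shimmer, HNR), thresholds, and disorder labels below are illustrative placeholders, not the paper's actual classifiers, which combine CNN spectral representations with acoustic biomarkers at each stage.

```python
from typing import Callable, Dict

FeatureVec = Dict[str, float]

def classify_hierarchical(
    x: FeatureVec,
    screen: Callable[[FeatureVec], bool],
    broad_group: Callable[[FeatureVec], str],
    fine: Dict[str, Callable[[FeatureVec], str]],
) -> str:
    """Route a voice sample through a three-stage diagnostic cascade."""
    if not screen(x):              # stage 1: pathological vs. healthy
        return "healthy"
    group = broad_group(x)         # stage 2: broad disorder group
    return fine[group](x)          # stage 3: group-specific diagnosis

# Toy stage classifiers with made-up thresholds on made-up features.
screen = lambda x: x["jitter"] > 0.01
broad_group = lambda x: "structural" if x["hnr"] < 15.0 else "functional"
fine = {
    "structural": lambda x: "polyp" if x["shimmer"] > 0.06 else "nodule",
    "functional": lambda x: "muscle_tension_dysphonia",
}

sample = {"jitter": 0.02, "hnr": 12.0, "shimmer": 0.08}
print(classify_hierarchical(sample, screen, broad_group, fine))  # → polyp
```

The design choice the cascade illustrates is that each stage only has to solve an easier sub-problem on the samples routed to it, mirroring how a clinician narrows a differential diagnosis step by step.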

Deep Learning Diagnoses Laryngeal Voice Disorders

This research presents a novel machine learning framework for classifying benign laryngeal voice disorders, conditions affecting a significant portion of the population. The team developed a system that mimics clinical practice: it first identifies pathological voices, then categorizes them by broad type, and finally diagnoses specific disorders. By integrating deep learning analysis of voice recordings with established acoustic biomarkers, the framework achieves high accuracy in distinguishing between conditions, notably improving the identification of structural and inflammatory disorders. Its strong diagnostic performance indicates that even short, sustained vowel sounds contain substantial and reliable information for identifying diverse laryngeal conditions. Importantly, the framework surpasses more general speech and audio models, highlighting the value of tailoring analytical methods to the specific characteristics of clinical phonation. This scalable and non-invasive approach holds potential for widespread application in early screening, clinical assessment, and ongoing monitoring of vocal health, ultimately helping to reduce the burden of dysphonia.

👉 More information
🗞 AI-Driven Acoustic Voice Biomarker-Based Hierarchical Classification of Benign Laryngeal Voice Disorders from Sustained Vowels
🧠 ArXiv: https://arxiv.org/abs/2512.24628

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.

Latest Posts by Rohail T.:

Non-euclidean Interfaces Enable Exploration of Infinite Graphene Lattice Orientations

January 8, 2026
Bayesian Transformers Achieve Diverse Intelligence with Sampling from a Single Model

January 8, 2026
Diffusion Language Models Achieve Optimal Parallel Sampling with Polynomial-Length Chains

January 8, 2026