Hybrid Vision Models Achieve 85.69% Accuracy in PCOS Detection from Ultrasound

Polycystic Ovary Syndrome (PCOS) is a significant and increasingly prevalent endocrine disorder affecting women globally, with a notable impact in Bangladesh. Md Mahmudul Hoque (CCN University of Science & Technology), Md Mehedi Hassain (International Islamic University Chittagong), Muntakimur Rahaman (Multimedia University), and colleagues present a hybrid approach that uses vision models to improve the accuracy of PCOS detection from ultrasound scans. Their work addresses the need for efficient diagnostic tools, reporting that combining convolutional and Transformer-based networks, specifically DenseNet121, Swin Transformer, ConvNeXt, ResNet18, and EfficientNetV2, achieves 98.23% accuracy with their optimised ‘DenConREST’ model. According to the authors, this could substantially reduce diagnostic errors and improve early detection rates for PCOS.

The study organised training and testing data into ‘infected’ (PCOS-positive) and ‘noninfected’ (healthy ovaries) categories, enabling a focused evaluation of model performance. The optimised DenConREST architecture achieved 98.23% accuracy in PCOS detection, the highest of all evaluated models. The authors attribute this improvement to combining multiple deep learning architectures for complex medical image analysis. The research offers an efficient approach to PCOS detection from ultrasound images, with the potential to reduce diagnostic errors and improve patient outcomes.

The authors attribute DenConREST’s performance to its ability to capture both local features and long-range dependencies within the ultrasound images. By integrating the strengths of convolutional neural networks (CNNs), which are adept at local feature extraction, with the global contextual understanding provided by Transformer models, the hybrid approach offers a more comprehensive analysis of ovarian structures. Previous studies using CNNs have achieved accuracies around 85%, while other approaches, such as the ITL-CNN architecture, reached nearly 98%; the authors report that DenConREST exceeds both. A reduced window size of 6 within the Swin Transformer architecture is reported to further sharpen the model’s focus on critical tissue characteristics.
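To make the idea of combining several backbones concrete, here is a minimal PyTorch sketch. The article does not say how the backbones' outputs are merged, so feature concatenation followed by a single linear layer is an assumption, and `FusionClassifier` and `tiny_backbone` are hypothetical names. The tiny convolutional stand-ins take the place of real feature extractors such as DenseNet121 or Swin Transformer so that the example stays small.

```python
import torch
from torch import nn


class FusionClassifier(nn.Module):
    """Concatenate feature vectors from several backbones, then classify.

    Concatenation is an assumed fusion strategy, not the authors' confirmed design.
    """

    def __init__(self, backbones, feature_dims, num_classes=2):
        super().__init__()
        self.backbones = nn.ModuleList(backbones)
        self.head = nn.Linear(sum(feature_dims), num_classes)

    def forward(self, images):
        # Each backbone maps (N, 3, H, W) images to an (N, D_i) feature vector.
        features = [backbone(images) for backbone in self.backbones]
        # Join along the feature axis to get (N, sum of D_i), then classify.
        return self.head(torch.cat(features, dim=1))


def tiny_backbone(out_dim):
    """Hypothetical stand-in for a real pretrained feature extractor."""
    return nn.Sequential(
        nn.Conv2d(3, out_dim, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),  # global average pool to (N, out_dim, 1, 1)
        nn.Flatten(),             # flatten to (N, out_dim)
    )


model = FusionClassifier([tiny_backbone(8), tiny_backbone(16)], feature_dims=[8, 16])
logits = model(torch.randn(4, 3, 224, 224))  # one row of two class scores per image
```

In a fuller version, each stand-in would be replaced by a pretrained network with its classification layer removed, and `feature_dims` would list the resulting feature sizes.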

Furthermore, the research builds upon recent advances in hybrid CNN-Transformer architectures, demonstrating their potential to improve diagnostic accuracy and feature representation in medical imaging. The team’s methodology included a crucial preprocessing step to remove corrupted images from the dataset, ensuring data reliability and preventing errors during model training. Utilising a dataset collected from Kaggle, comprising 781 infected and 1,143 noninfected training images, alongside 787 infected and 1,145 noninfected testing images, the study provides a robust foundation for future research and clinical applications. This work opens avenues for developing automated diagnostic tools that can assist clinicians in early and accurate PCOS detection, ultimately improving healthcare for women worldwide.
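The class counts above put the accuracy figures in context. The short sketch below uses only the image counts quoted in the article; the helper names are illustrative. It computes the size of each split and the accuracy of a trivial classifier that always predicts the larger class.

```python
# Image counts reported in the article.
TRAIN = {"infected": 781, "noninfected": 1143}
TEST = {"infected": 787, "noninfected": 1145}


def class_shares(counts):
    """Return each class's fraction of the split."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def majority_baseline(counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(counts.values()) / sum(counts.values())


train_total = sum(TRAIN.values())
test_total = sum(TEST.values())
baseline = majority_baseline(TEST)
```

On these numbers the training split holds 1,924 images and the testing split 1,932. A classifier that always answered ‘noninfected’ would score about 59% on the test split, which is the floor against which the reported accuracies should be read.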

Ultrasound Image Dataset Preparation and Preprocessing

The researchers assembled a dataset of 781 infected and 1,143 noninfected ultrasound images for training, supplemented by 787 infected and 1,145 noninfected images for testing, all sourced from Kaggle. To ensure data integrity, the team used the Python Pillow (PIL) library to recursively scan and verify each image, removing any corrupted or unreadable files before model training began. All input images were resized to a uniform 224×224 pixels, preserving the aspect ratio while preparing them for neural network input. The images were then converted into PyTorch tensors, moving from the PIL image layout (height×width×channels) to the tensor layout (channels×height×width), with pixel values scaled to a range of 0 to 1.
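The corrupted-image scan described above might look roughly like the sketch below. It relies on Pillow's `Image.open` and `Image.verify`; the function names, the extension list, and the choice to report bad files rather than delete them are all assumptions, since the article does not give the authors' code.

```python
import os
from typing import Callable, List

# Assumed set of image extensions; the article does not list the file types.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def pillow_is_readable(path: str) -> bool:
    """Return True if Pillow can open the file and its integrity check passes."""
    from PIL import Image  # imported here so the directory walk itself needs only the stdlib

    try:
        with Image.open(path) as img:
            img.verify()  # raises an exception when the file is broken
        return True
    except Exception:
        # Pillow raises several exception types for broken files; treat all as unreadable.
        return False


def find_corrupted(
    root: str, is_readable: Callable[[str], bool] = pillow_is_readable
) -> List[str]:
    """Walk `root` recursively and return the image paths that fail the check."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                path = os.path.join(dirpath, name)
                if not is_readable(path):
                    bad.append(path)
    return sorted(bad)
```

Calling `find_corrupted("data/train")` (a hypothetical path) would return the list of unreadable files, which could then be reviewed or passed to `os.remove`. Keeping the check as a parameter makes the scan easy to exercise without real image files.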

The team then normalised the data using ImageNet statistics, specifically per-channel means of 0.485, 0.456, and 0.406 and standard deviations of 0.229, 0.224, and 0.225, to centre the data around zero and improve model performance. The researchers then developed two hybrid models, ‘DenConST’ and ‘DenConREST’, combining convolutional and Transformer-based approaches for PCOS detection; these reached 85.69% and 98.23% accuracy, respectively. The optimised model, DenConREST, outperformed all other evaluated models, making it the most effective of the tested approaches for PCOS detection from ultrasound images.
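The layout change, scaling, and normalisation steps can be illustrated without any deep learning library. The sketch below applies them to a nested list standing in for a small RGB image; the function name is hypothetical. In PyTorch pipelines these steps are commonly handled by torchvision's `ToTensor` and `Normalize` transforms, though whether the authors used those is an assumption.

```python
# ImageNet per-channel statistics quoted in the article (R, G, B).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def to_normalized_chw(image_hwc):
    """Convert an H x W x 3 nested list of 8-bit values to a normalised 3 x H x W list.

    Each value is scaled to [0, 1], then standardised with its channel's mean and std.
    """
    height = len(image_hwc)
    width = len(image_hwc[0])
    return [
        [
            [
                (image_hwc[y][x][c] / 255.0 - IMAGENET_MEAN[c]) / IMAGENET_STD[c]
                for x in range(width)
            ]
            for y in range(height)
        ]
        for c in range(3)
    ]


# A 1 x 2 "image": one pure-red pixel and one pure-green pixel.
sample = [[[255, 0, 0], [0, 255, 0]]]
normalized = to_normalized_chw(sample)
```

After the call, `normalized[0]` holds the red channel for every pixel, `normalized[1]` the green channel, and so on. A fully saturated red value maps to (1 - 0.485) / 0.229, a little above 2, while a zero maps to a negative value, which is what centring around zero means in practice.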

DenConREST showed the highest performance of all evaluated models. The team used accuracy as the primary metric, quantifying each model’s ability to correctly classify ultrasound images as either PCOS-positive or healthy. According to the authors, the models analyse ovarian ultrasound images effectively, identifying key morphological features indicative of the syndrome, and could improve diagnostic accuracy and reduce detection errors in PCOS screening.

The ultrasound scans show a clear visual distinction between infected and noninfected ovaries, with infected ovaries exhibiting characteristic follicular patterns. The authors report improved diagnostic accuracy and fewer detection errors compared with existing methods. The high recall of 99.9%, alongside a 98.4% F1-score, indicates the model’s ability to correctly identify nearly all PCOS cases, which is crucial for timely diagnosis and intervention.
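The article quotes recall and F1-score but not precision. Because F1 is the harmonic mean of precision and recall, the missing figure can be estimated by rearranging the formula. The sketch below does this under the assumption that both reported numbers describe the PCOS-positive class on the same evaluation set; the result is a derived estimate, not a value reported by the authors.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    return f1 * recall / (2 * recall - f1)


# Figures quoted in the article.
reported_recall = 0.999
reported_f1 = 0.984

precision_estimate = implied_precision(reported_f1, reported_recall)
```

With these inputs the implied precision comes out at roughly 0.97, which would mean that a small share of images flagged as PCOS-positive are false alarms even though almost no true cases are missed.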

This work addresses the growing need for efficient PCOS detection, particularly in healthcare settings with limited resources and a shortage of imaging specialists. The authors acknowledge the need for further validation through multicenter studies and the potential benefit of incorporating hormonal biomarkers to create a multimodal diagnostic approach. Future research will focus on expanding the dataset and refining the model for broader clinical applicability.

👉 More information
🗞 Vision Models for Medical Imaging: A Hybrid Approach for PCOS Detection from Ultrasound Scans
🧠 ArXiv: https://arxiv.org/abs/2601.15119

Rohail T.

