Qc-kmeans: Quantum Compressive K-Means Achieves Constant-Size Scaling with up to 9 Real Datasets and Mean-Squared Error for 1

Clustering large datasets presents a significant challenge for current and near-term quantum computers, which struggle with both data input and limited qubit availability. To address this, Pedro Chumpitaz-Flores from University of South Florida, My Duong, and Ying Mao from Fordham University, alongside Kaixun Hua from University of South Florida, developed qc-kmeans, a new hybrid quantum algorithm. This method efficiently summarises data using a compact Fourier-feature sketch, reducing the need to load entire datasets onto a quantum processor, and then identifies cluster centres by solving simplified optimisation problems with shallow quantum circuits. The team demonstrates that qc-kmeans maintains competitive clustering accuracy in simulations, even with added noise, while crucially keeping the number of qubits required constant regardless of dataset size, representing a promising step towards practical quantum machine learning on near-term devices.

This approach achieves unbiased estimation with a mean-squared error of O(ε2) for specific parameters, and crucially, the maximum qubit requirement does not increase with the size of the dataset. A refinement process incorporating data retention ensures consistent improvement throughout the process. Simulations using standard quantum computing software, with minimal circuit depth, demonstrate the method’s performance on low-dimensional synthetic data, achieving competitive results compared to quantum baselines. The method was further tested on nine real-world datasets, containing up to 4. 3 × 105 data points.

Hybrid Quantum-Classical K-Means Clustering for NISQ Devices

This research addresses the challenge of applying quantum machine learning, specifically data clustering, to current and near-term quantum computers (NISQ devices). These computers are limited by qubit counts, coherence times, and susceptibility to errors, making many quantum algorithms impractical. The authors introduce qc-kmeans, a new hybrid quantum-classical algorithm designed for this purpose. It combines classical and quantum computation, leveraging the strengths of each. The algorithm uses a Fourier feature sketch to compress the dataset into a fixed-size representation, decoupling the quantum register width from the number of data points.

This allows the algorithm to handle large datasets without requiring a large number of qubits. The core quantum computation uses a shallow Quantum Approximate Optimization Algorithm (QAOA) with a specific type of interaction to select optimal cluster centers. A key advantage is the constant quantum register width, regardless of dataset size, a significant improvement over many other quantum clustering algorithms.

K-Means Clustering Performance, Quantum Versus Classical

This study investigated the performance of a novel quantum-classical clustering algorithm (QC-KMEANS) against traditional k-means and classical clustering methods across diverse datasets, including HEMI and URBANGB. Results demonstrate that QC-KMEANS consistently achieves competitive or superior clustering accuracy, as measured by Sum of Squared Errors (SSE). Specifically, QC-KMEANS reduced SSE by an average of 15-25% on several datasets. The QC-KMEANS_NOISE results showed higher SSE values, indicating sensitivity to noise. These findings suggest that quantum-inspired algorithms like QC-KMEANS have the potential to improve clustering performance, but are not a universal solution.

Fourier Feature Sketch Decouples Quantum Clustering

The team introduced qc-kmeans, a new hybrid quantum approach to data clustering designed for implementation on near-term quantum computers. This method addresses limitations imposed by restricted qubit numbers and circuit depths by compressing the dataset into a fixed-size Fourier-feature sketch. Subsequent cluster center selection relies on a shallow quantum approximate optimization algorithm, effectively decoupling the quantum resource requirements from the size of the input data. Across both synthetic and real-world datasets, the method achieves clustering quality comparable to classical algorithms while maintaining a constant qubit width and shallow circuit depth.

Importantly, the researchers acknowledge that qc-kmeans does not currently offer a speed advantage over established classical algorithms. The primary value of this work lies in its feasibility for near-term quantum hardware, where qubit counts and circuit depths are limited. The team also demonstrated robustness against noise through simulations, suggesting potential for practical application as quantum technology advances.

👉 More information
🗞 qc-kmeans: A Quantum Compressive K-Means Algorithm for NISQ Devices
🧠 ArXiv: https://arxiv.org/abs/2510.22540

Rohail T.

Rohail T.

As a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.

Latest Posts by Rohail T.:

Control Methods Gain Stability Against Hardware Errors with New Optimisation Technique

Mathematical Analysis Confirms a Long-Standing Conjecture About Special Function Values

February 14, 2026
Quantum Architecture Shrinks Computing Needs to under 100 000 Qubits

Machine Learning Now Personalises Treatment Effects from Complex, Continuous Data

February 14, 2026
Researchers Develop Systems Equating 2 Diagram Classes with Verified Term Rewriting Rules

Researchers Develop Systems Equating 2 Diagram Classes with Verified Term Rewriting Rules

February 14, 2026