Clustering large datasets presents a significant challenge for current and near-term quantum computers, which struggle with both data input and limited qubit availability. To address this, Pedro Chumpitaz-Flores from University of South Florida, My Duong, and Ying Mao from Fordham University, alongside Kaixun Hua from University of South Florida, developed qc-kmeans, a new hybrid quantum algorithm. This method efficiently summarises data using a compact Fourier-feature sketch, reducing the need to load entire datasets onto a quantum processor, and then identifies cluster centres by solving simplified optimisation problems with shallow quantum circuits. The team demonstrates that qc-kmeans maintains competitive clustering accuracy in simulations, even with added noise, while crucially keeping the number of qubits required constant regardless of dataset size, representing a promising step towards practical quantum machine learning on near-term devices.

This approach achieves unbiased estimation with a mean-squared error of O(ε2) for specific parameters, and crucially, the maximum qubit requirement does not increase with the size of the dataset. A refinement process incorporating data retention ensures consistent improvement throughout the process. Simulations using standard quantum computing software, with minimal circuit depth, demonstrate the method’s performance on low-dimensional synthetic data, achieving competitive results compared to quantum baselines. The method was further tested on nine real-world datasets, containing up to 4. 3 × 105 data points.

Hybrid Quantum-Classical K-Means Clustering for NISQ Devices

This research addresses the challenge of applying quantum machine learning, specifically data clustering, to current and near-term quantum computers (NISQ devices). These computers are limited by qubit counts, coherence times, and susceptibility to errors, making many quantum algorithms impractical. The authors introduce qc-kmeans, a new hybrid quantum-classical algorithm designed for this purpose. It combines classical and quantum computation, leveraging the strengths of each. The algorithm uses a Fourier feature sketch to compress the dataset into a fixed-size representation, decoupling the quantum register width from the number of data points.

This allows the algorithm to handle large datasets without requiring a large number of qubits. The core quantum computation uses a shallow Quantum Approximate Optimization Algorithm (QAOA) with a specific type of interaction to select optimal cluster centers. A key advantage is the constant quantum register width, regardless of dataset size, a significant improvement over many other quantum clustering algorithms.

K-Means Clustering Performance, Quantum Versus Classical

This study investigated the performance of a novel quantum-classical clustering algorithm (QC-KMEANS) against traditional k-means and classical clustering methods across diverse datasets, including HEMI and URBANGB. Results demonstrate that QC-KMEANS consistently achieves competitive or superior clustering accuracy, as measured by Sum of Squared Errors (SSE). Specifically, QC-KMEANS reduced SSE by an average of 15-25% on several datasets. The QC-KMEANS_NOISE results showed higher SSE values, indicating sensitivity to noise. These findings suggest that quantum-inspired algorithms like QC-KMEANS have the potential to improve clustering performance, but are not a universal solution.

Fourier Feature Sketch Decouples Quantum Clustering

The team introduced qc-kmeans, a new hybrid quantum approach to data clustering designed for implementation on near-term quantum computers. This method addresses limitations imposed by restricted qubit numbers and circuit depths by compressing the dataset into a fixed-size Fourier-feature sketch. Subsequent cluster center selection relies on a shallow quantum approximate optimization algorithm, effectively decoupling the quantum resource requirements from the size of the input data. Across both synthetic and real-world datasets, the method achieves clustering quality comparable to classical algorithms while maintaining a constant qubit width and shallow circuit depth.

Importantly, the researchers acknowledge that qc-kmeans does not currently offer a speed advantage over established classical algorithms. The primary value of this work lies in its feasibility for near-term quantum hardware, where qubit counts and circuit depths are limited. The team also demonstrated robustness against noise through simulations, suggesting potential for practical application as quantum technology advances.

👉 More information
🗞 qc-kmeans: A Quantum Compressive K-Means Algorithm for NISQ Devices
🧠 ArXiv: https://arxiv.org/abs/2510.22540

Tags:

Aer compressive-means elitist retention Fourier-feature sketch qc-kmeans surrogate cost

Qc-kmeans: Quantum Compressive K-Means Achieves Constant-Size Scaling with up to 9 Real Datasets and Mean-Squared Error for 1

Hybrid Quantum-Classical K-Means Clustering for NISQ Devices

K-Means Clustering Performance, Quantum Versus Classical

Fourier Feature Sketch Decouples Quantum Clustering

Rohail T.

Latest Posts by Rohail T.:

Mathematical Analysis Confirms a Long-Standing Conjecture About Special Function Values

Machine Learning Now Personalises Treatment Effects from Complex, Continuous Data

Researchers Develop Systems Equating 2 Diagram Classes with Verified Term Rewriting Rules