Sang Hyub Kim and colleagues at IonQ Inc. have shown that parity features, classical representations derived from binary or quantized inputs, can enhance machine learning models when dealing with complex data interactions. Learning optimal parity bases and encodings through hybrid quantum-classical training pipelines sharply improves performance across diverse benchmarks. Using systems with 5-10 qubits, the method achieves accuracy gains of up to 41.7% on native-binary parity tasks and recovers information lost on continuous text benchmarks, exceeding full continuous baselines on several datasets. Moreover, the team’s sPQC-Parity method delivers substantial improvements on encoding-limited discrete datasets, showcasing the potential for strong inference even from quantized data.
Quantum parity features unlock superior performance across diverse datasets
The sPQC-Parity model achieved a 94.6% improvement in accuracy on the mushroom dataset, surpassing a standard PCA-bin baseline. This level of performance was previously unattainable with methods reliant solely on classical feature extraction. The gain shows that parity features can reveal information hidden within quantized data, enabling strong inference even when datasets are simplified or incomplete. The significance of this result lies in the ability to extract meaningful signal from data that has been deliberately reduced in precision, a common scenario in resource-constrained environments or with inherently noisy data sources. Traditional machine-learning algorithms often struggle with such data, losing crucial information during quantization. Parity features, however, appear to be more resilient, preserving and even amplifying subtle patterns.
The team’s hybrid quantum-classical approach not only enhances performance on discrete datasets but also recovers information lost in binarizing continuous text benchmarks, exceeding full continuous baselines on datasets including CR, SST-2, and SST-5. Learned parity bases improved accuracy on native-binary, high-order parity tasks by between 23.9% and 41.7% over logistic-regression and support-vector-machine baselines. Against a PCA-bin baseline, the sPQC-Parity model delivered improvements of 94.6% on the mushroom dataset and 3.0% on the splice dataset, and matched the baseline on the promoter dataset. However, these gains currently rely on carefully constructed training pipelines and do not yet translate to strong performance on genuinely noisy, real-world data lacking the controlled conditions of these benchmarks.

Effective basis discovery, pinpointing relevant parity words in a vast combinatorial space, was the primary driver of this improvement; quantum moment computation played a secondary role. Basis discovery means identifying which combinations of input bits are most informative for the task at hand, a computationally challenging problem because the number of possible parity words grows exponentially with the number of input bits. The researchers employed a hybrid approach, using the quantum computer to efficiently estimate certain statistical moments needed for basis selection while performing the majority of the computation classically. The team’s learned projection encoding, meanwhile, recovered much of the information lost when continuous text embeddings are binarized, by mapping the embeddings into a binary space in a way that preserves the most informative structure, as determined by the learned parity basis.
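The paper’s exact selection criterion is not detailed here, but the flavour of basis discovery can be sketched with a purely classical scan that scores low-order parity words by how strongly their features correlate with the labels. The function names and the correlation-based scoring rule are illustrative assumptions, not the authors’ method:

```python
import itertools
import numpy as np

def parity_feature(X, bits, signs):
    """Evaluate a signed parity feature: the product over the selected
    bits of (+1/-1)-encoded inputs, each multiplied by its sign."""
    Z = 1 - 2 * X[:, bits]           # map {0,1} bits to {+1,-1}
    return np.prod(Z * signs, axis=1)

def score_parity_words(X, y, max_order=3):
    """Score every parity word up to max_order by the absolute
    correlation of its feature with the (+1/-1)-encoded labels."""
    n_bits = X.shape[1]
    t = 1 - 2 * y                    # labels in {+1, -1}
    scores = {}
    for order in range(1, max_order + 1):
        for bits in itertools.combinations(range(n_bits), order):
            f = parity_feature(X, list(bits), np.ones(order))
            scores[bits] = abs(np.mean(f * t))
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy task: the label is the parity (XOR) of bits 0 and 2.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 5))
y = X[:, 0] ^ X[:, 2]
best = score_parity_words(X, y)[0][0]
print(best)  # the highest-scoring word should be (0, 2)
```

Even this brute-force toy makes the combinatorial pressure visible: the number of candidate words grows exponentially with input width, which is why an efficient discovery procedure matters.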
Performance on the promoter dataset matched a classical baseline, suggesting the benefits of this learned parity basis are not universal and depend heavily on the specific characteristics of the data. The promoter dataset, being a relatively simple classification task, may not require the complex feature interactions that parity features excel at capturing.
Learned parity features demonstrate promise but exhibit data-dependent performance gains
Identifying the right features is crucial for extracting meaningful patterns from data, and this work offers a new way to do so through ‘parity features’: combinations of binary inputs that reveal complex relationships. A parity feature is defined mathematically as the product of selected bits of a binary input, with each bit contributing either a positive or negative sign; this signed product captures the interaction between those specific bits. Once the participating bits and their signs are fixed, constituting the ‘parity word’, evaluating the feature requires only classical computation, since it amounts to nothing more than bitwise multiplication and summation. While not a universal solution, the substantial gains observed across several datasets, including mushroom and splice, demonstrate the technique’s promise. Because established parity features can be evaluated efficiently on classical hardware, inference can proceed without quantum resources after a quantum-assisted training phase, a key advantage that could enable deployment of quantum-inspired machine-learning models on conventional hardware.
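As a minimal illustration, a single parity feature can be evaluated classically in a few lines. The function name and the bit encoding below (bit 0 ↦ +1, bit 1 ↦ −1) are illustrative assumptions, not the paper’s exact conventions:

```python
def parity_feature(bits, word, signs):
    """Classically evaluate a signed parity feature: the product of the
    selected bits, each mapped to +/-1 and multiplied by its sign."""
    value = 1
    for i, s in zip(word, signs):
        value *= s * (1 - 2 * bits[i])  # bit 0 -> +1, bit 1 -> -1
    return value

# Feature over bits 1 and 3 with signs (+1, -1):
x = [0, 1, 1, 0, 1]
print(parity_feature(x, word=(1, 3), signs=(+1, -1)))  # -> 1
```

Flipping any single participating bit flips the sign of the result, which is exactly the parity behaviour the name suggests.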
A pathway for enhancing machine learning through parity features, combinations of input data points that reveal complex relationships, is now established. Hybrid quantum-classical training enabled the scientists to discover and encode these features, improving performance on both binary and continuous datasets and allowing models to better represent interactions within data. Training iteratively refines the parity basis and the projection encoding to minimise prediction error on the training data, typically via gradient descent, a standard optimisation algorithm in machine learning. Crucially, the resulting models can perform inference entirely on classical hardware, termed ‘shadow deployment’, suggesting a viable route for integrating quantum-inspired techniques without requiring widespread quantum computers. This achievement prompts further investigation into adapting learned parity bases to new datasets and assessing robustness on genuinely noisy, real-world data. Future research will likely focus on more robust training pipelines for noisy data and on methods for transferring learned parity bases between datasets; investigating the theoretical limits of parity-feature representation and comparing it with other feature-extraction techniques will also be crucial for establishing long-term viability.
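As a classical stand-in for this training-and-deployment loop, the sketch below fits logistic-regression weights over a fixed parity basis by plain gradient descent, with inference running entirely classically. The hand-picked basis and the whole pipeline are simplifying assumptions; the paper’s hybrid procedure also learns the basis itself:

```python
import numpy as np

def parity_features(X, words):
    """Map binary inputs to their +/-1 parity features for a fixed basis."""
    Z = 1 - 2 * X  # {0,1} -> {+1,-1}
    return np.stack([np.prod(Z[:, list(w)], axis=1) for w in words], axis=1)

def train(X, y, words, lr=0.5, steps=500):
    """Fit logistic-regression weights over the parity features by
    gradient descent (a classical stand-in for hybrid training)."""
    F = parity_features(X, words)
    w = np.zeros(F.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-F @ w))      # predicted probabilities
        w -= lr * F.T @ (p - y) / len(y)  # logistic-loss gradient
    return w

def predict(X, words, w):
    """'Shadow deployment': inference needs no quantum hardware."""
    return (parity_features(X, words) @ w > 0).astype(int)

# Toy task: label = parity of bits 0 and 2; the basis contains the true word.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(300, 4))
y = X[:, 0] ^ X[:, 2]
words = [(0,), (1,), (0, 2), (2, 3)]
w = train(X, y, words)
acc = (predict(X, words, w) == y).mean()
print(acc)  # the task is separable over this basis, so accuracy should be high
```

The key point the sketch captures is the split of roles: once the basis and weights are learned, `predict` touches nothing but bitwise products and a dot product.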
The research successfully demonstrated a method for enhancing machine learning models using parity features, which are signed products of binary inputs. This approach improves model performance on both 5-10 qubit binary tasks and continuous text benchmarks by discovering optimal parity bases during a hybrid quantum-classical training process. Importantly, these models can then perform inference entirely on classical computers, offering a potential pathway for utilising quantum-inspired machine learning without requiring quantum hardware. The authors intend to explore adapting these learned parity bases to new datasets and assessing their performance with noisy data.
👉 More information
🗞 Quantum Parity Representations: Learnable Basis Discovery, Encoders, and Shadow Deployment
🧠 ArXiv: https://arxiv.org/abs/2605.11213
