Quantum Circuits Synthesise Realistic Fraud Data to Beat Detection Bias

Adam Innan and colleagues at Hassan II University of Casablanca, in collaboration with NYUAD Research Institute and New York University, have developed Q-SYNTH, a new hybrid quantum-classical framework for generating synthetic fraudulent data and improving detection rates. The framework addresses the rarity of fraudulent transactions, a problem that biases conventional machine learning algorithms. Q-SYNTH explores the potential of combining quantum computing with classical techniques to address challenges posed by imbalanced datasets, exhibiting a favourable balance between statistical similarity to real fraud and overall detection performance. It utilises a quantum circuit as a generator and a classical neural network as a discriminator, offering a flexible avenue for augmenting tabular data and enhancing fraud-class recall and F1-score. The increasing prevalence of digital transactions necessitates robust fraud detection systems, and the inherent difficulties in training effective models with imbalanced data have driven research into innovative solutions like Q-SYNTH.

Quantum-classical synthesis surpasses GANs in modelling complex fraud distributions

Q-SYNTH improved the Kolmogorov-Smirnov (KS) statistic by 35% compared to a classical Generative Adversarial Network (GAN) baseline. This reduction reaches a level of similarity that the standard GAN baseline did not attain, as it struggled to model complex fraud distributions accurately. The KS test is non-parametric: its statistic is the maximum distance between the cumulative distribution functions of two samples, so a lower value means the two samples are distributed more alike, and it is particularly sensitive to differences in data distribution. A 35% improvement therefore suggests a substantial reduction in the mismatch between the statistical properties of synthetic and real fraud data, enabling more reliable data augmentation and potentially leading to more generalisable fraud detection models. Classical GANs often suffer from mode collapse, where the generator produces a limited variety of synthetic samples and fails to capture the full complexity of the real fraud distribution. Q-SYNTH’s quantum component appears to mitigate this issue, allowing for a more diverse and representative synthetic dataset.
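To make the KS statistic concrete, the following is a minimal sketch, not the evaluation code used in the study. It computes the largest gap between two empirical cumulative distribution functions using only the Python standard library. The function name `ks_statistic` and the Gaussian stand-in samples are illustrative assumptions; in practice, `scipy.stats.ks_2samp` returns both the statistic and a p-value.

```python
import bisect
import random


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    # Empirical CDFs are step functions, so the largest gap occurs at an observed value.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical stand-ins for one feature of real and synthetic fraud records.
rng = random.Random(0)
real = [rng.gauss(0.0, 1.0) for _ in range(500)]
close = [rng.gauss(0.0, 1.0) for _ in range(500)]    # same distribution as `real`
shifted = [rng.gauss(1.0, 1.0) for _ in range(500)]  # mean shifted by one unit

print("well-matched synthetic data:", ks_statistic(real, close))
print("poorly matched synthetic data:", ks_statistic(real, shifted))
```

A generator whose samples follow the real distribution closely yields a statistic near zero, while a mismatched generator yields a clearly larger one.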

Q-SYNTH, a hybrid quantum-classical framework, addresses the long-standing challenge of extreme class imbalance in fraud detection by generating realistic minority-class samples. The system achieved a Kolmogorov-Smirnov statistic of 0.069 with a p-value of 0.196, meaning the test found no statistically significant difference between the distributions of real and synthetic fraud data at the conventional 0.05 level. A non-significant result does not prove the two distributions are identical, but together with the low KS statistic it is consistent with synthetic data that closely tracks the real fraud data. Recall, the proportion of actual fraudulent transactions correctly identified, reached 0.9424 after augmentation, compared with 0.939 for the classical GAN and 0.71 for the WGAN-GP. Recall is a crucial metric in fraud detection, as failing to identify fraudulent transactions can result in significant financial losses. The margin over the classical GAN is narrow and the margin over the WGAN-GP is wide, but in both cases augmenting the training data with Q-SYNTH’s samples improved the model’s ability to detect genuine fraud. A combined similarity score, averaging inverted KS statistics and p-values, also favoured Q-SYNTH, registering at 0.89 compared to 0.8881 for SMOTE, a common oversampling technique. SMOTE (Synthetic Minority Over-sampling Technique) creates synthetic samples by interpolating between existing minority-class instances, but it can sometimes generate unrealistic samples or introduce noise. The difference in combined score is small, so it suggests at most that Q-SYNTH’s quantum-enhanced generation process produces synthetic data that is at least as statistically sound as SMOTE’s.
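The interpolation step behind SMOTE can be sketched in a few lines. This is a simplified illustration rather than the reference implementation: the function name `smote_like_sample`, the choice of `k`, and the toy `fraud_rows` are all assumptions made for the example, and it needs at least two minority rows. Production work would more likely use the `SMOTE` class from the imbalanced-learn library.

```python
import random


def smote_like_sample(minority, k=3, rng=None):
    """Create one synthetic row on the segment between a minority row and a near neighbour."""
    rng = rng or random.Random()
    base = rng.choice(minority)
    # Rank the other minority rows by squared Euclidean distance to the chosen row.
    others = [row for row in minority if row is not base]
    others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Hypothetical two-feature fraud records (for example, scaled amount and time of day).
fraud_rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rng = random.Random(42)
synthetic = [smote_like_sample(fraud_rows, k=3, rng=rng) for _ in range(5)]
for row in synthetic:
    print(row)
```

Because every synthetic row lies on a straight line between two existing rows, the method cannot produce fraud patterns outside the region already covered by the minority class, which is one motivation for generative alternatives.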

While Q-SYNTH offers a beneficial compromise between statistical accuracy and practical detection rates, these results are currently limited to tabular data and do not yet demonstrate performance gains on more complex data types or in real-time transaction environments. Tabular data, consisting of structured rows and columns, is a common format for financial transaction records. However, fraud detection increasingly involves analysing more complex data types, such as transaction graphs or textual descriptions. Extending Q-SYNTH to handle these data types will require significant modifications to the quantum circuit and the classical discriminator. Furthermore, the computational cost of running quantum circuits may be prohibitive for real-time fraud detection, where transactions need to be processed with minimal latency. Detecting fraudulent transactions remains a vital task for financial institutions, particularly as criminals develop increasingly sophisticated methods. The financial impact of fraud is substantial, and the cost of false positives (incorrectly flagging legitimate transactions as fraudulent) can also be significant, damaging customer relationships. Standard oversampling methods, like SMOTE, still demonstrate strong performance in replicating the subtle features of genuine fraudulent activity, prompting consideration of whether the benefits of Q-SYNTH truly justify the added complexity of incorporating quantum computing. The trade-off between computational cost, statistical accuracy, and detection performance needs to be carefully evaluated before Q-SYNTH can be deployed in a production environment.
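The tension between missed fraud and false alarms described above is usually summarised with precision, recall, and F1-score for the fraud class. The sketch below shows how the three are computed from confusion counts. The counts are invented for illustration and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for the fraud class from confusion counts.

    tp: frauds correctly flagged, fp: legitimate transactions wrongly flagged,
    fn: frauds that were missed.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical detectors evaluated on the same 100 fraudulent transactions.
cautious = precision_recall_f1(tp=60, fp=5, fn=40)     # few false alarms, many misses
aggressive = precision_recall_f1(tp=95, fp=60, fn=5)   # few misses, many false alarms

print("cautious   (precision, recall, F1):", cautious)
print("aggressive (precision, recall, F1):", aggressive)
```

Raising recall by flagging more transactions tends to lower precision, so F1, the harmonic mean of the two, is a common way to compare augmentation methods on a single number.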

The comparison points to a nuanced field of competing techniques, and further research is needed to fully assess the advantages of the new approach. Quantum computing is being explored to enhance fraud detection, a field hampered by imbalanced datasets in which genuine transactions vastly outnumber fraudulent ones. The potential of quantum machine learning lies in its ability to explore high-dimensional feature spaces and identify complex patterns that are difficult for classical algorithms to discern. Q-SYNTH applies this idea to the imbalance problem: the hybrid classical-quantum framework generates synthetic fraud data for tabular datasets, addressing a key limitation in detecting rare but costly financial crimes. The framework’s architecture also allows for the incorporation of more sophisticated quantum algorithms and the exploration of different quantum circuit designs.

Q-SYNTH combines a quantum circuit, functioning as a ‘forger’ of fraudulent transactions, with a classical neural network acting as a ‘detective’ that tries to tell real data from fake. The quantum circuit leverages the principles of superposition and entanglement to generate a diverse range of synthetic samples, while the classical neural network provides a robust and efficient means of evaluating their realism. By iteratively challenging each other, both components improve, resulting in more realistic synthetic data and enhanced detection capabilities. This adversarial training process is similar to that used in classical GANs, except that the generator is a parameterised quantum circuit rather than a neural network. Q-SYNTH’s ability to balance statistical accuracy with downstream performance justifies further investigation into hybrid quantum-classical approaches, even if immediate gains are modest, as the system’s architecture allows for continuous refinement of both the synthetic data generation and the fraud detection models. Future research should focus on scaling the quantum circuit, exploring different quantum algorithms, and evaluating the performance of Q-SYNTH on more complex data types and in real-time transaction environments.
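As a rough illustration of what a quantum ‘forger’ can look like, the sketch below simulates a toy two-qubit parameterised circuit with plain Python. It is an assumption-laden stand-in, not the circuit used in Q-SYNTH, whose design is not described here: the gate layout, the names `generate`, `apply_ry`, `apply_cnot`, and `expect_z`, and the choice of two output features are all hypothetical. The classical discriminator and the adversarial training loop are omitted.

```python
import math
import random


def apply_ry(state, qubit, angle):
    """Apply an RY rotation to one qubit of a 2-qubit statevector with real amplitudes."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    mask = 2 if qubit == 0 else 1   # basis index is 2*q0 + q1
    new = list(state)
    for i in range(4):
        if not i & mask:            # this qubit is 0 in basis state i
            j = i | mask            # partner basis state with the qubit set to 1
            new[i] = c * state[i] - s * state[j]
            new[j] = s * state[i] + c * state[j]
    return new


def apply_cnot(state):
    """CNOT with qubit 0 as control and qubit 1 as target: swaps |10> and |11>."""
    return [state[0], state[1], state[3], state[2]]


def expect_z(state, qubit):
    """Expectation value of Pauli-Z on one qubit, a number in [-1, 1]."""
    mask = 2 if qubit == 0 else 1
    return sum((-1 if i & mask else 1) * amp ** 2 for i, amp in enumerate(state))


def generate(noise, params):
    """Map latent noise and trainable angles to two synthetic feature values."""
    state = [1.0, 0.0, 0.0, 0.0]                   # start in |00>
    for q in range(2):
        state = apply_ry(state, q, noise[q])       # encode the random latent input
        state = apply_ry(state, q, params[q])      # trainable rotation
    state = apply_cnot(state)                      # entangle the two qubits
    return [expect_z(state, 0), expect_z(state, 1)]


rng = random.Random(7)
params = [0.4, -0.2]  # hypothetical values; training would adjust these
for _ in range(3):
    noise = [rng.uniform(0.0, math.pi), rng.uniform(0.0, math.pi)]
    print(generate(noise, params))
```

In an adversarial setup, a classical discriminator would score these outputs against real fraud rows, and the angles in `params` would be updated to make the outputs harder to tell apart from the real data.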

In summary, Q-SYNTH generated synthetic fraud data using a hybrid classical-quantum framework designed for tabular datasets. This addresses the rarity of fraudulent transactions, which often biases conventional fraud detection systems. The research demonstrates a balance between the statistical similarity of generated data and its usefulness in improving fraud detection performance, offering a feasible approach to quantum data augmentation. Future work will aim to scale the quantum circuit and test it on more complex data, potentially refining both the data generation and the detection models.

👉 More information
🗞 Q-SYNTH: Hybrid Quantum-Classical Adversarial Augmentation for Imbalanced Fraud Detection
🧠 ArXiv: https://arxiv.org/abs/2605.21164

Muhammad Rohail T.
