Scientists are increasingly focused on the challenge of generating complex data distributions, a crucial task for advancing our understanding of numerous systems. Quoc Hoan Tran, Koki Chinzei, and Yasuhiro Endo, all from Fujitsu Research, together with Hirotaka Oshima and collaborators, now demonstrate the universality of the Many-body Projected Ensemble (MPE) framework for quantum machine learning. Their research proves that MPE can approximate any distribution of pure states with a guaranteed level of accuracy, filling a significant theoretical gap in the field and offering a rigorous foundation for expressive quantum models. Furthermore, the team proposes an improved, incrementally trainable variant of MPE, validated through numerical experiments on complex datasets, suggesting a pathway towards practical applications of this powerful technique.
The work investigates the universality of approximation in quantum machine learning (QML), addressing the question of whether a parameterised QML model can approximate any quantum distribution. Numerical experiments support these findings.
Quantum data generation via generative models
Scientists are increasingly focused on quantum machine learning (QML) for processing quantum data derived from quantum systems. A fundamental task in QML is generating quantum data by learning the underlying distribution, essential for understanding quantum systems, synthesizing new samples, and advancing applications in quantum chemistry and materials science. However, extending classical generative approaches to quantum data presents significant challenges, as quantum distributions exhibit superposition, entanglement, and non-locality that classical models struggle to replicate efficiently. Quantum generative models such as quantum generative adversarial networks and quantum variational autoencoders can be used to prepare a fixed single quantum state, but are inefficient for generating ensembles of quantum states due to the need for training deep parameterized quantum circuits (PQCs).
The quantum denoising diffusion probabilistic model offers a promising approach that employs intermediate training steps to smoothly interpolate between the target distribution and noise, thereby enabling efficient training. However, the diffusion process requires high-fidelity scrambling random unitary circuits, which are challenging to implement because they demand precise spatio-temporal control. Learning quantum data distributions faces significant hurdles in the noisy intermediate-scale quantum (NISQ) era, including noise-induced errors, limited qubit connectivity, and optimization difficulties such as barren plateaus, where gradients vanish exponentially with system size. Moreover, achieving universality, which entails the model's ability to approximate any quantum distribution with arbitrary precision, remains a significant theoretical and practical challenge.
These limitations underscore the need for innovative frameworks that combine theoretical guarantees of universality with scalable, noise-resilient training strategies. While practical QML methods face issues such as classical simulability, barren plateaus, and high resource requirements in training (Appendix A.3), the universality theorem is viewed as complementary, ensuring that models can, in principle, capture any distribution before optimizing for hardware. Numerical experiments on clustered quantum states and computational chemistry datasets validate the efficacy of the framework. Generative models are powerful tools for generating samples from a target distribution and estimating the likelihood of given data points.
Researchers address the problem of learning an unknown quantum data distribution Qt over n-qubit pure states, given a training dataset S = {|ψ0⟩, …, |ψN−1⟩} of N independent states sampled from Qt. The generative model is defined by a parameterized probability distribution Qθ implemented via PQCs, where θ represents the trainable parameters (e.g., gate angles). The training objective is to optimize θ such that Qθ closely approximates Qt, as measured by a distance metric D(Qθ, Qt). Since directly computing D(Qθ, Qt) is often infeasible, a dataset S̃ = {|ψ̃j⟩}j is sampled from Qθ and the empirical distance D(S, S̃) is minimized.
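The sample-based training loop described above can be sketched in a toy setting. Everything below is an illustrative assumption, not the paper's implementation: a single-parameter family of single-qubit states stands in for the PQC, the empirical distance is taken between ensemble-averaged density matrices, and grid search replaces gradient-based optimization of θ.

```python
import numpy as np

rng = np.random.default_rng(1)

def state(a):
    """Single-qubit pure state cos(a)|0> + sin(a)|1> (toy model family)."""
    return np.array([np.cos(a), np.sin(a)])

def avg_dm(samples):
    """Ensemble-averaged density matrix of a set of pure-state samples."""
    return sum(np.outer(s, s.conj()) for s in samples) / len(samples)

def trace_distance(rho, sigma):
    """d(rho, sigma) = (1/2) * sum of |eigenvalues| of the Hermitian rho - sigma."""
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

# Toy target Qt: angles concentrated around a* = 0.8 (assumed for illustration).
S_target = [state(a) for a in rng.normal(0.8, 0.05, size=200)]

# Model Q_theta: same family with a trainable mean angle theta;
# grid search stands in for gradient-based training of the PQC parameters.
thetas = np.linspace(0.0, np.pi / 2, 91)
losses = []
for th in thetas:
    S_model = [state(a) for a in rng.normal(th, 0.05, size=200)]
    losses.append(trace_distance(avg_dm(S_model), avg_dm(S_target)))
theta_opt = thetas[int(np.argmin(losses))]
print(round(theta_opt, 2))  # close to the target mean angle 0.8
```

The minimizer of the empirical distance recovers the parameter of the target family, which is the behaviour the training objective D(S, S̃) is designed to produce.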
In the inference phase, the optimized parameters θopt are fixed, and new quantum states |ψ⟩ ∼ Qθopt are generated for use in quantum simulation and data analysis. For a density operator ρ acting on a Hilbert space (see Appendix A.1), the trace norm is defined as ∥ρ∥1 = Tr√(ρ†ρ), where Tr denotes the trace operation and ρ† is the Hermitian conjugate of ρ. The trace distance between two density operators ρ and σ is d(ρ, σ) = (1/2)∥ρ − σ∥1. This metric captures the distinguishability of two quantum states and serves as a fundamental measure in quantum information theory. To compare ensembles of quantum states, the 1-Wasserstein distance is employed, which extends the trace distance to probability distributions over density operators.
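For a Hermitian difference ρ − σ, the trace norm is simply the sum of the absolute eigenvalues, so the trace distance admits a one-line NumPy sketch (the example states are ours, chosen for illustration):

```python
import numpy as np

def trace_distance(rho, sigma):
    """d(rho, sigma) = (1/2) * ||rho - sigma||_1. For the Hermitian operator
    rho - sigma, the singular values equal the absolute eigenvalues."""
    eigvals = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * np.sum(np.abs(eigvals))

# Orthogonal pure states |0><0| and |1><1| are perfectly distinguishable:
rho = np.array([[1, 0], [0, 0]], dtype=complex)
sigma = np.array([[0, 0], [0, 1]], dtype=complex)
print(trace_distance(rho, sigma))  # → 1.0
```

For pure states the trace distance reduces to √(1 − |⟨ψ|φ⟩|²), so, e.g., |0⟩ versus |+⟩ gives 1/√2.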
Definition III.1 (1-Wasserstein Distance). Let P and Q be two probability measures (or ensembles) over the space of density operators. The 1-Wasserstein distance between P and Q is defined as the minimal expected trace distance between pairs of states sampled from a coupling of P and Q: W1(P, Q) = inf_{π ∈ Π(P,Q)} E_{(ρ,σ)∼π} [(1/2)∥ρ − σ∥1], (1) where Π(P, Q) denotes the set of couplings (joint probability measures) with marginals P and Q. An ensemble of states can be generated from a single wave function by performing local measurements on part of the total system. Researchers consider a many-body system partitioned into a subsystem M (with nm qubits) and its complement A (with na qubits).
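For finite ensembles, the infimum in Eq. (1) is attained by a linear program over couplings π with the prescribed marginals. The sketch below solves it with `scipy.optimize.linprog`; the solver choice and the single-qubit test ensembles are assumptions of this illustration, not the paper's tooling.

```python
import numpy as np
from scipy.optimize import linprog

def trace_distance(rho, sigma):
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

def w1(p, P_states, q, Q_states):
    """W1(P, Q): minimize sum_ij pi_ij * d(rho_i, sigma_j) over couplings pi
    whose row marginals are p and column marginals are q."""
    m, n = len(p), len(q)
    cost = np.array([[trace_distance(r, s) for s in Q_states]
                     for r in P_states]).ravel()
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0   # row sums equal p_i
    for j in range(n):
        A_eq[m + j, j::n] = 1.0            # column sums equal q_j
    b_eq = np.concatenate([p, q])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

# Single-qubit check: identical ensembles are at distance 0;
# the orthogonal pure states |0><0| and |1><1| are at distance 1.
z0 = np.array([[1, 0], [0, 0]], dtype=complex)
z1 = np.array([[0, 0], [0, 1]], dtype=complex)
print(w1([1.0], [z0], [1.0], [z0]))  # → 0.0
print(w1([1.0], [z0], [1.0], [z1]))  # → 1.0
```

Mixing the marginals behaves as expected: transporting the ensemble {(1/2, |0⟩), (1/2, |1⟩)} onto {(1, |0⟩)} costs 1/2, since half the mass must move across trace distance 1.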
For unification, A is considered as the ancillary system [Fig. 1(c)]. Given a generator state |Φ⟩, a pure many-body wave function on the total system A + M, local measurements are performed on A, typically in the computational basis. This yields different pure states |φ(zA)⟩M on M, each corresponding to a distinct measurement outcome zA on A, where each zA is a bitstring of length na. In the universality argument, the space of pure states is covered by a δ-net with centres |ψj⟩, whose cells Cj form a Voronoi partition assigning each state to the nearest centre. The probability qj = Qt(Cj) is defined as the mass that Qt assigns to cell Cj. The MPE framework (Lemma IV.3 or Lemma IV.4) constructs a projected ensemble P = {(pj, |ψj⟩)}, identified with Qθ, such that there exists a parameter θ∗ for which W1(Qθ∗, Q) ≤ ε/2.
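The projected-ensemble construction (measure the ancilla A, collect the conditional pure states on M with their outcome probabilities) can be sketched directly with state-vector reshaping; the random generator state below is our illustrative stand-in for |Φ⟩.

```python
import numpy as np

rng = np.random.default_rng(0)

def projected_ensemble(Phi, n_a, n_m):
    """Measure the first n_a qubits (ancilla A) of a pure state |Phi> on
    A + M in the computational basis. Each outcome z_A leaves the remaining
    n_m qubits in a pure state |phi(z_A)> occurring with probability p(z_A)."""
    psi = Phi.reshape(2**n_a, 2**n_m)      # rows indexed by the outcome z_A
    ensemble = []
    for z in range(2**n_a):
        amp = psi[z]                       # unnormalized conditional state on M
        p = np.vdot(amp, amp).real         # Born-rule outcome probability
        if p > 1e-12:
            ensemble.append((p, amp / np.sqrt(p)))
    return ensemble

# Illustrative generator state on n_a + n_m = 2 + 1 qubits:
Phi = rng.normal(size=8) + 1j * rng.normal(size=8)
Phi /= np.linalg.norm(Phi)
ens = projected_ensemble(Phi, n_a=2, n_m=1)
print(sum(p for p, _ in ens))  # probabilities sum to 1
```

A single generator state thus encodes a whole ensemble {(p(zA), |φ(zA)⟩)} on M, which is exactly the object the MPE framework trains to match the target distribution.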
By the triangle inequality, W1(Qt, Qθ∗) ≤ W1(Qt, Q) + W1(Q, Qθ∗) ≤ ε/2 + ε/2 = ε. The scaling of N and the number of ancilla qubits are detailed in the lemmas below. Lemma IV.2 (Finite Ensemble Approximation). For any target quantum data distribution Qt over pure n-qubit states and any ε > 0, there exists a finite ensemble Q = {(qj, |ψj⟩)} such that the 1-Wasserstein distance satisfies W1(Qt, Q) ≤ ε. The required ensemble size obeys N = N(Qt, d, δ) ≤ 5 · D ln(D) · (1/δ)^(2(D−1)), (2) where D is the dimension of the relevant subspace of the full Hilbert space C^(2^n).
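The covering-number bound in Eq. (2) grows very rapidly with the subspace dimension D, which is the exponential sample-complexity scaling the authors later acknowledge; a direct numerical evaluation (our illustration, at an assumed resolution δ = 0.1) makes this concrete.

```python
import math

def net_size_bound(D, delta):
    """Upper bound N <= 5 * D * ln(D) * (1/delta)^(2(D-1)) from Eq. (2)."""
    return 5 * D * math.log(D) * (1.0 / delta) ** (2 * (D - 1))

# At fixed resolution delta = 0.1, the bound explodes with dimension D:
for D in (2, 3, 4):
    print(D, f"{net_size_bound(D, 0.1):.3g}")
```

Each unit increase in D multiplies the bound by roughly (1/δ)² = 100 here, illustrating why reducing this scaling (e.g., via smoothness or symmetry assumptions) is flagged as future work.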
MPE guarantees universal expressivity for state design
This result delivers a foundational understanding of the capabilities of parameterized quantum models in approximating arbitrary distributions. Researchers then defined a Voronoi partition, creating measurable cells that assign each state to the nearest centre in the δ-net, and defined qj as the probability assigned to each cell by the target distribution Qt. This partitioning facilitated the construction of a discrete approximation of Qt, represented by the ensemble Q = {(qj, |ψj⟩)}. The analysis shows that the 1-Wasserstein distance between Qt and Q is bounded by W1(Qt, Q) ≤ ε/2, completing the proof of MPE's universality.
The team then applied MPE to scenarios where the target distribution Qt is unknown but samples are available for training, utilising an ε/2-covering technique to construct a δ-net. Specifically, Lemma IV.3 establishes that δ(p, q) ≤ ε/2, with nm = na + ⌈log2(1/ε)⌉ qubits in the system M, where na is the number of qubits in the ancilla system A. Further refinement, detailed in Lemma IV.4, achieves exact probability matching (p = q) with the same number of parameters, albeit utilising complex values. The resulting projected ensemble, P = {(pj, |ψj⟩)}, is designed to approximate Q = {(qj, |ψj⟩)}, with the Wasserstein distance between P and Q bounded by W1(P, Q) ≤ ε/2, as demonstrated by Lemma IV.5. This work provides a rigorous theoretical foundation and a practical implementation for generating data from underlying distributions, with potential applications in diverse areas requiring accurate data modelling and simulation.
MPE universality and practical learning demonstrate significant progress
Numerical experiments conducted on both clustered states and the QM9 molecular dataset confirm MPE’s effectiveness in learning complex quantum data distributions. These findings lay a strong foundation for advancements in quantum generative modelling, potentially impacting fields such as quantum chemistry and materials science. The authors acknowledge that the sample complexity of the method can scale exponentially with the intrinsic dimension of the data, and future research may focus on reducing this scaling through assumptions about the target distribution’s smoothness or symmetries. Further work could also address the impact of imperfect gate operations on ancilla-assisted measurements and explore extending the framework to encompass mixed state distributions.
👉 More information
🗞 Universality of Many-body Projected Ensemble for Learning Quantum Data Distribution
🧠 ArXiv: https://arxiv.org/abs/2601.18637
