Researchers are increasingly focused on mitigating the barren plateau problem that hinders optimisation in Variational Quantum Circuits (VQCs). Gerhard Stenzel, Tobias Rohe, Michael Kölle, Leo Sünkel, Jonas Stein, and Claudia Linnhoff-Popien, all of LMU Munich, demonstrate a critical and often overlooked consequence of parameter sharing within VQCs. Their work reveals that while parameter sharing can reduce the complexity of quantum circuits, it simultaneously creates deceptive gradients, regions where optimisers receive misleading information, fundamentally altering the optimisation landscape. This research is significant because it establishes a quantitative framework for measuring optimisation difficulty and highlights a mismatch between classical optimisation strategies and the parameter landscapes generated by parameter sharing, offering vital considerations for the design of practical quantum circuits.
Can sharing components within quantum circuits actually hinder their ability to learn effectively? It appears so: sharing parameters creates misleading signals that confuse optimisation algorithms. These deceptive gradients become more pronounced as sharing increases, making it harder to find the best solution despite there being fewer settings to adjust.
Researchers investigated how parameter sharing affects optimisation landscapes and the convergence of gradient-based optimisers. The study focused on demonstrating that increasing degrees of parameter sharing generate more complex solution landscapes with heightened gradient magnitudes and measurably higher deceptiveness ratios. Specifically, the work examined the impact of parameter sharing on the presence of deceptive gradients, regions where gradient information exists but systematically misleads optimisers away from global optima.
Systematic experimental analysis reveals that traditional gradient-based optimisers (Adam, SGD) show progressively degraded convergence as parameter sharing increases, with performance heavily dependent on hyperparameter settings. The research contributes to understanding the interaction between model architecture, optimisation landscapes, and the efficacy of gradient descent methods, particularly when parameter sharing is employed to reduce model complexity and the number of parameters.
Gradient deceptiveness increases with parameter sharing, hindering optimisation performance
Circuits with a parameter sharing degree of 0.8 exhibited gradient deceptiveness ratios reaching 0.67, indicating substantial misleading gradient information. Increasing parameter sharing to 0.95 pushed deceptiveness ratios to 0.82, demonstrating a clear correlation between sharing and optimisation difficulty. By quantifying this deceptiveness, the work establishes a framework for understanding how parameter sharing alters the optimisation landscape.
At a parameter sharing level of 0.95, Adam optimisation achieved a success rate of only 32% when targeting a cost function value below 0.1, compared to 85% for circuits without parameter sharing. Even with careful hyperparameter tuning, optimisers struggled as parameter sharing increased. For instance, the best-performing Adam optimiser, configured with a learning rate of 0.01, still achieved only a 45% success rate at a parameter sharing degree of 0.95.
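The success-rate protocol can be reproduced in outline. Below is a minimal sketch, assuming a toy hardware-efficient circuit, an illustrative cost function, and arbitrary trial counts rather than the authors' exact setup; only the general pattern (many random restarts, counting runs that reach the 0.1 threshold) is taken from the text.

```python
# Sketch of a success-rate experiment: repeat Adam optimisation from
# random initialisations and count runs whose final cost drops below
# a threshold. Circuit, cost, and counts are illustrative assumptions.
import pennylane as qml
from pennylane import numpy as np

n_qubits, n_layers = 4, 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def circuit(params):
    for layer in range(n_layers):
        for w in range(n_qubits):
            qml.RY(params[layer, w], wires=w)
        for w in range(n_qubits - 1):
            qml.CNOT(wires=[w, w + 1])
    return qml.expval(qml.PauliZ(0))

def cost(params):
    # Toy cost with minimum 0, reached when the expectation value is -1.
    return (circuit(params) + 1.0) ** 2

def success_rate(n_trials=20, n_steps=150, threshold=0.1, lr=0.01):
    successes = 0
    for _ in range(n_trials):
        params = np.array(np.random.uniform(0, 2 * np.pi, (n_layers, n_qubits)),
                          requires_grad=True)
        opt = qml.AdamOptimizer(stepsize=lr)
        for _ in range(n_steps):
            params = opt.step(cost, params)
        if cost(params) < threshold:
            successes += 1
    return successes / n_trials

print(f"success rate: {success_rate():.2f}")
```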
Unlike circuits without parameter sharing, where a broad range of learning rates yielded acceptable results, highly shared circuits demanded precise learning rate selection. The research details that circuit expressivity improved by approximately two orders of magnitude with increased parameter sharing, yet this gain was accompanied by the observed increase in optimisation difficulty.
The newly developed gradient deceptiveness detection algorithm proved effective at identifying misleading regions within the quantum optimisation landscape, with its reported deceptiveness ratio remaining stable regardless of sampling resolution. The algorithm consistently flagged areas where gradients pointed in directions opposite to the global optimum, confirming the presence of deceptive gradients.
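The paper's detection algorithm is not reproduced here, but its core idea, sampling the landscape and counting points whose descent direction moves away from a known global optimum, can be sketched as follows. The toy two-qubit cost, the assumed optimum theta_star, and the sample count are all illustrative assumptions.

```python
# Estimate a deceptiveness ratio: the fraction of sampled points where
# following -gradient moves away from a known global optimum.
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def cost(theta):
    qml.RY(theta[0], wires=0)
    qml.RY(theta[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

def deceptiveness_ratio(cost_fn, theta_star, dim, n_samples=500):
    grad_fn = qml.grad(cost_fn)
    deceptive = 0
    for _ in range(n_samples):
        theta = np.array(np.random.uniform(0, 2 * np.pi, dim), requires_grad=True)
        g = grad_fn(theta)
        # Flag the point as deceptive when the descent direction (-g) has
        # negative overlap with the straight-line direction to the optimum
        # (parameter periodicity is ignored for simplicity).
        if np.dot(-g, theta_star - theta) < 0:
            deceptive += 1
    return deceptive / n_samples

theta_star = np.array([np.pi, np.pi])  # one global optimum of the toy cost
print(deceptiveness_ratio(cost, theta_star, dim=2))
```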
Analyses revealed that gradient magnitudes increased with parameter sharing, contributing to the heightened deceptiveness. The work provides a quantitative measure of optimisation difficulty, allowing for a direct comparison of different circuit designs. For circuits employing distance-5 codes and a parameter sharing degree of 0.95, the average gradient magnitude was 1.7 times higher than in circuits without parameter sharing.
As the optimisation landscape became more complex, traditional optimisers such as Adam and SGD experienced progressively degraded convergence. The research highlights that achieving successful convergence required markedly more precise hyperparameter tuning in highly deceptive landscapes. These insights provide important considerations for quantum circuit design in practical applications, highlighting the fundamental mismatch between classical optimisation strategies and quantum parameter landscapes shaped by parameter sharing.
The emergence of quantum computing has catalysed the development of Quantum Machine Learning (QML), an interdisciplinary field that leverages quantum phenomena to potentially overcome classical computational limitations. However, VQC optimisation faces significant challenges, particularly the barren plateau phenomenon, where gradients vanish exponentially with increasing system size. Parameter sharing in VQCs has emerged as a promising strategy to address these challenges.
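The barren plateau effect itself is straightforward to observe numerically: the variance of any single gradient component, taken over random initialisations, collapses as qubits are added. The sketch below uses an arbitrary hardware-efficient ansatz and sample counts chosen purely for illustration.

```python
# Estimate the variance of one gradient component over random
# initialisations; it shrinks rapidly as the qubit count grows.
import pennylane as qml
from pennylane import numpy as np

def gradient_variance(n_qubits, n_layers=5, n_samples=100):
    dev = qml.device("default.qubit", wires=n_qubits)

    @qml.qnode(dev)
    def cost(params):
        for layer in range(n_layers):
            for w in range(n_qubits):
                qml.RY(params[layer, w], wires=w)
            for w in range(n_qubits - 1):
                qml.CNOT(wires=[w, w + 1])
        return qml.expval(qml.PauliZ(0))

    grad_fn = qml.grad(cost)
    grads = []
    for _ in range(n_samples):
        params = np.array(np.random.uniform(0, 2 * np.pi, (n_layers, n_qubits)),
                          requires_grad=True)
        grads.append(float(grad_fn(params)[0, 0]))  # track one fixed component
    return np.var(np.array(grads))

for n in (2, 4, 6, 8):
    print(n, gradient_variance(n))
```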
By reducing the parameter space dimensionality and enforcing useful symmetries, parameter sharing offers potential advantages in optimisation efficiency and generalisation capabilities. The research confirms that parameter sharing can indeed lead to superior global optima while using fewer parameters, a seemingly ideal solution for quantum circuit design. However, this approach introduces a critical trade-off that has been largely overlooked: parameter sharing creates complex dependencies in gradient information that can fundamentally alter the optimisation landscape.
The work identifies a complementary challenge to barren plateaus: deceptive gradients, regions where gradient information exists but systematically misleads optimisers away from global optima. This deceptiveness emerges from the interaction of parameter interdependencies and quantum-specific effects with no classical analogue. This research investigates the fundamental question: while parameter sharing in VQCs can create better global optima with fewer parameters, what is the cost in terms of landscape deceptiveness and practical trainability?
The methodology includes the concept of resolution, the definition of deceptiveness, and the experimental design for comparing optimiser performance across different parameter sharing configurations. The experimental findings include a thorough analysis of optimiser trajectories under different parameter sharing conditions. Self et al. demonstrated that parameter sharing can be leveraged to solve related variational problems in parallel through their Bayesian Optimisation with Information Sharing (BOIS) approach.
By sharing quantum measurement results between different optimisers, they achieved a 100-fold improvement in efficiency compared to naive implementations. This technique is particularly valuable for computing properties across different physical parameters, such as energy surfaces for molecules at varying nuclear separations. Despite its benefits, parameter sharing introduces challenges in the optimisation landscape.
Wang et al. showed that appropriate parameter initialisation strategies are important when using shared parameters. By reducing the initial domain of each parameter in inverse proportion to the square root of the circuit depth, they proved that the magnitude of the cost gradient decays at most polynomially with the number of qubits.
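A minimal sketch of this depth-aware initialisation, assuming a rectangular parameter array and a base range of pi; the helper name and shapes are illustrative, not taken from Wang et al.:

```python
# Shrink the initial parameter range in inverse proportion to the
# square root of the circuit depth, as described above.
from pennylane import numpy as np

def init_params(n_layers, n_qubits, base_range=np.pi):
    width = base_range / np.sqrt(n_layers)  # narrower domain for deeper circuits
    return np.array(
        np.random.uniform(-width, width, (n_layers, n_qubits)),
        requires_grad=True,
    )
```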
Data encoding represents another fundamental challenge in QML, as classical data must be mapped into quantum states through various embedding strategies. These include basis encoding, amplitude encoding, and angle encoding, each with different resource requirements and expressivity characteristics. The choice of encoding strategy markedly impacts the potential quantum advantage and the trainability of the resulting model.
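The three embedding strategies can be illustrated with PennyLane's built-in templates; the feature vectors below are placeholders chosen only to satisfy each template's shape requirements (one angle per qubit, 2^n amplitudes, one bit per qubit).

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=3)

@qml.qnode(dev)
def angle_encoded(x):
    qml.AngleEmbedding(x, wires=range(3))  # one feature per qubit, as a rotation angle
    return qml.expval(qml.PauliZ(0))

@qml.qnode(dev)
def amplitude_encoded(x):
    qml.AmplitudeEmbedding(x, wires=range(3), normalize=True)  # 2^3 = 8 features
    return qml.expval(qml.PauliZ(0))

@qml.qnode(dev)
def basis_encoded(bits):
    qml.BasisEmbedding(bits, wires=range(3))  # one classical bit per qubit
    return qml.expval(qml.PauliZ(0))

print(angle_encoded(np.array([0.1, 0.2, 0.3])))
print(amplitude_encoded(np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])))
print(basis_encoded(np.array([1, 0, 1])))
```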
Despite theoretical promise, QML faces several practical challenges, including hardware limitations, noise susceptibility, and the difficulty of loading classical data efficiently into quantum states. At the centre of these efforts sit VQCs: circuits consisting of parameterised quantum gates whose parameters are tuned through classical optimisation to minimise a cost function. A typical VQC workflow involves preparing an initial state, applying a parameterised unitary transformation, and measuring an observable to compute the cost function, as sketched below.
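A minimal end-to-end sketch of this workflow, assuming a two-qubit toy circuit and plain gradient descent; none of the specific gates or hyperparameters come from the paper.

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def cost(params):
    # 1. State preparation: the device starts in |00>; data could be encoded here.
    # 2. Parameterised unitary transformation (the ansatz).
    qml.RY(params[0], wires=0)
    qml.RY(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    # 3. Measure an observable; its expectation value serves as the cost.
    return qml.expval(qml.PauliZ(0))

params = np.array([0.3, 0.8], requires_grad=True)
opt = qml.GradientDescentOptimizer(stepsize=0.1)
for _ in range(100):
    params = opt.step(cost, params)  # classical update loop
print(cost(params))  # approaches -1, the minimum of <Z> on qubit 0
```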
The design of VQCs presents a fundamental trade-off between expressivity and trainability. Highly expressive circuits can represent complex functions but often suffer from optimisation difficulties. Parameter sharing has emerged as a promising technique to navigate this trade-off by reducing the number of independent parameters while maintaining sufficient expressivity.
Parameter sharing in VQCs involves constraining certain parameters to have identical values, analogous to weight sharing in classical convolutional neural networks. This approach offers several advantages: it reduces the dimensionality of the optimisation landscape, decreases the number of parameters that need to be optimised, and can improve generalisation by enforcing symmetries in the circuit.
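A minimal sketch of this idea: every rotation in a layer reuses one angle, so a twelve-gate circuit carries only three independent parameters. The "sharing degree" computed at the end, the fraction of gate slots without an independent parameter, is a hypothetical measure used for illustration, not necessarily the paper's definition.

```python
# Parameter sharing: one angle per layer is reused across all qubits,
# analogous to weight sharing in a convolutional neural network.
import pennylane as qml
from pennylane import numpy as np

n_qubits, n_layers = 4, 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def shared_circuit(params):          # params holds one entry per layer,
    for layer in range(n_layers):    # not one per gate
        for w in range(n_qubits):
            qml.RY(params[layer], wires=w)  # same angle reused across qubits
        for w in range(n_qubits - 1):
            qml.CNOT(wires=[w, w + 1])
    return qml.expval(qml.PauliZ(0))

n_gate_slots = n_layers * n_qubits  # 12 rotation gates in total
n_independent = n_layers            # only 3 free parameters
sharing_degree = 1 - n_independent / n_gate_slots
print(f"sharing degree: {sharing_degree:.2f}")  # 0.75

params = np.array([0.1, 0.2, 0.3], requires_grad=True)
print(shared_circuit(params))
```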
👉 More information
🗞 Illustration of Barren Plateaus in Quantum Computing
🧠 ArXiv: https://arxiv.org/abs/2602.16558
