A new pre-computation strategy determines optimal penalization weights within Quadratic Unconstrained Binary Optimisation (QUBO) formulations, offering guaranteed performance for Gibbs solvers and polynomial complexity across a range of problems. Edoardo Alessandroni and colleagues at the National University of Singapore have developed this method, validated through experiments on diverse instances and Fujitsu’s Digital Annealer. The strategy achieves order-of-magnitude speedups over current heuristic approaches, a key advance in solving complex computational problems. It tackles a persistent challenge in combinatorial optimisation: balancing competing objectives and constraints when using approximate solvers.
Provable weight pre-computation accelerates Gibbs sampling for combinatorial optimisation
A pre-computation strategy for determining penalization weights with provable guarantees for Gibbs solvers and polynomial complexity for broad problem classes has been developed. Experiments across diverse problems and solver architectures, including large-scale instances on Fujitsu’s Digital Annealer, show strong performance and order-of-magnitude speedups over existing heuristics. Combinatorial optimisation problems arise widely in both theoretical research and real-world applications, aiming to identify the optimal configuration of discrete decision variables that minimizes a given objective function.
Methods for solving these problems have been a subject of intense study in both classical and quantum computation, including approaches based on Gibbs sampling or simulated annealing, special-purpose chips for digital annealing, and quantum primitives such as adiabatic evolution and different variants of the Quantum Approximate Optimisation Algorithm (QAOA). Practical solvers tend to be heuristic algorithms that yield approximate solutions. For example, simulated annealing (or any solver based on Gibbs sampling) outputs a configuration sampled from a low-temperature thermal distribution over the combinatorial space, with an objective value close to the optimum. The colder the distribution, the higher the probability that the output state is very close to the optimal solution.
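This thermal-sampling picture can be sketched as a single-bit-flip Metropolis chain over bitstrings, whose stationary distribution is the Gibbs distribution over QUBO energies. The matrix `Q`, the inverse temperature `beta`, and the step count below are illustrative choices, not the paper's solver:

```python
import numpy as np

def qubo_energy(Q, x):
    """Energy E(x) = x^T Q x of a binary configuration x."""
    return x @ Q @ x

def metropolis_gibbs_sample(Q, beta, n_steps=10_000, rng=None):
    """Approximate a sample from p(x) proportional to exp(-beta * E(x))
    using single-bit-flip Metropolis updates (a simple MCMC sketch)."""
    rng = np.random.default_rng(rng)
    n = Q.shape[0]
    x = rng.integers(0, 2, size=n)     # random initial bitstring
    e = qubo_energy(Q, x)
    for _ in range(n_steps):
        i = rng.integers(n)
        x[i] ^= 1                      # propose flipping one bit
        e_new = qubo_energy(Q, x)
        if rng.random() < np.exp(-beta * (e_new - e)):
            e = e_new                  # accept the flip
        else:
            x[i] ^= 1                  # reject: undo the flip
    return x, e
```

Lowering the temperature (raising `beta`) concentrates the chain's output on low-energy configurations, matching the "colder distribution" intuition above.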
More precisely, each configuration x is sampled with a probability proportional to exp(−βE(x)), where E(x) is the energy of the configuration and β is the inverse temperature of the system. Estimating the associated partition function or sampling from this probability distribution is generally hard, so practical solvers rely on Markov Chain Monte Carlo techniques and annealing schedules to approximate it. General combinatorial optimisation problems, arising in applications, usually include constraints that restrict the search space. A standard approach to tackle such problems with QUBO solvers is to convert constraints into penalization terms weighted by a constant, commonly referred to as Big-M. If the constant M is set too high, the low-energy spectrum becomes uninformative, forcing the solver to prioritize constraint satisfaction at all costs, returning feasible states that may be far from optimal with respect to the original objective function.
Conversely, a small value of M may cause approximate solvers to sample disproportionately from infeasible solutions that violate the constraints, since the energies of infeasible configurations may lie near, or even below, the optimal feasible value. Existing computationally efficient Big-M prescriptions tend to substantially overestimate the required penalty, which in practice degrades solution quality. This work introduces a new, broadly applicable algorithm that determines, a priori, the penalization term for a given constrained optimisation problem and a specified approximate solver.
The approach combines analytical considerations with uniform sampling over feasible configurations to derive efficiently evaluable bounds on the solver’s output distribution, from which the penalization weight M is calculated. The algorithm yields a QUBO reformulation with a controllable, guaranteed minimum probability of sampling feasible solutions with energy at most E_f for exact Gibbs solvers at arbitrary β. It is also shown that, for large classes of constrained optimisation problems and appropriate hyperparameters, the algorithm’s runtime and memory scale polynomially in the system size. Numerical demonstrations show the practical applicability of the method across relevant parameter regimes, diverse problem instances and solver architectures.
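As a loose caricature of such a pre-computation (a simplified stand-in, not the paper's actual bound), one can uniformly sample feasible configurations, measure the spread of their objective energies, and pick M just above that spread, so that any integer constraint violation outweighs the objective. The rejection-sampling feasibility step assumes constraints loose enough to hit by chance:

```python
import numpy as np

def estimate_big_m(Q, A, b, n_samples=200, rng=None):
    """Hypothetical sketch: choose M so that the smallest possible
    penalty for a violated integer constraint (at least 1 in (Ax-b)^2)
    exceeds the objective-energy spread over sampled feasible points.
    A simplified stand-in for the paper's bound, not its algorithm."""
    rng = np.random.default_rng(rng)
    n = Q.shape[0]
    feasible_energies = []
    while len(feasible_energies) < n_samples:
        x = rng.integers(0, 2, size=n)
        if np.array_equal(A @ x, b):       # rejection-sample feasibility
            feasible_energies.append(x @ Q @ x)
    spread = max(feasible_energies) - min(feasible_energies)
    return spread + 1.0                    # violation penalty >= 1 beats the spread
```

For tightly constrained problems, uniform rejection sampling would rarely hit feasible points, which is one reason the paper's method combines sampling with analytical bounds rather than relying on sampling alone.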
The approach was benchmarked on representative constrained optimisation problems, including the Travelling Salesman Problem (TSP), the Multiway Number Partitioning Problem (MNPP), and Portfolio Optimisation (PO). The method can be used to determine penalization weights for Fujitsu’s Digital Annealer (version 3) on problem instances of up to several thousand bits. Although the Digital Annealer is known to deviate from the underlying assumption of thermal output distributions, the method still qualitatively captures its behaviour sufficiently well to achieve an order-of-magnitude speedup in time-to-solution compared to direct binary searches for M based on simpler heuristics. The work begins by formalizing the problem of determining penalization weights and empirically demonstrating the importance of the big-M problem for approximate solvers using Fujitsu’s Digital Annealer (version 3). The problems considered are constrained optimisations of the form min_{x ∈ {0,1}^n} E^(o)(x) = xᵀQx subject to Ax = b, with Q ∈ ℝ^(n×n), A ∈ ℤ^(m×n), and b ∈ ℤ^m.
This special type of linearly constrained binary optimisation (LCBO) problem can capture complex formulations such as polynomially constrained problems and integer-variable problems, which can all be cast into this form using certain gadgets. A constrained problem in this form can be converted to a Quadratic Unconstrained Binary Optimisation (QUBO) problem by promoting the constraints Ax = b to quadratic penalty terms E^(p)(x) = ‖Ax − b‖², weighted by a penalization constant M.
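The conversion just described can be written down directly: expand M·‖Ax − b‖² into quadratic, linear, and constant pieces, and fold the linear part onto the QUBO diagonal using xᵢ² = xᵢ for binary variables. This is the standard construction; the function name is illustrative:

```python
import numpy as np

def penalized_qubo(Q, A, b, M):
    """Fold the linear constraints Ax = b into the objective as the
    quadratic penalty M * ||Ax - b||^2, returning one QUBO matrix plus
    a constant offset.  ||Ax - b||^2 = x^T(A^T A)x - 2 b^T A x + b^T b."""
    n = Q.shape[0]
    P = (A.T @ A).astype(float)        # quadratic part of the penalty
    lin = -2.0 * (b @ A)               # linear part of the penalty
    Qp = Q + M * P
    Qp[np.diag_indices(n)] += M * lin  # x_i^2 = x_i: linear terms go on the diagonal
    return Qp, M * float(b @ b)        # constant offset returned separately
```

Any QUBO solver then minimizes x @ Qp @ x; the offset only shifts energies and does not change the minimizer.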
Enforcing constraints within a Quadratic Unconstrained Binary Optimisation (QUBO) formulation by adding penalization terms is a common approach to constrained combinatorial optimisation problems. These terms introduce a hyperparameter that affects the solver’s efficacy, specifically the balance between objective terms and the penalization terms. Selecting an excessively large value for this parameter is expected to diminish the quality of the solver’s results, creating what is known as the big-M problem.
Experiments using Fujitsu’s Digital Annealer demonstrate the need to carefully determine the value of this parameter. For some parameter values, only infeasible bitstrings were sampled, yielding no mean-energy data points at all. As the parameter increases, an observed transition from infeasible to feasible solutions confirms that tuned values are required; beyond that transition, however, the quality of the sampled bitstrings degrades.
The mean energy E^(o) of the sampler’s outputs increases for larger values of M, beyond a seemingly sweet spot located around the transition. For reference, the M values suggested by the trivial choice are several orders of magnitude larger than the scales shown: around 10^9 for MNPP, 10^8 for the benchmark TSP instances, and 10^10 for the circle TSP. Such extreme overshooting implies that the mean energy sampled by the solver would lie far from the desired minimum, undermining the optimisation’s effectiveness. Researchers are refining how approximate solvers tackle complex optimisation problems, particularly those with constraints.
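The "direct binary search for M" baseline that the speedup is measured against can be sketched as a geometric bisection over candidate weights. Here `solve` and `is_feasible` are hypothetical stand-ins for a solver run and a constraint check (a stochastic solver would need repeated runs per candidate M):

```python
import numpy as np

def binary_search_big_m(solve, is_feasible, m_lo=1.0, m_hi=1e10, tol=2.0):
    """Hypothetical baseline: binary search on a multiplicative scale for
    a small penalty weight M whose solver output is feasible, assuming
    feasibility is monotone in M.  `solve(M)` runs the approximate solver
    once; `is_feasible(x)` checks the constraints on its output."""
    while m_hi / m_lo > tol:
        m_mid = float(np.sqrt(m_lo * m_hi))   # geometric midpoint
        if is_feasible(solve(m_mid)):
            m_hi = m_mid                       # feasible: try a smaller M
        else:
            m_lo = m_mid                       # infeasible: need a larger M
    return m_hi
```

Each probe of this search costs a full solver run, which is what makes the paper's one-shot pre-computation an order of magnitude faster in time-to-solution.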
This new methodology efficiently determines penalization weights, the values used to balance objective and penalty terms, within constrained optimisation problems, offering a significant advance over current trial-and-error approaches. By combining analytical calculations with sampling, the technique provides guaranteed performance for Gibbs solvers, a type of probabilistic algorithm, while also scaling effectively to larger, more complex problems. Demonstrations on Fujitsu’s Digital Annealer show order-of-magnitude speedups, suggesting practical benefits for applications ranging from logistics to finance.
In summary, the research developed a method for determining penalization weights in quadratic unconstrained binary optimisation problems. This matters because these weights balance the competing terms within the formulation and significantly affect solver effectiveness. The methodology offers guaranteed performance for Gibbs solvers and demonstrated order-of-magnitude speedups on instances run on Fujitsu’s Digital Annealer.
👉 More information
🗞 Scalable Determination of Penalization Weights for Constrained Optimizations on Approximate Solvers
🧠 arXiv: https://arxiv.org/abs/2604.02416
