Efficient Chips Boost Speed of Complex Calculations

Researchers are tackling the limitations of energy-efficient stochastic optimisation with a novel hardware architecture built on probabilistic bits, or p-bits. Naoya Onizawa, Taiga Kubuta, and Duckgyu Shin, all from the Research Institute of Electrical Communication, Tohoku University, alongside Takahiro Hanyu and colleagues, present a fully connected quantum-inspired simulated annealer designed to overcome the scalability and memory-overhead issues common in existing p-bit systems. The work introduces a dual-BRAM architecture and an update schedule that together enable efficient implementation on field-programmable gate arrays (FPGAs). On an 800-node benchmark problem, the design reduces energy consumption by up to 50% and logic resource usage by over 90%. The findings highlight the potential of p-bit-based hardware for addressing large-scale combinatorial optimisation challenges within constrained energy and resource budgets.

Imagine searching for the lowest point in a vast, hilly field blindfolded. This new system mimics that process using electronic components, but does so with remarkable efficiency. It represents a step towards solving complex problems, from logistics to materials discovery, using far less power than current methods. Scientists are developing new hardware inspired by the principles of quantum mechanics to tackle complex optimisation problems.

While true quantum computers remain a distant prospect, researchers have been exploring ways to mimic certain quantum behaviours using conventional electronics. A recent advance focuses on probabilistic bits, or p-bits, which represent information using probabilities rather than definitive zero or one states. These p-bits offer a potentially energy-efficient way to perform stochastic optimisation, a technique used to find the best solution from a vast number of possibilities.

Previous designs employing p-bits struggled to scale and to support fully connected networks, limitations stemming from increased fan-out and memory demands. The new design scales to fully connected Ising models, a mathematical framework often used to represent optimisation problems, while minimising the growth of the logic resources needed.

Unlike earlier systems, the new architecture uses stochastic simulated quantum annealing (SSQA), which converges rapidly and needs only the final states of the replicas, thereby reducing memory requirements. The design centralizes delays within the FPGA’s block RAM (BRAM) using a dual-BRAM delay-line architecture, keeping the number of logic gates and flip-flops nearly constant regardless of the problem size.

Once implemented on a Xilinx ZC706 FPGA, the system successfully solved an 800-node MAX-CUT benchmark, a standard test case for optimisation algorithms. Compared to previous FPGA-based p-bit architectures, this new system achieved up to 50% reduction in energy consumption and over 90% reduction in logic resources. These results suggest a viable path toward building practical, p-bit-based hardware for large-scale combinatorial optimisation, particularly in scenarios where energy and resource constraints are strict.
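To make the link between MAX-CUT and the Ising model concrete, here is a minimal sketch in Python. It assumes the standard textbook mapping (coupling J_ij = -w_ij, no bias terms) rather than anything specific to the paper, and the 4-node ring graph `W` is a hypothetical toy instance, not the 800-node benchmark.

```python
from itertools import combinations


def cut_value(weights, spins):
    """Total weight of edges whose two endpoints carry opposite spins."""
    n = len(spins)
    return sum(weights[i][j]
               for i, j in combinations(range(n), 2)
               if spins[i] != spins[j])


def ising_energy(weights, spins):
    """H = sum_{i<j} w_ij * s_i * s_j, i.e. an Ising energy with J_ij = -w_ij."""
    n = len(spins)
    return sum(weights[i][j] * spins[i] * spins[j]
               for i, j in combinations(range(n), 2))


# Hypothetical toy graph: a 4-node ring with unit edge weights.
W = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
spins = [+1, -1, +1, -1]          # alternate sides around the ring
total_weight = sum(W[i][j] for i, j in combinations(range(4), 2))

# Lower energy means a larger cut: cut = (total_weight - H) / 2.
print(cut_value(W, spins), ising_energy(W, spins))
```

Because the cut value and the energy differ only by a constant and a sign, an annealer that drives the Ising energy down is at the same time driving the cut value up.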

Applications in logistics, financial modelling, and machine learning could all benefit from such efficient optimisation capabilities. Beyond immediate performance gains, the architecture’s design opens possibilities for deploying these algorithms in embedded systems and other resource-limited environments.

Reduced resource usage and bounded computation in a novel stochastic annealing architecture

The new architecture solved the 800-node MAX-CUT benchmark with up to a 50% reduction in energy consumption and over a 90% reduction in logic resource usage. These improvements stem from a refined stochastic simulated quantum annealing (SSQA) platform that adapts a spin-gate circuit to a spin-serial schedule. The dual-BRAM delay-line architecture eliminates the fan-out and routing bottlenecks previously seen in stochastic simulated annealing (SSA) accelerators, achieving flat scaling: lookup-table (LUT) and flip-flop (FF) usage stays nearly constant as the spin count increases.

The system requires N + 1 clock cycles to compute each spin update, which bounds the computational cost. The research details the performance of p-bit-based simulated annealing (pSA) for Ising models, where each p-bit output state is given by σ_i(t + 1) = sgn(r_i(t) + tanh(I_i(t + 1))), with r_i(t) a random signal. Inside the Ising spin network, the input to each p-bit is calculated from the outputs of the other p-bits as I_i(t + 1) = I_0 (h_i + Σ_j J_ij · σ_j(t)), where h_i is the bias of spin i, J_ij the coupling weight between spins i and j, and I_0 a pseudo inverse temperature.
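The update rule described above can be sketched in a few lines of Python. This is an illustrative software model, not the authors' hardware: it reads the input as I_0 (h_i + Σ_j J_ij σ_j), draws r_i uniformly from [-1, 1), updates spins one at a time, and ramps I_0 linearly. The function names and the parameters `I0_min`, `I0_max` and `steps` are assumptions made for the example.

```python
import math
import random


def pbit_input(i, spins, J, h, I0):
    """I_i = I0 * (h_i + sum_j J_ij * sigma_j)."""
    return I0 * (h[i] + sum(J[i][j] * spins[j] for j in range(len(spins))))


def pbit_update(i, spins, J, h, I0, rng):
    """sigma_i = sgn(r_i + tanh(I_i)), with r_i uniform in [-1, 1)."""
    r = rng.uniform(-1.0, 1.0)
    return 1 if r + math.tanh(pbit_input(i, spins, J, h, I0)) >= 0 else -1


def anneal(J, h, steps, I0_min, I0_max, seed=0):
    """Serial p-bit annealing with a linear I0 ramp (assumed schedule)."""
    rng = random.Random(seed)
    n = len(h)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    for t in range(steps):
        I0 = I0_min + (I0_max - I0_min) * t / max(steps - 1, 1)
        for i in range(n):            # one spin at a time
            spins[i] = pbit_update(i, spins, J, h, I0, rng)
    return spins


# Hypothetical 4-spin ferromagnet: every pair prefers to align.
J = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
h = [0, 0, 0, 0]
print(anneal(J, h, steps=50, I0_min=0.1, I0_max=50.0))
```

At small I_0 the tanh term is close to zero and each spin flips almost at random; as I_0 grows the tanh term saturates and the update becomes effectively deterministic, which is what lets the network settle into a low-energy state.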

Approximating pSA with stochastic computing (SC) yields stochastic simulated annealing (SSA). Unlike conventional binary computation, SC exploits probability and randomness for low-cost, fault-tolerant computing. SSQA uses R replicas, each containing N p-bits based on the Ising model, with neighbouring replicas connected by interaction coefficients Q, mimicking quantum annealing on a classical computer.

Beyond the benchmark results, p-bits offer tunable randomness and sub-nanosecond switching, enabling energy-efficient realization of algorithms like Boltzmann machines and Gibbs sampling. This architecture uses a replica-parallel/spin-serial scheduling approach, rather than relying on spin-parallel datapaths that do not scale.

Probabilistic bit implementation of stochastic simulated quantum annealing

The work does not rely on quantum hardware. It takes a fundamentally different approach: stochastic computation built around probabilistic bits, or p-bits. Stochastic computing represents numbers as streams of random bits, allowing arithmetic operations to be performed with simple logic gates, unlike conventional binary computation.
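A short Python sketch shows how arithmetic on bit streams can reduce to a single logic gate. It uses the common bipolar encoding, in which a value x in [-1, 1] becomes a stream whose bits are 1 with probability (x + 1) / 2, so that multiplication is a bitwise XNOR. The encoding choice, the stream length and the function names are assumptions for illustration; the article does not say which encoding the authors use.

```python
import random


def encode_bipolar(x, length, rng):
    """Encode x in [-1, 1] as a bit stream with P(bit = 1) = (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(length)]


def decode_bipolar(bits):
    """Recover the encoded value from the fraction of 1s in the stream."""
    return 2.0 * sum(bits) / len(bits) - 1.0


def sc_multiply(a_bits, b_bits):
    """Multiply two bipolar streams with one XNOR gate per bit pair."""
    return [1 - (a ^ b) for a, b in zip(a_bits, b_bits)]


rng = random.Random(1)
a = encode_bipolar(0.5, 20000, rng)
b = encode_bipolar(-0.4, 20000, rng)
product = decode_bipolar(sc_multiply(a, b))
print(product)   # close to 0.5 * -0.4 = -0.2, up to sampling noise
```

The result is only approximate, and its accuracy improves with stream length. That trade of precision for very cheap logic is what makes the approach attractive for annealing, where noise is part of the algorithm anyway.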

Stochastic computing achieves low-cost, fault-tolerant, and energy-efficient computing by exploiting probability and randomness. The study approximates stochastic simulated annealing (SSA) using this stochastic computing technique, with the core methodological advancement lying in the implementation of stochastic simulated quantum annealing (SSQA), an alternative p-bit-based approach that mimics quantum annealing on a classical computer.

At the heart of SSQA is a spin network comprising R replicas, each containing N p-bits based on the Ising model, where adjacent layers are connected via interaction coefficients Q. The spin-update algorithm is designed around stochastic computing, extending the principles of SSA. Specifically, the update for the i-th spin in the k-th replica calculates I_{i,k}(t + 1) using inputs from other p-bits, incorporating a pseudo inverse temperature and random noise.

A dual-BRAM delay-line architecture is central to the system’s scalability, enabling support for fully connected Ising models while avoiding growth in logic resources. By exploiting SSQA, the architecture rapidly converges using only the final replica states, thereby reducing memory requirements compared to conventional p-bit-based methods. For the i-th spin in the k-th replica, the algorithm computes Is_{i,k}(t + 1) and σ_{i,k}(t + 1) based on the internal signals I_{i,k}(t + 1), approximating the tanh function using stochastic computing.

The interaction constant Q(t) between replicas is carefully controlled to guide the optimisation process: it increases incrementally over the annealing steps, allowing independent exploration at first and then strengthening inter-replica coupling to encourage convergence. This mechanism, inspired by quantum tunnelling behaviour, enables efficient solution search under strict energy and resource constraints, and was implemented on a Xilinx ZC706 FPGA.
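The replica coupling and the rising Q(t) can be illustrated with a rough software model. Everything specific here is an assumption made for the sketch, not a detail taken from the paper: each replica is coupled to a single neighbouring replica in a ring, Q(t) ramps linearly from zero, the noise is a random ±1 scaled by `noise`, and the tanh is replaced by a saturating accumulator clamped to [-i0, i0 - 1]. All function and parameter names are hypothetical.

```python
import random


def q_schedule(t, steps, q_max):
    """Assumed linear ramp of the inter-replica coupling from 0 to q_max."""
    return q_max * t / max(steps - 1, 1)


def ssqa_step(spins, integ, J, h, q, noise, i0, rng):
    """One synchronous update of all R x N spins; integ is updated in place."""
    R, N = len(spins), len(h)
    new_spins = [row[:] for row in spins]
    for k in range(R):
        neighbour = spins[(k + 1) % R]        # assumed ring of replicas
        for i in range(N):
            local = h[i] + sum(J[i][j] * spins[k][j] for j in range(N))
            total = local + q * neighbour[i] + noise * rng.choice((-1, 1))
            # Saturating accumulator standing in for the tanh nonlinearity.
            integ[k][i] = max(-i0, min(i0 - 1, integ[k][i] + total))
            new_spins[k][i] = 1 if integ[k][i] >= 0 else -1
    return new_spins


def ssqa_anneal(J, h, replicas, steps, q_max, noise, i0, seed=0):
    """Run the sketch and return only the final replica states."""
    rng = random.Random(seed)
    n = len(h)
    spins = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(replicas)]
    integ = [[0] * n for _ in range(replicas)]
    for t in range(steps):
        q = q_schedule(t, steps, q_max)
        spins = ssqa_step(spins, integ, J, h, q, noise, i0, rng)
    return spins


# Hypothetical 2-spin ferromagnet, 4 replicas.
final = ssqa_anneal([[0, 1], [1, 0]], [0, 0], replicas=4,
                    steps=100, q_max=2.0, noise=2, i0=8)
print(final)
```

Early in the run the coupling term is near zero, so each replica explores on its own; later the growing Q pulls replicas towards agreement. Only the final states are returned, mirroring the article's point that no intermediate replica history has to be stored.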

FPGA architecture advances probabilistic computation through efficient memory management

The persistent challenge of solving complex optimisation problems has long demanded computational approaches that move beyond the limits of traditional digital computers. Recent work detailing a new field programmable gate array (FPGA) architecture for stochastic simulated quantum annealing offers a compelling step towards that goal. Instead of seeking ever-faster processors, these researchers have focused on embracing inherent randomness to find good, if not perfect, solutions with minimal energy expenditure.

This isn’t about building a quantum computer, but about borrowing principles from quantum mechanics to improve classical computation. Previous attempts to build hardware around probabilistic bits, the core of this approach, have struggled with scalability and the sheer volume of memory needed to represent complex relationships between variables. A dual-BRAM architecture cleverly manages memory access and minimises signal distribution, overcoming these hurdles.

The ability to handle fully connected networks, where every variable interacts with every other, is now demonstrated on a moderately sized benchmark problem. The reduction in logic resources and energy consumption compared to earlier FPGA designs is a tangible achievement, suggesting a path towards more practical applications. For years, the promise of solving problems like logistical routing, financial modelling, and materials discovery with these methods has remained largely theoretical.

By demonstrating a significant decrease in energy use alongside improved performance, this work moves those applications closer to reality. The current implementation is limited to a specific problem size and FPGA hardware. Questions remain about how well this architecture will perform on even larger, more complex instances and whether the benefits will translate to other optimisation algorithms.

The field needs to explore methods for automatically mapping real-world problems onto this probabilistic hardware. Further research should also investigate the potential of combining this FPGA-based approach with other acceleration techniques, such as near-memory computing. In the end, the true test will be whether this technology can deliver solutions to problems that are genuinely intractable for conventional computers, opening up new possibilities in areas currently beyond our reach.

👉 More information
🗞 Energy-Efficient p-Bit-Based Fully-Connected Quantum-Inspired Simulated Annealer with Dual BRAM Architecture
🧠 ArXiv: https://arxiv.org/abs/2602.16143

Rohail T.


