Efficient Neural Networks: Pruning and Robustness in Optical Computing.

A covariance matrix adaptation evolution strategy with attention-based pruning (CAP) successfully reduces parameters in Mach-Zehnder interferometer (MZI)-based block neural networks by up to 80% with minimal performance loss (under 5%) on image datasets. CAP demonstrates superior robustness against noise compared to existing training algorithms, maintaining 88.5% accuracy with 60% parameter reduction.

The demand for increasingly complex artificial intelligence necessitates novel computational architectures. Optical neural networks (ONNs), utilising the properties of light to perform calculations, offer a potential pathway to overcome the limitations of conventional electronic systems. However, training these large-scale networks efficiently and robustly remains a significant challenge. Researchers at the State Key Laboratory of Information Photonics and Optical Communications, Beijing University of Posts and Telecommunications – Zhiwei Yang, Zeyang Fan, Yihang Lai, Qi Chen, Tian Zhang, Jian Dai, and Kun Xu – address this issue in their article, ‘Efficient training for large-scale optical neural network using an evolutionary strategy and attention pruning’. Their work details a covariance matrix adaptation evolution strategy combined with attention-based pruning (CAP) to reduce the computational cost and improve the resilience of Mach-Zehnder Interferometer (MZI)-based block neural networks, demonstrating substantial parameter reduction with minimal performance loss, even when accounting for fabrication imperfections and noise.

Covariance Adaptation and Pruning Enhance Robustness in Large-Scale Optical Neural Networks

Optical neural networks represent a potential pathway to accelerate artificial intelligence tasks, offering advantages in speed and energy consumption compared to conventional digital architectures. However, scaling these networks to address complex problems presents substantial challenges relating to computational demands and susceptibility to hardware limitations. This work details a covariance matrix adaptation evolution strategy combined with attention-based pruning (CAP) – an algorithm designed to improve robustness and reduce the computational cost of large-scale Mach-Zehnder Interferometer (MZI)-based block neural networks (BONNs). By optimising population individuals directly and strategically pruning matrix blocks, the CAP algorithm achieves significant parameter reduction without compromising performance.

Current training methodologies often struggle with the complexities of large BONNs, exhibiting limited robustness and requiring considerable computational resources. These networks contain a large number of trainable parameters, leading to high computational cost and increased power consumption. The CAP algorithm addresses these limitations by integrating covariance matrix adaptation evolution with attention-based pruning, enabling efficient navigation of the parameter space and identification of critical network connections. This approach reduces computational burden and enhances resilience to noise and hardware variations.

The core of the CAP algorithm lies in its intelligent pruning of matrix blocks, removing redundant or less significant connections. This process is guided by an attention mechanism that identifies the most important blocks, ensuring minimal performance degradation. By selectively removing the least important blocks, the algorithm reduces the number of trainable parameters, resulting in a more compact and efficient network. The algorithm then employs a covariance matrix adaptation evolution strategy to fine-tune the remaining parameters, optimising performance and enhancing robustness. Covariance matrix adaptation evolution is a gradient-free optimisation method that samples candidate solutions from a multivariate normal distribution and adapts that distribution's covariance matrix from generation to generation, allowing the search to capture dependencies between parameters and explore the parameter space efficiently.
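
A minimal sketch of these two stages is shown below, using NumPy and toy shapes. The block structure, the attention scoring, and the fitness function are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the two CAP stages: attention-guided block pruning
# followed by a simplified covariance-adapting evolution strategy.
import numpy as np

rng = np.random.default_rng(0)

def attention_prune(blocks, attention_scores, keep_ratio=0.4):
    """Keep only the matrix blocks with the highest attention scores.

    blocks: list of weight blocks (e.g. phase settings of one MZI block)
    attention_scores: one importance score per block (hypothetical scoring)
    """
    n_keep = max(1, int(len(blocks) * keep_ratio))
    keep_idx = np.argsort(attention_scores)[-n_keep:]
    return [blocks[i] for i in keep_idx], keep_idx

def es_finetune(theta, fitness, iters=50, pop=16, sigma=0.1):
    """Simplified CMA-ES-style loop: sample a population around the current
    mean, rank by fitness, then update the mean and the covariance of the
    search distribution from the best individuals."""
    dim = theta.size
    C = np.eye(dim)                                   # search-distribution covariance
    for _ in range(iters):
        samples = rng.multivariate_normal(theta, sigma**2 * C, size=pop)
        scores = np.array([fitness(s) for s in samples])
        elite = samples[np.argsort(scores)[-pop // 4:]]       # top quarter
        steps = (elite - theta) / sigma
        theta = elite.mean(axis=0)                            # move the mean
        C = 0.8 * C + 0.2 * (steps.T @ steps) / len(elite)    # adapt covariance
    return theta

# Toy usage: prune 10 random blocks down to 40%, then fine-tune the survivors.
blocks = [rng.normal(size=4) for _ in range(10)]
scores = [np.abs(b).sum() for b in blocks]          # stand-in attention score
kept, idx = attention_prune(blocks, scores, keep_ratio=0.4)
theta0 = np.concatenate(kept)
theta = es_finetune(theta0, fitness=lambda t: -np.sum((t - 1.0) ** 2))
```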

To rigorously evaluate the performance of the CAP algorithm, extensive experiments were conducted on two benchmark datasets: MNIST and Fashion-MNIST. The algorithm achieves a substantial reduction in the number of parameters, pruning 60% on MNIST and 80% on Fashion-MNIST while accuracy decreases by only 3.289% and 4.693%, respectively.

A critical consideration for any practical neural network implementation is its robustness to hardware imperfections and noise. To assess this, experiments were conducted simulating a poorly fabricated chip with dynamic noise in the phase shifters, the components that control the interference of light within each MZI. The CAP algorithm degrades less than previously reported approaches: its accuracy drops by 22.327% and 24.019% on MNIST and Fashion-MNIST, respectively, compared with 43.963% and 41.074% for block adjoint training and 25.757% and 32.871% for the standard covariance matrix adaptation evolution strategy.
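
The sketch below illustrates this kind of robustness test in NumPy: Gaussian phase noise is injected into a toy stand-in for the phase-shifter settings and the resulting accuracy is averaged over several trials. The network model, noise level, and data are assumptions for illustration, not the paper's exact setup.

```python
# Toy robustness test: perturb phase settings with Gaussian noise and
# measure how classification accuracy changes.
import numpy as np

rng = np.random.default_rng(1)

def toy_forward(phases, x):
    W = np.cos(phases)               # hypothetical mapping from phases to weights
    return x @ W

def accuracy(phases, X, y):
    logits = toy_forward(phases, X)
    return float(np.mean(np.argmax(logits, axis=1) == y))

def accuracy_under_noise(phases, X, y, sigma_phase=0.05, trials=20):
    """Average accuracy when every phase is perturbed by N(0, sigma_phase^2),
    mimicking fabrication error plus dynamic phase-shifter noise."""
    accs = [accuracy(phases + rng.normal(0.0, sigma_phase, phases.shape), X, y)
            for _ in range(trials)]
    return float(np.mean(accs))

# Toy data: 100 samples of dimension 8, 4 classes, random 8x4 phase matrix.
X = rng.normal(size=(100, 8))
y = rng.integers(0, 4, size=100)
phases = rng.uniform(0, 2 * np.pi, size=(8, 4))
print(f"clean {accuracy(phases, X, y):.3f}, "
      f"noisy {accuracy_under_noise(phases, X, y, sigma_phase=0.1):.3f}")
```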

To further validate the performance of the CAP algorithm, experimental validation was conducted on a pruned network achieving an accuracy of 88.5%, closely mirroring the 92.1% accuracy obtained from simulations without noise. This close agreement between experimental and simulation results provides strong evidence for the effectiveness of the CAP algorithm and its potential for real-world implementation.

Beyond the algorithm itself, the impact of network architecture on overall performance was investigated. The results demonstrate that constructing BONNs using MZIs with only internal phase shifters effectively reduces both system area and the number of trainable parameters. This architectural optimisation, combined with the CAP algorithm, offers a powerful approach to building compact and efficient optical neural networks.
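
For reference, the sketch below builds the common textbook 2x2 transfer matrix of an MZI with a single internal phase shifter: two 50:50 couplers around a phase shift on one arm. This is a standard convention and may differ in detail from the device model used in the paper.

```python
# Textbook model of an MZI with only an internal phase shifter.
import numpy as np

def mzi_internal_only(theta):
    """2x2 transfer matrix: 50:50 coupler -> phase shift theta on the upper
    arm -> 50:50 coupler (common convention; the paper's may differ)."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)   # 50:50 directional coupler
    ps = np.diag([np.exp(1j * theta), 1.0])          # internal phase shifter
    return bs @ ps @ bs

T = mzi_internal_only(np.pi / 3)
print(np.allclose(T.conj().T @ T, np.eye(2)))        # unitary check: True
```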

The principles underlying the CAP algorithm – combining pruning with adaptive optimisation – are broadly applicable to a wide range of machine learning problems. By selectively removing redundant connections and fine-tuning the remaining parameters, more efficient and robust models can be built, requiring fewer computational resources and exhibiting greater resilience to noise.

In conclusion, this work presents a novel approach to building efficient and robust optical neural networks, introducing the CAP algorithm and demonstrating that it reduces computational cost while withstanding hardware imperfections. The CAP algorithm significantly reduces the number of parameters without sacrificing accuracy, making it a promising route to practical optical neural network implementations, and, as noted above, its underlying principles extend naturally to other machine learning models. This work paves the way for next-generation artificial intelligence systems that are faster, more energy-efficient, and more resilient to noise and hardware variations.

👉 More information
🗞 Efficient training for large-scale optical neural network using an evolutionary strategy and attention pruning
🧠 DOI: https://doi.org/10.48550/arXiv.2505.12906

Quantum News

As the Official Quantum Dog (or hound), my role is to dig out the latest nuggets of quantum goodness. There is so much happening right now in the field of technology, whether AI or the march of robots. But Quantum occupies a special space. Quite literally a special space. A Hilbert space in fact, haha! Here I try to provide some of the news that might be considered breaking news in the Quantum Computing space.
