Optimal Spiking Brain Compression (OSBC) efficiently compresses spiking neural networks in a single pass by minimising loss on the neuron membrane potential. Experiments on three datasets (N-MNIST, CIFAR10-DVS, and DVS128-Gesture) demonstrate that OSBC achieves 97% sparsity with accuracy losses of 1.41%, 10.20%, and 1.74% respectively, or 4-bit quantization with losses of 0.17%, 1.54%, and 7.71%.
The pursuit of energy-efficient artificial intelligence is driving research into novel neural network architectures. Spiking Neural Networks (SNNs), which more closely mimic biological neurons, offer potential advantages in power consumption, particularly when deployed on dedicated hardware. However, the computational demands of these networks necessitate techniques to reduce their size and complexity without significant performance degradation. Researchers at the University of Bristol – Lianfeng Shi, Ao Li, and Benjamin Ward-Cherrier – detail a new method for compressing SNNs in a single pass, avoiding the iterative training cycles of existing approaches. Their work, entitled ‘Optimal Spiking Brain Compression: Improving One-Shot Post-Training Pruning and Quantization for Spiking Neural Networks’, presents a framework that minimises loss in neuron membrane potential, achieving high sparsity through pruning and enabling low-bit quantization with minimal accuracy loss across multiple datasets.
Efficient Compression of Spiking Neural Networks with Optimal Spiking Brain Compression
Spiking Neural Networks (SNNs) offer a potentially energy-efficient paradigm for neuromorphic computing, but their computational demands often limit practical implementation. Current methods for reducing SNN complexity typically employ iterative pruning and quantization, requiring substantial computational resources, particularly for large, pre-trained networks. This work introduces Optimal Spiking Brain Compression (OSBC), a novel, one-shot post-training framework designed to address these limitations and facilitate SNN deployment on resource-constrained platforms. OSBC achieves significant compression in a single pass, streamlining the process and reducing computational overhead compared to existing iterative methods.
OSBC adapts the Optimal Brain Compression (OBC) method, originally developed for Artificial Neural Networks (ANNs), to the specific characteristics of SNNs, drawing on insights from both fields. Pruning reduces network size by systematically removing redundant connections, while quantization reduces the precision of remaining weights and activations, minimising memory footprint and computational cost.
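For readers unfamiliar with OBC, it compresses a layer by greedily removing (or quantizing) one weight at a time while applying an exact compensation update to the remaining weights, using the inverse Hessian of a layer-wise squared-error loss estimated from a small calibration sample. Below is a minimal NumPy sketch of the single-weight pruning step in the Optimal Brain Surgeon style that OBC builds on; the function name, shapes, and damping term are illustrative, not the authors' code.

```python
import numpy as np

def obc_prune_one(W, H_inv):
    """Per row, remove the single weight whose removal minimises the
    layer-wise squared reconstruction error, then apply the OBS
    compensation update to the remaining weights.
    W: (rows, cols) weight matrix; H_inv: (cols, cols) inverse Hessian
    (H = X X^T, estimated from a small calibration sample X)."""
    diag = np.diag(H_inv)                 # [H^-1]_qq for each column
    scores = W**2 / diag                  # saliency of removing each weight
    q = np.argmin(scores, axis=1)         # cheapest weight in each row
    for r, c in enumerate(q):
        # OBS update: delta = -(w_rc / [H^-1]_cc) * H^-1[:, c]
        W[r] -= (W[r, c] / H_inv[c, c]) * H_inv[:, c]
        W[r, c] = 0.0                     # exact zero after compensation
    return W

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 64))          # small calibration sample
H_inv = np.linalg.inv(X @ X.T + 1e-3 * np.eye(4))   # damped for stability
W_pruned = obc_prune_one(rng.standard_normal((3, 4)), H_inv)
```

Iterating this step (with efficient inverse-Hessian updates) until the target sparsity is reached is the core of the OBC family; OSBC's contribution is where the reconstruction loss is measured, as described next.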
The key innovation within OSBC lies in the loss function used during compression. Unlike traditional methods that minimise loss at the neuron’s input current, OSBC minimises loss on the spiking neuron’s membrane potential – the internal voltage that determines when a neuron fires a signal. This targeted approach appears more effective at preserving performance in SNNs, as the membrane potential directly influences the network’s spiking behaviour and information processing capabilities.
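To make the distinction concrete: in a leaky integrate-and-fire (LIF) neuron, the membrane potential accumulates the input current with a leak and emits a spike when it crosses a threshold. OSBC measures reconstruction error on this potential rather than on the current feeding it. A simple LIF simulation sketch follows; the parameter values and the soft-reset variant are illustrative assumptions.

```python
import numpy as np

def lif_membrane(inputs, beta=0.9, threshold=1.0):
    """Simulate a leaky integrate-and-fire neuron over T timesteps.
    inputs: (T,) input current per step; beta: membrane leak factor.
    Returns the membrane-potential trace and the binary spike train."""
    v, trace, spikes = 0.0, [], []
    for i in inputs:
        v = beta * v + i               # leaky integration of input current
        spike = float(v >= threshold)  # fire when potential crosses threshold
        trace.append(v)                # OSBC's loss targets this quantity
        spikes.append(spike)
        v -= spike * threshold         # soft reset after firing
    return np.array(trace), np.array(spikes)

trace, spikes = lif_membrane(np.full(20, 0.5))
```

Because spiking is a hard threshold on the potential, small errors in the input current can flip spikes on or off downstream; matching the potential itself is a tighter proxy for preserving the network's spiking behaviour.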
Furthermore, OSBC requires only a small sample dataset for compression, reducing computational burden and making it practical for resource-constrained environments. This contrasts with many existing methods that require large datasets for effective compression.
Experiments were conducted on three benchmark datasets – N-MNIST, CIFAR10-DVS, and DVS128-Gesture – to demonstrate OSBC’s effectiveness. N-MNIST, a neuromorphic version of the MNIST handwritten digit dataset, serves as a foundational testbed for evaluating SNN algorithms. CIFAR10-DVS, a neuromorphic version of the CIFAR10 image dataset, presents a more challenging task with increased complexity and dimensionality. DVS128-Gesture, a dataset of hand gestures captured by a Dynamic Vision Sensor, provides a realistic test case for evaluating SNNs in event-based vision applications.
The framework achieves 97% sparsity – meaning 97% of the network’s connections are removed – across all three datasets, demonstrating its ability to significantly reduce network size without substantial performance loss. This high level of sparsity translates directly into reduced memory footprint and computational complexity.
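A back-of-the-envelope sketch of what 97% sparsity buys in storage, assuming a simple COO-style layout with one 32-bit index per surviving value (an illustrative assumption, not the paper's storage format):

```python
import numpy as np

def sparse_storage_bytes(w, value_bits=32, index_bits=32):
    """Rough storage cost of a sparse tensor: each surviving (non-zero)
    weight needs its value plus one index."""
    nnz = int(np.count_nonzero(w))
    return nnz * (value_bits + index_bits) / 8

rng = np.random.default_rng(0)
w = rng.standard_normal(10_000)
w[np.abs(w) <= np.quantile(np.abs(w), 0.97)] = 0   # zero out ~97% of weights
dense_bytes = w.size * 32 / 8                      # dense fp32 baseline
```

With 97% of weights removed, this layout stores only 3% of the values plus their indices, on the order of a 16x reduction even before low-bit quantization shrinks the stored values further.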
Across the three datasets, 4-bit symmetric quantization resulted in accuracy losses of 0.17% on N-MNIST, 1.54% on CIFAR10-DVS, and 7.71% on DVS128-Gesture, demonstrating that the OSBC approach largely preserves performance even under aggressive compression.
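Symmetric uniform quantization maps weights onto a small signed-integer grid governed by a single scale factor; at 4 bits a symmetric grid spans {-7, ..., 7}. A minimal per-tensor sketch (the per-tensor scaling and clipping choices here are assumptions; the paper's exact scheme may differ):

```python
import numpy as np

def quantize_symmetric(w, bits=4):
    """Symmetric uniform quantization: map weights to signed integers in
    [-(2^(bits-1)-1), 2^(bits-1)-1] with one per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit
    scale = np.abs(w).max() / qmax             # largest weight maps to +/-qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q.astype(np.int8), scale            # dequantize with q * scale

q, scale = quantize_symmetric(np.array([-1.4, 0.35, 0.7, 1.4]))
```

Rounding weights to this grid naively is exactly what an OBC-style method improves upon: each weight is quantized in turn while the remaining weights are adjusted to compensate for the rounding error.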
These results demonstrate that OSBC provides a compelling solution for compressing SNNs, enabling their deployment on resource-constrained platforms without significant performance loss. The combination of high sparsity and low-bit quantization results in a substantial reduction in both model size and computational complexity. This makes OSBC a valuable tool for researchers and engineers working on SNN applications in areas such as robotics, computer vision, and edge computing.
Future work will explore combining OSBC with other compression techniques, such as weight sharing and knowledge distillation, to further improve compression rates and performance, and will investigate how different quantization schemes and pruning strategies affect SNN accuracy. Applying OSBC to a wider range of SNN architectures and datasets would demonstrate its versatility, while automated tools and frameworks for applying the method would ease adoption by the broader research community. Continued development and refinement of OSBC will contribute to the advancement of neuromorphic computing and the realisation of its full potential.
👉 More information
🗞 Optimal Spiking Brain Compression: Improving One-Shot Post-Training Pruning and Quantization for Spiking Neural Networks
🧠 DOI: https://doi.org/10.48550/arXiv.2506.03996
