Robotic rehabilitation holds great promise for helping patients recover from stroke, but current systems often lack the adaptability needed to respond to individual progress and to operate effectively outside clinical settings. To address these limitations, Phani Pavan Kambhampati, Chainesh Gautam, Jagan Palaniswamy, and Madhav Rao from the International Institute of Information Technology Bangalore present NeuRehab, a framework that automates rehabilitation exercises. The system combines reinforcement learning with ultra-low power spiking neural networks and distributes computation between mobile and stationary hardware to optimise both performance and energy efficiency. NeuRehab delivers exercise performance comparable to existing approaches while reducing both power consumption and control latency by more than 60%, paving the way for more effective and accessible rehabilitation therapies.
Spiking Networks Enable Efficient Robotic Control
The document presents research focused on applying spiking neural networks (SNNs) to reinforcement learning for robotic control and rehabilitation applications. The overarching goal is to develop energy-efficient, low-latency control systems that can potentially be deployed on neuromorphic hardware. The work investigates multiple strategies for training SNNs, converting deep reinforcement learning (DRL) policies into spiking models, and optimizing SNN architectures to achieve reliable performance in real-time robotic scenarios, particularly in healthcare and rehabilitation settings.
A primary application highlighted in the document is robotic control for rehabilitation, especially for assisting stroke patients. The proposed systems aim to support both upper- and lower-limb movement training using robotic arms. SNNs are explored as a means to process electromyography (EMG) signals, enabling the decoding of a patient’s intended movements and translating them into robotic actions. Emphasis is placed on adaptability and personalization, allowing the system to adjust to individual patient needs and rehabilitation progress.
Training SNNs using reinforcement learning is identified as a major challenge due to the non-differentiable nature of spike events, which complicates traditional gradient-based optimization. To address this, the research explores several approaches, including surrogate gradient learning, where differentiable approximations of spike functions are used, as well as direct training methods that avoid conversion from artificial neural networks (ANNs). Evolutionary algorithms are also considered as an alternative training strategy, and temporal coding schemes are investigated to encode information in spike timing rather than firing rates.
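To make the surrogate-gradient idea concrete, here is a minimal single-neuron sketch in plain Python. It is not code from the paper: the fast-sigmoid surrogate, the function names, and the toy training loop are all illustrative assumptions. The forward pass uses the hard threshold, while the backward pass swaps in a smooth stand-in for its derivative so that a weight can still be updated.

```python
def spike(v, threshold=1.0):
    """Forward pass: hard threshold (Heaviside step), not differentiable."""
    return 1.0 if v >= threshold else 0.0


def surrogate_grad(v, threshold=1.0, slope=5.0):
    """Backward pass: derivative of a fast sigmoid, used in place of the
    true derivative of the step, which is zero almost everywhere."""
    return 1.0 / (1.0 + slope * abs(v - threshold)) ** 2


def train_weight(x=1.0, target=1.0, w=0.2, lr=0.5, steps=50):
    """Toy example: learn a weight so that input x makes the neuron fire."""
    for _ in range(steps):
        v = w * x                      # membrane potential
        s = spike(v)                   # emitted spike (0 or 1)
        # Loss = 0.5 * (s - target)^2; ds/dv is replaced by the surrogate.
        grad = (s - target) * surrogate_grad(v) * x
        w -= lr * grad
    return w


if __name__ == "__main__":
    w = train_weight()
    print("learned weight:", w, "fires:", spike(w * 1.0))
```

With the true derivative the gradient would be zero at every step and the weight would never move; the surrogate gives a small but non-zero signal that grows as the membrane potential approaches the threshold.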
Another significant focus of the research is the conversion of existing DRL policies into SNNs. This approach leverages the maturity and robustness of established DRL algorithms by first training policies using conventional ANNs and then translating them into spiking equivalents. Various conversion techniques are examined, such as direct weight mapping, rate coding, temporal coding, and normalization and scaling methods to ensure stable and effective SNN behavior after conversion.
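The rate-coding route can be illustrated with a small sketch, assuming a non-leaky integrate-and-fire neuron with reset by subtraction (a common choice in the conversion literature, not necessarily the one used in this work). Driven by a constant input, its firing rate approximates a ReLU activation clipped to the range 0 to 1, which is why activations are usually normalised before conversion. The function names and the simple max-based normalisation are assumptions for illustration only.

```python
def relu(z):
    return max(0.0, z)


def if_rate(z, timesteps=100, threshold=1.0):
    """Firing rate of an integrate-and-fire neuron driven by constant input z."""
    v = 0.0
    spikes = 0
    for _ in range(timesteps):
        v += z
        if v >= threshold:
            v -= threshold  # reset by subtraction keeps the residual charge
            spikes += 1
    return spikes / timesteps


def normalise(activations):
    """Scale activations so the largest maps to 1.0, avoiding rate saturation."""
    peak = max(activations)
    return [a / peak for a in activations] if peak > 0 else list(activations)


if __name__ == "__main__":
    for z in (-0.5, 0.25, 2.0):
        print(z, relu(z), if_rate(z))
```

An input of 2.0 saturates at one spike per time step, so the spiking neuron reports a rate of 1.0 where the ReLU would output 2.0. Scaling by the layer's maximum activation, as in `normalise`, is one simple way to keep rates inside the representable range.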
The document also discusses optimization techniques and architectural innovations for SNNs aimed at improving efficiency and responsiveness. These include SLAYER (Spike Layer Error Reassignment), which assigns error credit across spiking layers during training, temporal pruning to reduce unnecessary spikes and latency, and tuning of neuron leakage and firing thresholds. Early-exit network designs are explored so that predictions can be made before all layers are processed, and pre-charging membrane potentials is proposed as a way to further reduce response time.
Energy efficiency and low latency are central motivations throughout the research. SNNs are highlighted for their potential to significantly reduce power consumption, particularly when implemented on neuromorphic hardware, while still meeting the real-time requirements of robotic control systems. The document references tools and frameworks such as Gymnasium, CleanRL, PyTorch, SpyTorch, and SEENN to support experimentation and implementation.
Overall, the research demonstrates that SNNs can be effectively applied to robotic control and rehabilitation tasks, and that converting DRL policies into spiking models is a viable, though non-trivial, approach. While promising results are reported in terms of efficiency and latency, the document also acknowledges limitations, including the difficulty of training SNNs, the need for more scalable and robust training methods, and the necessity of further evaluation on real-world robotic systems. Future work is directed toward improving SNN architectures, exploring alternative coding schemes, and validating performance in practical deployment scenarios.
NeuRehab Framework For Wheelchair Robotic Rehabilitation
The research team engineered NeuRehab, a novel framework for robotic rehabilitation that separates computationally intensive training from low-power, on-device inference, specifically for wheelchair-based exoskeletal systems. This system builds upon the existing XoRehab platform and introduces a dedicated hardware and software architecture, enabling efficient learning and deployment of control policies. A key innovation lies in the division of labour, with a docking station housing powerful GPUs for fine-tuning behavioural models using data gathered from the wheelchair’s edge device, while the wheelchair itself executes the learned policies using a neuromorphic chip.
To facilitate safe development and thorough analysis, the scientists designed two custom simulation environments, the Kinematic Environment (KENV) and the Dynamic Environment (DENV), both adhering to the Gymnasium API. KENV models the discrete motion profiles of stepper motors and their associated delays, while DENV simulates torque-driven pendulum-like physics, incorporating patient interaction.
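As a rough illustration of what a torque-driven, pendulum-like environment with a Gymnasium-style interface might look like, consider the toy sketch below. It is not the paper's DENV: the class name, every physical constant, the patient model, the strain proxy, and the reward weights are assumptions chosen for readability. To stay self-contained it only mirrors the `reset` and `step` signatures; a real implementation would subclass `gymnasium.Env` and declare `observation_space` and `action_space`.

```python
import math


class ToyShoulderEnv:
    """Toy torque-driven joint with Gymnasium-style reset/step signatures.
    All constants are illustrative and not taken from the paper."""

    def __init__(self, target=math.pi / 2, dt=0.02, max_steps=500):
        self.target, self.dt, self.max_steps = target, dt, max_steps
        self.inertia, self.damping, self.gravity = 0.5, 0.8, 4.0
        self.max_torque = 10.0

    def _patient_torque(self):
        # Hypothetical patient model: a weak pull towards the target angle.
        return 1.0 * (self.target - self.angle)

    def _obs(self):
        # Joint angle, velocity, patient torque, strain proxy.
        return (self.angle, self.velocity, self._patient_torque(), self.strain)

    def reset(self, seed=None, options=None):
        self.angle = self.velocity = self.strain = self.prev_torque = 0.0
        self.steps = 0
        return self._obs(), {}

    def step(self, action):
        torque = max(-self.max_torque, min(self.max_torque, float(action)))
        patient = self._patient_torque()
        # Pendulum dynamics, integrated with semi-implicit Euler.
        accel = (torque + patient - self.gravity * math.sin(self.angle)
                 - self.damping * self.velocity) / self.inertia
        self.velocity += accel * self.dt
        self.angle += self.velocity * self.dt
        # Strain proxy: mismatch between device torque and patient torque.
        self.strain = abs(torque - patient)
        reward = -(abs(self.target - self.angle)           # tracking error
                   + 0.01 * torque ** 2                    # excessive force
                   + 0.1 * abs(torque - self.prev_torque)  # abrupt changes
                   + 0.05 * self.strain)                   # misalignment
        self.prev_torque = torque
        self.steps += 1
        terminated = (abs(self.target - self.angle) < 0.02
                      and abs(self.velocity) < 0.05)
        truncated = self.steps >= self.max_steps
        return self._obs(), reward, terminated, truncated, {}


if __name__ == "__main__":
    env = ToyShoulderEnv()
    obs, info = env.reset()
    obs, reward, terminated, truncated, info = env.step(5.0)
    print(obs, reward)
```

Keeping to the five-value `step` return (observation, reward, terminated, truncated, info) is what lets off-the-shelf reinforcement learning code run against such an environment without modification.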
Both environments provide clinically relevant observations, including joint angles, velocities, patient torque, and strain, and employ reward terms that penalise excessive force, abrupt movements, and misalignment between the patient and the device.
The team implemented Soft Actor-Critic (SAC) as the base reinforcement learning method, introducing a heterogeneous training scheme called Hybrid-SAC (HSAC). In HSAC, the policy, or actor, is implemented as a spiking neural network (SNN), while the critic remains an artificial neural network (ANN). This keeps value estimation at high precision for learning stability and aligns the actor architecture with the neuromorphic hardware. Furthermore, the scientists developed two inference-time optimisations for spiking control policies: Spiking Post-Training Temporal Quantisation (SPTTQ), which treats the number of spiking time steps as a post-training quantisation parameter, and Sequent Leaky (SLeaky) neurons, which retain membrane potential across reinforcement learning steps, reducing charge-up overhead. Combined, these innovations achieve over 60% savings in both power and latency during inference compared to standard implementations, while maintaining comparable performance.
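The intuition behind carrying membrane potential across control steps, and behind treating the number of time steps as a knob chosen after training, can be shown with a single leaky neuron. The sketch below is a simplified illustration under stated assumptions, not the paper's SLeaky or SPTTQ implementation: the `LeakyNeuron` class, its `carry_state` flag, the decay factor, and the input current are all hypothetical.

```python
class LeakyNeuron:
    """Leaky integrate-and-fire neuron with reset by subtraction."""

    def __init__(self, beta=0.9, threshold=1.0, carry_state=False):
        self.beta = beta                # membrane decay per time step
        self.threshold = threshold
        self.carry_state = carry_state  # keep potential between control steps?
        self.v = 0.0

    def rl_step(self, current, timesteps):
        """Simulate one control step made of `timesteps` spiking time steps."""
        if not self.carry_state:
            self.v = 0.0                # conventional: start every step empty
        spikes = 0
        for _ in range(timesteps):
            self.v = self.beta * self.v + current
            if self.v >= self.threshold:
                self.v -= self.threshold
                spikes += 1
        return spikes


def total_spikes(neuron, current=0.4, rl_steps=3, timesteps=2):
    """Output spikes accumulated over several consecutive control steps."""
    return sum(neuron.rl_step(current, timesteps) for _ in range(rl_steps))


if __name__ == "__main__":
    print("reset, T=2:  ", total_spikes(LeakyNeuron(carry_state=False)))
    print("carried, T=2:", total_spikes(LeakyNeuron(carry_state=True)))
    print("reset, T=3:  ", total_spikes(LeakyNeuron(carry_state=False),
                                        timesteps=3))
```

With two time steps per control step, the neuron that is reset each time never charges up enough to fire, whereas the one that keeps its potential does. The reset neuron needs a third time step per control step to produce any output, which is the kind of charge-up overhead that shorter time windows with retained state aim to avoid.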
NeuRehab Framework Boosts Robotic Stroke Recovery
Recent research delivers a novel robotic rehabilitation framework, NeuRehab, designed to enhance recovery for post-stroke patients. This work addresses limitations in existing modular exercise systems by introducing an end-to-end framework integrating artificial intelligence and co-designed control systems. The system optimises both performance and power consumption, enabling mobile operation without compromising functionality. NeuRehab consists of two key partitions: a wheelchair-based edge device and a stationary docking station, allowing for efficient resource allocation and learning.
The framework achieves over 60% savings in both power and latency during inference compared to standard implementations, while maintaining comparable performance.
This is accomplished through a split machine learning process, dividing computational tasks between the wheelchair and docking station. The stationary dock, equipped with powerful hardware, handles computationally intensive tasks like model fine-tuning, while the wheelchair utilises ultra-low power spiking networks for real-time control. Task-specific temporal optimisations further reduce edge-inference control latency, ensuring responsive and efficient operation.
Experiments validate the framework on a shoulder exercise using the XoRehab platform, an IoT-enabled rehabilitation system. The shoulder joint operates using a stepper motor with a 1:15 gearbox, providing position-hold functionality and requiring no power to maintain a specific state.
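For a sense of scale, the gearbox ratio translates into joint resolution as follows. The calculation assumes a typical 1.8° full-step motor (200 steps per revolution), a figure not given in the article, and reads the 1:15 ratio as 15 motor revolutions per joint revolution. The function names are illustrative.

```python
def joint_resolution_deg(gear_ratio=15, motor_steps_per_rev=200):
    """Joint rotation in degrees produced by one full motor step."""
    return 360 / (gear_ratio * motor_steps_per_rev)


def joint_steps(joint_angle_deg, gear_ratio=15, motor_steps_per_rev=200):
    """Full motor steps needed to rotate the joint by joint_angle_deg."""
    steps_per_joint_rev = gear_ratio * motor_steps_per_rev
    return round(joint_angle_deg / 360 * steps_per_joint_rev)


if __name__ == "__main__":
    print("degrees per step:", joint_resolution_deg())
    print("steps for 90 degrees of flexion:", joint_steps(90))
```

Under these assumptions a 90° flexion corresponds to 750 full steps, or 0.12° of joint rotation per step, before any microstepping.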
The system incorporates force sensors to provide feedback on the patient’s ability to perform the exercise, replacing reliance on potentially noisy bio-signals like electromyograms. The research demonstrates a reinforcement learning architecture capable of adapting to patient interaction, modulating speed, and avoiding excessive force, ultimately aiding in flexion and extension exercises. This innovative approach enables the system to tune its response to individual patient needs, offering a promising advancement in robotic rehabilitation technology.
Spiking Networks Enable Resource-Aware Rehabilitation
NeuRehab represents a significant advance in robotic rehabilitation, delivering an end-to-end framework that integrates reinforcement learning and spiking neural networks for autonomous, resource-aware exoskeletons. The team developed a system grounded in the XoRehab platform, alongside novel simulation environments designed to accurately model shoulder joint movement and clinically relevant constraints like minimising strain and providing assistance only when needed. A key algorithmic innovation is Hybrid-SAC, which combines a spiking neural network actor with an artificial neural network critic, achieving a balance between precision, learning efficiency, and compatibility with neuromorphic hardware.
Evaluations across standard benchmarks and custom simulation environments demonstrate that Hybrid-SAC reliably matches the performance of conventional artificial neural network approaches, while exceeding the capabilities of fully spiking network systems. Further algorithmic contributions, including Spiking Post-Training Temporal Quantisation and the Sequent Leaky neuron, reduce spike counts by up to 63% and improve latency by over 60% during control tasks, all while maintaining reward levels comparable to those achieved with conventional systems. The researchers acknowledge that transfer learning is currently necessary when adapting the system to new patient profiles, and suggest that incorporating a baseline control profile could further enhance adaptability. Future work will focus on extending the framework to additional joints within the XoRehab system and broadening the range of human torque profiles used for training. The adaptability of the underlying software also suggests potential applications in other time- and power-constrained environments, such as industrial machinery.
👉 More information
🗞 NeuRehab: A Reinforcement Learning and Spiking Neural Network-Based Rehab Automation Framework
🧠 ArXiv: https://arxiv.org/abs/2512.17841
