Scientists are increasingly focused on developing more robust, model-free control systems for the complex task of spacecraft rendezvous and docking. Shibo Shao, Dong Zhou, and Guanghui Sun, from the Department of Control Science and Engineering at Harbin Institute of Technology, alongside Liwen Zhang and Minxuan Jiang, present a novel Imitation Learning-based framework, IL-SRD, that learns directly from expert demonstrations, minimising reliance on precise dynamic modelling. Their research, detailed in this paper, introduces an ‘anchored decoder target’ to ensure physically plausible control and incorporates temporal aggregation to improve stability over extended prediction horizons. By demonstrating accurate, energy-efficient and robust performance in simulations, even under significant disturbances, this work represents a significant step towards more reliable autonomous spacecraft operations.
Anchored Decoder Target for Spacecraft Docking improves precision
Scientists have developed a novel Imitation Learning-based spacecraft rendezvous and docking control framework (IL-SRD) that significantly reduces reliance on precise spacecraft modelling. Existing model-dependent methods often lack robustness in realistic on-orbit environments; IL-SRD addresses this limitation by learning directly from expert demonstrations. The research team achieved this by implementing an ‘anchored decoder target’, a mechanism that conditions control generation on state-related anchors, enforcing physically consistent control evolution and suppressing implausible action deviations during sequential prediction. This innovation enables reliable six-degree-of-freedom (6-DOF) rendezvous and docking control, a critical capability for numerous space missions.
The study unveils a new approach to long-horizon control problems, employing a Transformer-based architecture to model temporal dependencies in control sequence generation. Researchers incorporated a temporal aggregation mechanism to mitigate error accumulation, a common issue in sequential prediction where small inaccuracies can amplify over time. This design ensures smoother and more stable control execution throughout the entire rendezvous and docking process, enhancing long-term stability and preventing potentially catastrophic deviations. Extensive simulation results demonstrate that the IL-SRD framework achieves accurate and energy-efficient model-free rendezvous and docking control, outperforming classical controllers and existing Deep Reinforcement Learning approaches.
Experiments show the proposed method maintains competitive performance even under significant unknown disturbances, highlighting its robustness. The team’s innovation lies in directly learning a near-optimal 6-DOF control policy from expert demonstrations, bypassing the need for complex and often inaccurate dynamic models. By explicitly constraining predicted action sequences with state-related anchors, the anchored decoder target ensures physical consistency and actuator feasibility, preventing unsafe or irrational control commands. This work opens new avenues for autonomous spacecraft operations, potentially reducing operational costs and improving mission success rates for tasks like space station assembly, on-orbit servicing, and space debris removal.
Furthermore, the research establishes a foundation for more resilient and adaptable spacecraft control systems, capable of operating effectively in unpredictable space environments. The source code for the IL-SRD framework is publicly available, facilitating further research and development in this critical area of space technology. This advancement represents a significant step towards fully autonomous spacecraft rendezvous and docking, paving the way for more complex and ambitious space missions in the future.
Anchored Decoder for Spacecraft Rendezvous Control
Scientists developed an Imitation Learning-based spacecraft rendezvous and docking (IL-SRD) framework to overcome limitations inherent in model-dependent control methods. This research pioneers a control system that directly learns policies from expert demonstrations, minimising reliance on precise spacecraft modelling and enhancing robustness in realistic on-orbit environments. The team engineered a novel anchored decoder target mechanism, conditioning decoder queries on state-related anchors to explicitly constrain the control generation process and enforce physically consistent control evolution. This innovative approach effectively suppresses implausible action deviations during sequential prediction, enabling reliable six-degree-of-freedom (6-DOF) rendezvous and docking control.
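As a rough illustration of what "conditioning decoder queries on state-related anchors" could look like, the sketch below adds an embedding of an anchor state to a set of per-step positional queries. This is an assumption-laden toy, not the authors' implementation: the function name `anchored_queries`, the projection matrix `w_anchor`, the 13-dimensional state layout, and the choice of simple addition are all hypothetical.

```python
import numpy as np

def anchored_queries(anchor_state, pos_embed, w_anchor):
    """Build decoder queries conditioned on an anchor state (hypothetical sketch).

    anchor_state: (d_state,) state-related anchor, e.g. the current relative state
    pos_embed:    (horizon, d_model) one positional query per predicted step
    w_anchor:     (d_state, d_model) projection of the anchor into query space
    """
    anchor_state = np.asarray(anchor_state, dtype=float)
    anchor_embed = anchor_state @ w_anchor   # shape (d_model,)
    # Every query in the horizon is shifted by the same anchor embedding,
    # so each predicted step is generated relative to the anchor.
    return pos_embed + anchor_embed

rng = np.random.default_rng(0)
# Assumed state layout: position (3), velocity (3), quaternion (4), angular rate (3)
state = rng.normal(size=13)
pos = rng.normal(size=(20, 32))          # assumed horizon of 20 steps, model width 32
w = 0.1 * rng.normal(size=(13, 32))
queries = anchored_queries(state, pos, w)
print(queries.shape)  # (20, 32)
```

In a full model these queries would be passed to a Transformer decoder; the point of the sketch is only that the anchor enters every query, so the generated sequence cannot ignore the state it starts from.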
Researchers implemented a Transformer-based model as the core of the IL-SRD framework, leveraging its capacity for sequential data processing to predict optimal control actions. To mitigate error accumulation, a common issue in sequential prediction where small inaccuracies amplify over time, the study pioneered a temporal aggregation mechanism. This technique effectively integrates information across multiple time steps, stabilising the control process and improving long-horizon performance. Experiments employed extensive simulations to validate the IL-SRD framework, demonstrating accurate and energy-efficient model-free rendezvous and docking control capabilities.
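The article does not spell out how the temporal aggregation works. One common form, used in action-chunking policies, is to let the model predict overlapping chunks and then blend every prediction that targets the current time step with exponentially decaying weights. The sketch below assumes that form; the function name `aggregate_overlapping`, the `decay` parameter, and the convention of weighting the oldest prediction most are illustrative choices, not details confirmed by the source.

```python
import numpy as np

def aggregate_overlapping(chunks, t, decay=0.1):
    """Blend all chunk predictions that cover time step t (assumed scheme).

    chunks: dict mapping a start step s to an array of shape (horizon, dim),
            where row k is the prediction for step s + k
    t:      the time step to produce an output for
    decay:  exponential decay rate; 0 gives a plain average
    """
    preds = []
    for start in sorted(chunks):             # oldest chunk first
        chunk = np.asarray(chunks[start], dtype=float)
        offset = t - start
        if 0 <= offset < len(chunk):
            preds.append(chunk[offset])
    if not preds:
        raise ValueError(f"no chunk covers step {t}")
    preds = np.stack(preds)                  # shape (n_overlapping, dim)
    weights = np.exp(-decay * np.arange(len(preds)))
    weights /= weights.sum()                 # normalise to sum to 1
    return weights @ preds

# Two overlapping chunks disagree about step 1; the output is a blend of both.
chunks = {0: np.ones((4, 2)), 1: 3.0 * np.ones((4, 2))}
blended = aggregate_overlapping(chunks, t=1, decay=0.0)
print(blended)  # [2. 2.]
```

Because each executed command is an average over several predictions, a single poor prediction has less influence, which is the smoothing effect the paragraph above describes.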
The study harnessed simulation environments to generate expert demonstration data, providing the training data for the imitation learning process. Control policies were then learned by minimising the difference between the predicted control actions and the expert demonstrations, effectively transferring the expert’s knowledge to the autonomous system. Robustness evaluations were conducted under significant unknown disturbances, confirming the IL-SRD framework’s ability to maintain competitive performance even in challenging conditions. The system delivers a significant advancement in autonomous spacecraft control, offering a viable alternative to traditional methods and paving the way for more reliable and efficient on-orbit operations. The source code is publicly available at https://github.
Experiments revealed the IL-SRD framework successfully manages six-degree-of-freedom (6-DOF) rendezvous and docking, demonstrating a breakthrough in autonomous spacecraft control. The team measured performance through extensive simulations, confirming the framework’s ability to maintain competitive performance even under significant, unknown disturbances.
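The imitation objective mentioned earlier in this section, minimising the difference between predicted and expert actions, can be illustrated with a very small example. The sketch below fits a linear policy to synthetic "expert" data by gradient descent on a mean squared error. Everything here is a stand-in: the real framework uses a Transformer, the exact loss is not stated in this article, and the data, dimensions, and function names (`bc_loss`, `fit_linear_policy`) are invented for illustration.

```python
import numpy as np

def bc_loss(predicted, expert):
    """Mean squared difference between predicted and expert actions."""
    return float(np.mean((predicted - expert) ** 2))

def fit_linear_policy(states, expert_actions, lr=0.1, steps=500):
    """Fit actions ~ states @ W by gradient descent on bc_loss (toy policy)."""
    W = np.zeros((states.shape[1], expert_actions.shape[1]))
    for _ in range(steps):
        err = states @ W - expert_actions
        grad = 2.0 * states.T @ err / err.size   # gradient of the mean squared error
        W -= lr * grad
    return W

rng = np.random.default_rng(0)
states = rng.normal(size=(200, 6))       # synthetic states, not real orbital data
expert_gain = rng.normal(size=(6, 3))    # hypothetical expert: a fixed linear law
expert_actions = states @ expert_gain

W = fit_linear_policy(states, expert_actions)
print(bc_loss(states @ W, expert_actions) < 1e-6)  # True
```

The learned policy never sees a dynamics model; it only sees state and action pairs, which is the sense in which the approach is described as model-free.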
A key innovation was the implementation of an ‘anchored decoder target’, which conditions control generation on state-related anchors, effectively constraining the control process and enforcing physically consistent action evolution. Simulation results indicate that this mechanism suppresses implausible action deviations during sequential prediction, which is crucial for reliable long-horizon control. The anchored decoder target directly addresses the challenge of maintaining stability throughout the complex docking procedure. To further enhance stability, researchers incorporated a temporal aggregation technique to mitigate the error accumulation inherent in sequential prediction.
The results show that small inaccuracies, which typically propagate and amplify over time, were effectively minimised, leading to smoother and more reliable control execution. In simulation, the IL-SRD framework outperforms classical controllers, Deep Reinforcement Learning (DRL) approaches, and standard Behavioral Cloning (BC) methods. The tests demonstrate the effectiveness of the proposed method in handling the strongly coupled, long-horizon control problems typical of spacecraft rendezvous. The study details how the framework learns a near-optimal 6-DOF control policy by directly capturing temporal dependencies from expert demonstrations. Results demonstrate the framework’s capability to address the limitations of traditional methods, which often struggle with model inaccuracies and communication latency, paving the way for more robust and autonomous on-orbit missions. The source code is publicly available, facilitating further research and development in this critical area of space exploration.
IL-SRD boosts spacecraft docking via learning
Scientists have developed a new imitation learning framework for spacecraft rendezvous and docking control, reducing reliance on precise dynamic modelling. This innovative approach, termed IL-SRD, learns control policies directly from expert demonstrations, enabling more robust and adaptable performance in space environments. The framework incorporates an ‘anchored decoder target’, which constrains control generation by referencing state-related anchors, ensuring physically plausible actions and consistent evolution over time. Furthermore, a temporal aggregation mechanism mitigates the error accumulation inherent in sequential prediction models, smoothing out inaccuracies and enhancing overall control stability.
Extensive simulations demonstrate that IL-SRD achieves performance comparable to the expert demonstrations, while also exhibiting strong robustness against unknown disturbances. However, the authors acknowledge limitations in achieving the extremely high terminal precision required for close-proximity operations, and note that the current reliance on state-to-state prediction with low-level servo control may introduce errors. Future research will focus on improving terminal control precision and transitioning to a direct state-to-action formulation for more responsive docking manoeuvres.
👉 More information
🗞 Imitation learning-based spacecraft rendezvous and docking method with Expert Demonstration
🧠 ArXiv: https://arxiv.org/abs/2601.12952
