Reconstructing three-dimensional scenes from limited data, such as that captured by time-resolved imaging systems, remains a significant challenge, particularly in complex environments or at long distances. Yue Li, Shida Sun, and Yu Hong, alongside Feihu Xu and Zhiwei Xiong, address this problem by introducing a Time-Resolved Transformer (TRT) architecture. Unlike existing transformers, which were designed for other types of data, TRT processes both the spatial and temporal correlations within transient measurements, improving 3D reconstruction in photon-efficient imaging. The team demonstrates that TRT, applied to both line-of-sight (LOS) and non-line-of-sight (NLOS) imaging, consistently outperforms current methods on both simulated and real-world data. They also contribute a large, high-resolution synthetic dataset and new real-world measurements to advance research in this field.
Imaging Beyond Line of Sight with Photons
Recent research focuses on advanced imaging technologies, particularly non-line-of-sight (NLOS) imaging, which reconstructs images around corners or through obstructions. A key element of this progress involves single-photon imaging, utilizing highly sensitive sensors to detect individual photons and maximize imaging efficiency. Computational imaging techniques play a crucial role, employing algorithms to enhance image quality and overcome limitations of traditional optics. These advancements are driving progress in capturing transient events and creating detailed 3D reconstructions. Scientists are increasingly leveraging deep learning, including convolutional neural networks (CNNs) and transformers, to process and interpret complex image data.
U-Net architectures are commonly used for image segmentation, while attention mechanisms help focus on important image features. Graph neural networks are also being applied to image restoration, and neural fields offer a novel way to represent scenes. This combination of advanced algorithms and powerful hardware is enabling breakthroughs in image processing, including denoising, inpainting, and super-resolution. Researchers are also exploring mathematical and statistical concepts, such as random point processes and correlation clustering, to improve image analysis. Tensor decomposition provides a powerful tool for representing and processing complex data. These techniques, combined with physics-informed deep learning, allow scientists to incorporate physical models of light transport into imaging algorithms, further enhancing performance. The convergence of these disciplines is leading to more efficient and accurate imaging systems, particularly in challenging scenarios where light is limited or scattered.
Transformer Architecture for 3D Transient Reconstruction
Scientists have developed a Time-Resolved Transformer (TRT) architecture to enhance 3D reconstruction from photon-efficient transient measurements. It addresses the limitations of existing methods under low signal-to-background ratios and sparse data by capturing both local and global spatio-temporal correlations. The core of the TRT consists of two attention mechanisms, spatio-temporal self-attention (STSA) and spatio-temporal cross-attention (STCA), which model complex relationships within the data. The team first extracts shallow features from transient measurements, which are acquired with a time-correlated single-photon counting (TCSPC) setup that pairs a pulsed laser with a single-photon avalanche diode (SPAD).
This system captures returning photons and records their time-of-arrival during each pulse cycle, accumulating data to form a spatio-temporal representation of the scene. The STSA module then processes these features by splitting or downsampling input data into different scales, allowing the system to explore correlations across varying spatial and temporal resolutions. Subsequently, the STCA module integrates both local and global features within a token space, generating deep features with enhanced representation capabilities. Building upon the TRT architecture, scientists developed two task-specific embodiments: TRT-LOS for line-of-sight imaging and TRT-NLOS for non-line-of-sight imaging.
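The photon-counting acquisition described above can be pictured as a histogram of arrival times per pixel. The Python below is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the picosecond units, the 1 ns bin width, and the sample timestamps are all hypothetical, and the peak-bin depth estimate is a naive baseline rather than anything TRT does.

```python
# Speed of light in metres per second.
C = 299_792_458.0


def build_histogram(arrival_times_ps, bin_width_ps, num_bins):
    """Accumulate photon arrival times into time bins for one pixel.

    arrival_times_ps: integer delays (picoseconds) between each laser pulse
    and the detected photon, pooled over many pulse cycles.
    Timestamps outside the histogram range are discarded.
    """
    hist = [0] * num_bins
    for t in arrival_times_ps:
        b = t // bin_width_ps  # floor division keeps negative times out of bin 0
        if 0 <= b < num_bins:
            hist[b] += 1
    return hist


def peak_depth_m(hist, bin_width_ps):
    """Naive line-of-sight depth: round-trip time at the peak bin centre, halved."""
    peak_bin = max(range(len(hist)), key=hist.__getitem__)
    t_peak_s = (peak_bin + 0.5) * bin_width_ps * 1e-12
    return C * t_peak_s / 2.0


# Hypothetical data: three signal photons near 10 ns plus two background counts.
timestamps_ps = [10_100, 10_400, 10_700, 3_200, 17_900]
histogram = build_histogram(timestamps_ps, bin_width_ps=1_000, num_bins=20)
depth = peak_depth_m(histogram, bin_width_ps=1_000)  # about 1.57 m
```

A full transient measurement is a grid of such histograms, one per scan position, which gives the spatio-temporal volume that the network takes as input. With only a handful of photons per pixel, as in this toy example, background counts can easily outvote the signal peak, which is the regime a learned reconstruction is meant to handle.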
To validate performance, the team created a large-scale, high-resolution synthetic line-of-sight dataset with varying noise levels as a benchmark for evaluation. They also captured a set of real-world non-line-of-sight measurements using a custom-built imaging system, expanding the diversity of available data. To improve NLOS reconstruction, they designed a transient measurement denoiser for TRT-NLOS, which improves the quality of the input data and, in turn, the reconstruction. Extensive experiments show that both TRT-LOS and TRT-NLOS significantly outperform existing methods on synthetic and real-world data.
Transformer Architecture Improves 3D Transient Reconstruction
The Time-Resolved Transformer (TRT) architecture is designed to enhance 3D reconstruction from transient measurements, particularly in challenging photon-efficient imaging scenarios. The work addresses limitations in both line-of-sight (LOS) and non-line-of-sight (NLOS) imaging, where low signal levels and high noise traditionally hinder accurate 3D reconstruction, especially over long distances or in complex scenes. The architecture incorporates two attention designs that exploit both local and global correlations within the spatio-temporal transient data. Experiments show that the TRT-based methods, TRT-LOS for LOS imaging and TRT-NLOS for NLOS imaging, consistently outperform existing techniques on both synthetic and real-world data.
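The difference between local and global correlations can be illustrated with a generic attention sketch. The Python below is not the paper's STSA or STCA design: it is plain scaled dot-product self-attention with identity query, key, and value projections (no learned weights), and the `(x, y, t)` token positions, the `radius` parameter, and the function names are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, positions, radius=None):
    """Scaled dot-product self-attention over spatio-temporal tokens.

    tokens:    list of equal-length feature vectors, one per token
    positions: list of (x, y, t) coordinates, one per token
    radius:    None lets every token attend to all tokens (global);
               an integer restricts attention to tokens whose Chebyshev
               distance in (x, y, t) is at most `radius` (local)
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for query, pos in zip(tokens, positions):
        # Indices of the tokens this query is allowed to attend to.
        neighbours = [
            j for j, other in enumerate(positions)
            if radius is None
            or max(abs(a - b) for a, b in zip(pos, other)) <= radius
        ]
        scores = [
            scale * sum(q * k for q, k in zip(query, tokens[j]))
            for j in neighbours
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the neighbour features.
        outputs.append([
            sum(w * tokens[j][d] for w, j in zip(weights, neighbours))
            for d in range(dim)
        ])
    return outputs


# Hypothetical tokens: two adjacent time bins at one pixel, one distant token.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
coords = [(0, 0, 0), (0, 0, 1), (5, 5, 5)]
local_out = self_attention(features, coords, radius=1)
global_out = self_attention(features, coords)
```

In the local call the distant token only attends to itself and is returned unchanged, while in the global call every token mixes information from the whole volume. A practical model would add learned projections, multiple heads, and an efficient windowing scheme instead of the quadratic neighbour search used here.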
The team reports state-of-the-art results and robust generalisation across different imaging systems. To support further research, they also introduced a large-scale synthetic dataset and new real-world measurements. The work represents a significant step forward in photon-efficient 3D imaging, opening new possibilities for applications that require robust and accurate reconstruction in challenging environments.
👉 More information
🗞 3D Reconstruction from Transient Measurements with Time-Resolved Transformer
🧠 ArXiv: https://arxiv.org/abs/2510.09205
