The increasing volume of data generated by advanced imaging techniques, such as X-ray tomography and microscopy, presents significant challenges for efficient processing and analysis. Camila Machado de Araujo, Egon P. B. S. Borges, Ricardo Marcelo Canteiro Grangeiro, and colleagues at the Brazilian Synchrotron Light Laboratory address this issue by introducing Harpia, a new CUDA-accelerated library that extends the capabilities of the Annotat3D software. Harpia enables scalable, interactive segmentation of 3D datasets too large to fit in the memory of a single GPU, relying on strict memory control and chunked processing. According to the authors, this delivers substantial improvements in processing speed, memory efficiency, and scalability compared to existing frameworks, and promises to facilitate collaborative scientific imaging workflows in shared high-performance computing environments.
Harpia, a GPU Library for Volumetric Data
Scientists developed Harpia, a new processing library built on CUDA and integrated into Annotat3D, to overcome challenges in handling increasingly large datasets from high-resolution volumetric imaging techniques like X-ray tomography and advanced microscopy. This work prioritizes scalable, interactive workflows suitable for both powerful computing environments and remote access. Harpia carefully manages memory and utilizes a native chunked execution approach, enabling reliable operation on datasets that exceed the capacity of a single GPU, a critical advancement for modern experimental facilities generating massive data volumes. The system incorporates a comprehensive suite of GPU-accelerated tools designed for efficient image processing and analysis, including high-performance filtering techniques such as Unsharp Mask, Anisotropic Diffusion, Median, and Non-Local Means, all optimized for CUDA to accelerate pre- and post-processing steps.
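To make the idea of chunked execution concrete, the sketch below plans how a volume could be split into slabs along the z axis so that each slab, plus a halo of neighbouring slices needed by a filter, fits inside a fixed memory budget. It is an illustration only and not Harpia's API: the function name `plan_z_chunks`, its parameters, and the 8 GiB budget are assumptions chosen for the example.

```python
def plan_z_chunks(shape, bytes_per_voxel, budget_bytes, halo, buffers=2):
    """Split a (z, y, x) volume into z-slabs that fit a memory budget.

    Hypothetical helper, not part of Harpia. Each slab is padded by `halo`
    slices on both sides so a filter of that radius sees its full
    neighbourhood at slab boundaries. `buffers` is the number of device
    arrays held at once (input + output = 2).
    Returns a list of (core_start, core_stop, read_start, read_stop).
    """
    nz, ny, nx = shape
    slice_bytes = ny * nx * bytes_per_voxel * buffers
    max_slices = budget_bytes // slice_bytes   # slices that fit, halo included
    core = max_slices - 2 * halo               # useful output slices per slab
    if core < 1:
        raise ValueError("budget too small for one slice plus its halo")
    chunks = []
    for start in range(0, nz, core):
        stop = min(start + core, nz)
        # Clamp the halo at the volume borders.
        chunks.append((start, stop, max(start - halo, 0), min(stop + halo, nz)))
    return chunks


# Example: a 2048^3 float32 volume (32 GiB) against an assumed 8 GiB budget.
GIB = 1024 ** 3
chunks = plan_z_chunks((2048, 2048, 2048), 4, 8 * GIB, halo=2)
print(len(chunks), "slabs; first:", chunks[0], "last:", chunks[-1])
```

With these assumed numbers the volume is split into 9 slabs of at most 252 output slices each. Every slab is read with its halo, processed, and only its core region is written back, which is what keeps the result identical to processing the whole volume at once for a filter of that radius.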
A new label editing module, fully accelerated by CUDA, provides 3D morphological operations, thresholding methods, and island removal, together with a 2.5D Watershed implementation for precise segmentation refinement. For post-segmentation analysis, the team developed CUDA-accelerated quantification tools to compute volume, area, perimeter, and fraction, as well as connected components labeling and a Euclidean Distance Transform implemented with OpenMP. Extended annotation tools, such as Magic Wand and Lasso, further enhance usability, drawing on accelerated 2D algorithms that include CUDA implementations of active contours (Snakes) and morphological operations. According to the authors, this combination of tools and techniques delivers significant improvements in processing speed, memory efficiency, and scalability compared to established frameworks like cuCIM and scikit-image. The interactive, human-in-the-loop interface, combined with efficient GPU resource management that releases memory upon task completion, makes the system particularly well-suited for collaborative scientific imaging workflows in shared HPC infrastructures.
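For readers unfamiliar with island removal and label quantification, the following is a minimal CPU reference sketch of what these operations compute. It is not Harpia's CUDA implementation: the function names are hypothetical, the label is stored as a set of (z, y, x) voxel coordinates for brevity, and face (6-neighbour) connectivity is an assumption.

```python
from collections import deque

# 6-connectivity: two voxels are neighbours when they share a face.
FACE_NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                   (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def connected_components(foreground):
    """Group a set of (z, y, x) voxels into face-connected components."""
    unvisited = set(foreground)
    components = []
    while unvisited:
        seed = unvisited.pop()
        component = {seed}
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed
            z, y, x = queue.popleft()
            for dz, dy, dx in FACE_NEIGHBOURS:
                neighbour = (z + dz, y + dy, x + dx)
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def remove_islands(foreground, min_voxels):
    """Keep only components that contain at least `min_voxels` voxels."""
    kept = set()
    for component in connected_components(foreground):
        if len(component) >= min_voxels:
            kept |= component
    return kept


def volume_fraction(foreground, shape):
    """Fraction of a (nz, ny, nx) volume occupied by the label."""
    nz, ny, nx = shape
    return len(foreground) / (nz * ny * nx)


# Example: a 2x2x2 block plus one isolated voxel in a 4x4x4 volume.
label = {(z, y, x) for z in range(2) for y in range(2) for x in range(2)}
label.add((3, 3, 3))
cleaned = remove_islands(label, min_voxels=2)
print(len(cleaned), volume_fraction(cleaned, (4, 4, 4)))
```

In the example the isolated voxel is discarded and the remaining block of 8 voxels occupies 8/64 = 0.125 of the volume. A GPU implementation would operate on dense arrays and use a parallel labeling scheme rather than a sequential flood fill.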
Harpia Accelerates Terabyte-Scale 3D Data Processing
Scientists have developed Harpia, a new processing library built on CUDA and integrated with Annotat3D, designed to significantly enhance the processing of large 3D datasets generated by techniques like X-ray tomography and advanced microscopy. This work addresses the challenges posed by datasets reaching terabytes in size, often produced in just a few hours during a single experiment, by enabling scalable, interactive segmentation workflows in high-performance computing environments. The core of this advancement is a scalable, chunk-based processing architecture that allows efficient handling of datasets exceeding the memory capacity of a single GPU. Harpia also manages GPU resources efficiently, releasing memory and computational resources after each task completes so that several users can work concurrently on shared HPC infrastructure. Extended annotation tools, including Magic Wand and Lasso, coupled with accelerated 2D algorithms like active contours (Snakes), morphological operations, and thresholding, provide intuitive and efficient segmentation refinement. Together, these features offer a comprehensive solution for processing and analyzing large-scale volumetric datasets, enabling detailed insights into internal structures at the micro- and nanoscale.
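The pattern of releasing resources as soon as a task finishes can be sketched with a context manager. The example below is a conceptual illustration under stated assumptions, not Harpia's mechanism: the class `MemoryBudget` and its `reserve` method are hypothetical, and the sketch only does bookkeeping in bytes rather than performing real CUDA allocations.

```python
import threading
from contextlib import contextmanager


class MemoryBudget:
    """Hypothetical tracker of bytes reserved against a shared capacity."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.in_use = 0
        self._lock = threading.Lock()  # several users may reserve concurrently

    @contextmanager
    def reserve(self, nbytes):
        """Reserve `nbytes` for one task and return them when the task ends."""
        with self._lock:
            if self.in_use + nbytes > self.capacity:
                raise MemoryError("not enough free memory for this task")
            self.in_use += nbytes
        try:
            yield
        finally:
            # Runs on success and on failure, so a crashed task cannot
            # keep memory away from other users.
            with self._lock:
                self.in_use -= nbytes


# Example: two tasks sharing an assumed budget of 100 bytes.
budget = MemoryBudget(100)
with budget.reserve(60):
    print("during task:", budget.in_use)
print("after task:", budget.in_use)
```

Tying the release to the end of the `with` block means memory is returned even when a task raises an exception, which is the property that matters when many users share the same device.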
Scalable Volumetric Segmentation with Harpia
This work presents Harpia, a new processing library integrated into Annotat3D, designed to address challenges in interactive segmentation of large volumetric datasets generated by techniques such as X-ray tomography and advanced microscopy. Harpia employs strict memory control and a native chunked execution approach, enabling scalable processing of datasets that exceed the capacity of single-GPU memory. The authors' benchmarks indicate that Harpia outperforms established frameworks in scalability, memory efficiency, and processing speed. The system's web-based interface and integrated resource management features make it particularly well-suited for the multi-user and remote-access environments commonly found in facilities such as synchrotron and microscopy centres. While acknowledging potential limitations related to resource contention, the authors present Harpia as a production-ready solution for stable, large-scale processing. Future development plans include extending Harpia to support multi-GPU and heterogeneous computing architectures, as well as integrating advanced visual models to further enhance segmentation accuracy and interactivity in complex scientific imaging tasks.
👉 More information
🗞 Advancing Annotat3D with Harpia: A CUDA-Accelerated Library For Large-Scale Volumetric Data Segmentation
🧠 ArXiv: https://arxiv.org/abs/2511.11890
