Many scientific computations involve diverse subtasks with differing computational needs, demanding efficient scheduling to maximize hardware utilization. Anton Reinhard, Simeon Ehrig, and colleagues at Helmholtz-Zentrum Dresden-Rossendorf present a new software framework, built in the Julia programming language, that automatically generates optimized code for these complex problems. The team’s approach employs directed acyclic graphs to represent computational tasks and, crucially, incorporates domain-specific knowledge into the scheduling process. This allows for significant improvements beyond standard graph scheduling techniques, demonstrated through an application to the calculation of complex scattering processes in electrodynamics, paving the way for faster and more efficient scientific computing.
Julia for Scientific and High-Performance Computing
This research explores the use of the Julia programming language for high-performance computing in computationally intensive scientific applications. The work focuses on developing tools and techniques to maximize performance on modern hardware, including CPUs and GPUs, by representing complex calculations as directed acyclic graphs (DAGs) to enable optimization and parallelization. The team leverages specialized Julia libraries for GPU programming, performance analysis, and runtime code generation, addressing challenges in areas like quantum field theory and particle physics. This work demonstrates the potential of Julia to accelerate scientific discovery through optimized code and efficient hardware utilization.
Automated Code Generation for Heterogeneous Hardware
Scientists have developed a novel software framework, implemented in Julia, that automatically generates statically scheduled and compiled code for complex computational problems. Recognizing that many scientific calculations consist of subtasks with varying computational demands, the team engineered a system that analyzes these subtasks and assigns them to the most suitable hardware available. This approach utilizes directed acyclic graphs (DAGs) to represent problems and maximize hardware utilization, incorporating both domain-specific information and theoretical concepts to enable optimizations beyond conventional methods. The framework targets a wide range of architectures, including CPUs and GPUs, with the goal of achieving hardware-agnostic code execution.
The system relies on representing computations as Computable DAGs (CDAGs), requiring a user-provided generator to construct the complete graph, followed by analysis, optimization, and compilation stages that are domain-agnostic. Individual nodes within the CDAG represent reusable functions, enhancing optimization potential, and the scheduler prioritizes functions with predictable runtimes to distribute workload evenly across available hardware. The framework was demonstrated on a perturbative quantum electrodynamics (QED) application, showcasing its ability to handle problems of extreme scale and complexity.
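As a rough illustration of the scheduling idea (the types and the greedy policy below are our own sketch, not the framework's actual interface or algorithm): nodes wrap pure compute kernels annotated with a cost estimate, edges are data dependencies, and a static schedule walks the graph in topological order, assigning each node to the currently least-loaded device.

```julia
# Hypothetical sketch of a computable DAG node and a greedy static scheduler.
struct TaskNode
    name::Symbol
    kernel::Function        # pure function of its inputs
    inputs::Vector{Symbol}  # names of predecessor nodes
    cost::Float64           # estimated runtime, e.g. derived from a FLOP count
end

# Assign each node (assumed already in topological order) to the device with
# the least accumulated work so far.
function static_schedule(nodes::Vector{TaskNode}, devices::Vector{Symbol})
    load = Dict(d => 0.0 for d in devices)
    assignment = Dict{Symbol,Symbol}()
    for n in nodes
        d = argmin(dev -> load[dev], devices)
        assignment[n.name] = d
        load[d] += n.cost
    end
    return assignment
end

dag = [
    TaskNode(:a, identity, Symbol[], 1.0),
    TaskNode(:b, x -> x^2, [:a], 4.0),
    TaskNode(:c, x -> x + 1, [:a], 2.0),
    TaskNode(:d, +, [:b, :c], 1.0),
]
static_schedule(dag, [:cpu, :gpu])
```

The scheduler described in the paper additionally prioritizes functions with predictable runtimes, which this toy policy only approximates through the per-node cost estimate.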
CDAG Compilation Optimizes Electrodynamic Matrix Elements
Scientists have developed a software framework, implemented in Julia, capable of automatically generating statically scheduled and compiled code from computations represented as Computable Directed Acyclic Graphs, or CDAGs. This work introduces a novel approach that combines graph theory with domain-specific knowledge, enabling optimizations that purely domain-agnostic scheduling cannot reach. The core innovation lies in the CDAG representation, which expresses a computation as a network of tasks and data dependencies, allowing for precise scheduling and efficient hardware utilization. The team demonstrated the effectiveness of this framework by applying it to the complex calculation of matrix elements for scattering processes in electrodynamics, finding that significant optimizations are possible due to the reuse of intermediate results, particularly when summing over different spin and polarization states.
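A toy sketch of this kind of reuse (the product of per-particle factors below is a stand-in, not an actual QED matrix element): summing an amplitude over all 2^n spin assignments of n external particles re-evaluates the same factors over and over, whereas a graph that keeps one node per distinct intermediate evaluates each of them only once.

```julia
# Toy stand-in for a per-particle spinor factor; counts how often it runs.
evals = Ref(0)
factor(i, s) = (evals[] += 1; s ? i + 0.5 : i - 0.5)

# Naive sum over all 2^n spin assignments: every term recomputes every factor.
function naive_sum(n)
    evals[] = 0
    total = 0.0
    for spins in Iterators.product(ntuple(_ -> (false, true), n)...)
        total += prod(factor(i, s) for (i, s) in enumerate(spins))
    end
    return total, evals[]
end

# Sharing intermediates (as merged DAG nodes would): each distinct factor is
# evaluated once and reused across all terms of the sum.
function shared_sum(n)
    evals[] = 0
    cache = Dict{Tuple{Int,Bool},Float64}()
    total = 0.0
    for spins in Iterators.product(ntuple(_ -> (false, true), n)...)
        total += prod(get!(() -> factor(i, s), cache, (i, s)) for (i, s) in enumerate(spins))
    end
    return total, evals[]
end

naive_sum(10)   # 10 * 2^10 = 10240 factor evaluations
shared_sum(10)  # same result with only 20 distinct evaluations
```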
The static topology of the CDAG representing the entire scattering process is ideally suited to this approach, since the compute kernels are pure functions that execute in constant time. The software, named ComputableDAGs.jl, consists of interacting modules designed for extensibility; a key component is the generator, which creates a CDAG from a domain-specific model. Estimators predict the computational cost of the CDAG, and optimizers then refine the graph to minimize this cost, culminating in the generation of executable Julia code.
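The final code-generation step can be pictured with a minimal, hypothetical sketch (this is our illustration, not the library's actual code generator): every node of a topologically ordered graph becomes one assignment in a generated expression, which is then evaluated into a single callable Julia function.

```julia
# Illustrative only: turn an ordered list of graph nodes into compiled Julia code.
struct Node
    name::Symbol            # name of the intermediate result
    kernel::Symbol          # name of the pure function to call
    inputs::Vector{Symbol}  # predecessor results or input arguments
end

function codegen(nodes::Vector{Node}, args::Vector{Symbol}, output::Symbol)
    assignments = [:($(n.name) = $(n.kernel)($(n.inputs...))) for n in nodes]
    body = Expr(:block, assignments..., output)
    fn = Expr(:->, Expr(:tuple, args...), body)
    return eval(fn)         # compile the generated expression into a callable
end

mul(a, b) = a * b
add(a, b) = a + b

# Computes (x * y) + (x * y); the shared product node :p is evaluated only once.
graph = [Node(:p, :mul, [:x, :y]), Node(:s, :add, [:p, :p])]
f = codegen(graph, [:x, :y], :s)
Base.invokelatest(f, 3, 4)  # invokelatest avoids world-age issues; returns 24
```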
GPU Acceleration of Scattering Calculations
This research presents a software framework, developed in Julia, that automatically generates statically scheduled and compiled code from directed acyclic graphs (DAGs), extending existing DAG scheduling concepts with domain-specific information. The team demonstrated the framework’s capabilities by applying it to the complex task of computing matrix elements for scattering processes involving many particles in electrodynamics, achieving significant speedups when executing these computations on a graphics processing unit (GPU) compared to a central processing unit (CPU). For smaller scattering processes, GPU execution proved to be between 250 and 400 times faster, with larger processes showing a consistent speedup factor of around 200, aligning with theoretical performance expectations. These results confirm the effectiveness of CDAG-level optimizations, specifically node reduction, which merges redundant computations to improve efficiency. While function compilation time currently represents a bottleneck, future work will focus on improving this process to enable scaling to even larger and more complex computations.
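Node reduction can be sketched as follows (the data layout and names are illustrative, not the framework's representation): because the kernels are pure, two nodes that apply the same kernel to the same inputs necessarily produce the same value, so one of them can be dropped and its consumers redirected to the survivor.

```julia
# Hypothetical sketch of node reduction (deduplication of equivalent nodes).
struct GNode
    name::Symbol
    kernel::Symbol
    inputs::Vector{Symbol}
end

function reduce_nodes(nodes::Vector{GNode})
    canonical = Dict{Tuple{Symbol,Vector{Symbol}},Symbol}()  # (kernel, inputs) -> survivor
    rename = Dict{Symbol,Symbol}()                           # dropped node -> survivor
    reduced = GNode[]
    for n in nodes                                           # assumed topological order
        inputs = [get(rename, i, i) for i in n.inputs]       # repoint to survivors
        key = (n.kernel, inputs)
        if haskey(canonical, key)
            rename[n.name] = canonical[key]                  # duplicate: drop this node
        else
            canonical[key] = n.name
            push!(reduced, GNode(n.name, n.kernel, inputs))
        end
    end
    return reduced
end

# Two identical products, e.g. arising in different terms of a spin sum, are
# merged so the addition reuses a single intermediate result.
g = [GNode(:p1, :mul, [:x, :y]), GNode(:p2, :mul, [:x, :y]), GNode(:s, :add, [:p1, :p2])]
reduce_nodes(g)  # => [p1 = mul(x, y), s = add(p1, p1)]
```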
👉 More information
🗞 Optimizations on Graph-Level for Domain Specific Computations in Julia and Application to QED
🧠 ArXiv: https://arxiv.org/abs/2511.19456
