NVIDIA’s ALCHEMI Toolkit Links with MatGL for Graph-Based MLIPs

NVIDIA’s Justin S. Smith, Nikita Fedik, Dallas Foster, Roman Zubatyuk, and Kelvin Lee have introduced ALCHEMI, a three-layered ecosystem for GPU-accelerated atomistic simulations that was unveiled at Supercomputing 2024. The ecosystem comprises ALCHEMI Toolkit-Ops, which offers batched, PyTorch-integrated GPU kernels for core operations such as neighbor list construction and long-range electrostatics, along with ALCHEMI Toolkit and NIM microservices. Toolkit-Ops is being integrated with open-source packages, including MatGL, to accelerate long-range calculations in graph-based machine learning interatomic potentials (MLIPs). The aim is to address the limitations of CPU-centric or PyTorch-only approaches and to provide a scalable foundation for MLIP atomistic simulation on NVIDIA platforms.

NVIDIA ALCHEMI Overview and Layers

NVIDIA ALCHEMI, introduced at Supercomputing 2024, is a three-layered ecosystem designed to accelerate atomistic simulations in chemistry and materials science. This system comprises ALCHEMI Toolkit-Ops, ALCHEMI Toolkit, and ALCHEMI NIM microservices, all optimized for NVIDIA accelerated computing. ALCHEMI addresses a gap in existing software by providing robust, Pythonic tools for GPU-accelerated simulations, specifically targeting the growing field of machine learning interatomic potentials (MLIPs). It aims to overcome performance bottlenecks found in CPU-centric or fragmented software approaches.

ALCHEMI Toolkit-Ops forms the base layer, offering GPU-accelerated, batched common operations crucial for AI-driven atomistic modeling. These operations include neighbor list construction (both O(N) and O(N²) variants), DFT-D3 dispersion correction, and long-range electrostatics calculations (Ewald and PME with cuFFT). Built on NVIDIA Warp, Toolkit-Ops provides a modular, PyTorch-accessible API to facilitate rapid integration with existing simulation packages. A JAX API is planned for a future release.
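To make the difference between the O(N²) and O(N) neighbor list variants concrete, here is a minimal CPU sketch in plain Python. It is not the Toolkit-Ops API or its GPU implementation; the function names are hypothetical, the system is non-periodic, and the cell-list approach shown is simply the standard way to reach roughly linear scaling at uniform density.

```python
import itertools
import math
from collections import defaultdict


def neighbors_bruteforce(positions, cutoff):
    """O(N^2): test every pair of atoms against the cutoff."""
    pairs = set()
    for i, j in itertools.combinations(range(len(positions)), 2):
        if math.dist(positions[i], positions[j]) <= cutoff:
            pairs.add((i, j))
    return pairs


def neighbors_cell_list(positions, cutoff):
    """Roughly O(N) at uniform density: bin atoms into cubic cells of edge
    `cutoff`, then compare each atom only with atoms in the 27 nearby cells."""
    cells = defaultdict(list)
    for idx, p in enumerate(positions):
        cells[tuple(math.floor(c / cutoff) for c in p)].append(idx)

    pairs = set()
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get avoids creating empty cells while iterating over the dict
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j and math.dist(positions[i], positions[j]) <= cutoff:
                        pairs.add((i, j))
    return pairs
```

Both functions return the same set of (i, j) pairs with i < j. The brute-force version is often competitive for small systems, while the cell list pays off as the atom count grows, which is presumably why both variants are offered.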

The ALCHEMI Toolkit-Ops ecosystem is integrating with leading open-source tools like TorchSim, MatGL, and AIMNet Central. TorchSim will utilize optimized neighbor lists for high-throughput batched molecular dynamics, while MatGL will accelerate graph-based long-range interaction calculations. AIMNet Central is leveraging Toolkit-Ops to enhance the performance of its flexible long-range interaction models, demonstrating the platform’s versatility and potential for broad adoption within the research community.

ALCHEMI Toolkit-Ops: Accelerated Operations

NVIDIA ALCHEMI Toolkit-Ops is a foundational layer within the broader ALCHEMI ecosystem, designed to accelerate atomistic simulations. It provides a repository of GPU-accelerated, batched common operations crucial for AI-enabled modeling, including neighbor list construction (both O(N) and O(N²) variants), DFT-D3 dispersion corrections, and long-range electrostatic calculations (Ewald and PME with cuFFT acceleration). This toolkit addresses the performance limitations of CPU-centric or PyTorch-only approaches in materials science and chemistry research.

ALCHEMI Toolkit-Ops is built for seamless integration with existing PyTorch-based workflows, featuring a modular, PyTorch-accessible API. Current integrations include TorchSim, which leverages the toolkit’s optimized neighbor lists for high-throughput batched molecular dynamics, and MatGL, which accelerates graph-based treatments of long-range interactions. AIMNet Central is also utilizing the toolkit to enhance its flexible long-range interaction models, demonstrating broad applicability.

Performance benchmarks, run on ammonia clusters with an NVIDIA H100 80 GB GPU, compare the speed of ALCHEMI Toolkit-Ops against models such as MACE and TensorNet. The toolkit’s accelerated kernels for neighbor lists and DFT-D3 calculations achieve fully parallelized performance and scalability, addressing a key bottleneck in computationally intensive atomistic simulations. Results were averaged over 20 runs.

For large periodic systems, Ewald-based methods separate electrostatic interactions into short-range and long-range components, each computed in the domain best suited for performance.
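The splitting described above can be illustrated with the standard error-function decomposition of the Coulomb kernel, 1/r = erfc(αr)/r + erf(αr)/r. The sketch below is plain Python rather than anything from Toolkit-Ops; the function name and the value of the splitting parameter `alpha` are assumptions chosen only for illustration.

```python
import math


def coulomb_split(r, alpha):
    """Split the 1/r Coulomb kernel into short-range and long-range parts.

    short = erfc(alpha * r) / r decays quickly, so it can be summed in real
    space inside a cutoff. long = erf(alpha * r) / r is smooth, which is the
    part Ewald and PME methods evaluate in reciprocal space.
    """
    short = math.erfc(alpha * r) / r
    long_ = math.erf(alpha * r) / r
    return short, long_


# Illustrative distances and splitting parameter (hypothetical values).
for r in (1.0, 3.0, 6.0, 12.0):
    short, long_ = coulomb_split(r, alpha=0.35)
    print(f"r={r:5.1f}  short={short:.3e}  long={long_:.3e}  sum={short + long_:.6f}")
```

The two parts always sum back to 1/r, and the short-range part becomes negligible within a few multiples of 1/α. That rapid decay is what allows the real-space sum to be truncated while the smooth remainder is handled with FFTs.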

Addressing Performance Gaps in Atomistic Simulation

NVIDIA ALCHEMI addresses performance gaps in atomistic simulation by introducing a three-layered ecosystem designed for GPU acceleration. Specifically, ALCHEMI Toolkit-Ops delivers high-throughput, batched GPU kernels for core operations like neighbor list construction (O(N) and O(N²) variants), DFT-D3 dispersion correction, and long-range electrostatics (Ewald and PME with cuFFT). This tackles the limitations of CPU-centric or purely PyTorch-based approaches which struggle to deliver the speed needed for contemporary research, particularly with MLIPs.

ALCHEMI Toolkit-Ops’ performance gains are demonstrated through benchmarks comparing it to models like MACE and TensorNet on ammonia clusters, tested on an NVIDIA H100 80 GB GPU. The toolkit provides accelerated kernels for key operations, with the aim of achieving fully parallelized performance and scalability. These optimizations are crucial because traditional software for MLIP-driven simulations often relies on CPU computation for operations like neighbor identification and dispersion corrections, creating bottlenecks.

Integration with existing open-source tools is a key component of ALCHEMI’s approach. ALCHEMI Toolkit-Ops is being integrated with TorchSim (enabling batched GPU molecular dynamics), MatGL (accelerating graph-based long-range calculations), and AIMNet Central (enhancing flexible long-range interactions). This seamless integration within the PyTorch ecosystem allows researchers to leverage ALCHEMI’s performance enhancements without completely overhauling existing workflows.

MLIPs: Transforming Chemistry and Materials Science

NVIDIA ALCHEMI addresses a critical gap in chemistry and materials science: a lack of robust, GPU-accelerated tools for machine learning interatomic potential (MLIP)-driven simulations. Traditionally, these simulations have relied on CPU-centric software, hindering performance. ALCHEMI establishes a three-layered ecosystem – Toolkit-Ops, Toolkit, and NIM microservices – optimized for NVIDIA platforms. This aims to combine the accuracy of quantum chemistry with the scaling power of AI, enabling more efficient atomistic simulations.

ALCHEMI Toolkit-Ops, the initial release, provides high-throughput, batched, and PyTorch-integrated GPU kernels for core operations. These include neighbor list construction (both O(N) and O(N²) variants), DFT-D3 dispersion correction, and long-range electrostatics calculations (Ewald and PME). Benchmarks demonstrate that ALCHEMI Toolkit-Ops outperforms popular kernel-accelerated models like MACE and TensorNet, achieving fully parallelized performance and scalability on NVIDIA H100 GPUs.

ALCHEMI Toolkit-Ops is designed for seamless integration with existing open-source tools. Current integrations include TorchSim (for batched molecular dynamics), MatGL (accelerating graph-based MLIP long-range calculations), and AIMNet Central (enhancing flexible long-range interaction modeling for AIMNet2). This modularity, combined with the accelerated kernels, establishes a scalable and performant foundation for advancing MLIP-based atomistic simulations.

Integration with Open Source Simulation Packages

NVIDIA ALCHEMI Toolkit-Ops is actively being integrated with leading open source packages to enhance atomistic simulations. These integrations include TorchSim, a PyTorch-native engine enabling batched molecular dynamics, and MatGL, a framework for graph-based machine learning interatomic potentials. By leveraging optimized neighbor lists from Toolkit-Ops, both packages aim to achieve high-throughput operations without sacrificing flexibility or performance in simulations of materials and molecules.

ALCHEMI Toolkit-Ops is designed for seamless integration within the existing PyTorch-based atomistic simulation ecosystem. It provides GPU-accelerated, batched common operations—like neighbor list construction, DFT-D3 dispersion correction, and long-range electrostatics—exposed through a modular PyTorch API. This allows developers to rapidly integrate these accelerated kernels into their existing workflows and future atomistic simulation packages, improving computational efficiency.
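As a rough picture of what "batched" means here, several independent systems can be concatenated into flat arrays with a parallel index that records which system each atom belongs to, so a single pass processes all of them without mixing atoms across systems. The following is a plain-Python sketch under that assumption; the function names are hypothetical and the actual Toolkit-Ops data layout is not described in the source.

```python
import itertools
import math


def batch_systems(systems):
    """Concatenate per-system position lists into one flat list plus a
    parallel `batch_index` list recording which system each atom came from."""
    positions, batch_index = [], []
    for sys_id, coords in enumerate(systems):
        positions.extend(coords)
        batch_index.extend([sys_id] * len(coords))
    return positions, batch_index


def batched_pairs(positions, batch_index, cutoff):
    """Neighbor pairs over the flat arrays, never crossing system boundaries."""
    return [
        (i, j)
        for i, j in itertools.combinations(range(len(positions)), 2)
        if batch_index[i] == batch_index[j]
        and math.dist(positions[i], positions[j]) <= cutoff
    ]


system_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
system_b = [(0.5, 0.0, 0.0), (0.5, 1.0, 0.0), (5.0, 5.0, 5.0)]
positions, batch_index = batch_systems([system_a, system_b])
print(batched_pairs(positions, batch_index, cutoff=1.2))
```

Atoms 0 and 2 sit only 0.5 apart in space, but they belong to different systems, so the batch index keeps them from being paired. On a GPU the same idea would be expressed with tensors rather than Python lists.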

AIMNet Central, a repository for the MLIP AIMNet2, is also utilizing ALCHEMI Toolkit-Ops to improve its capabilities. Specifically, AIMNet Central is leveraging the toolkit to enhance the performance of its flexible long-range interaction models, utilizing NVIDIA-accelerated DFT-D3 calculations. This demonstrates the broad applicability of Toolkit-Ops across diverse MLIP frameworks and simulation methodologies.
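For orientation, the two-body DFT-D3 dispersion energy with Becke-Johnson damping is commonly written as below. The source does not say which damping variant Toolkit-Ops implements, so the choice of this form is an assumption. The three-body term is omitted, and the parameters s_n, a_1, and a_2 depend on the density functional being corrected.

```latex
E_{\mathrm{disp}} = -\frac{1}{2} \sum_{A \neq B} \sum_{n \in \{6, 8\}}
  s_n \, \frac{C_n^{AB}}{R_{AB}^{\,n} + \left(a_1 R_0^{AB} + a_2\right)^{n}},
\qquad R_0^{AB} = \sqrt{C_8^{AB} / C_6^{AB}}
```

Because the sum runs over atom pairs within a cutoff, evaluating it depends on the same neighbor list machinery as the MLIP itself, which is why a CPU-bound implementation can become a bottleneck.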

Performance Benchmarks of ALCHEMI Toolkit-Ops

NVIDIA ALCHEMI Toolkit-Ops is a foundational layer within the larger ALCHEMI ecosystem, designed to provide batched, GPU-accelerated common operations for AI-enabled atomistic simulations. It addresses performance bottlenecks in traditional CPU-centric workflows by providing optimized kernels for critical tasks like neighbor list construction (both O(N²) and O(N) variants), DFT-D3 dispersion correction, and long-range electrostatics (Ewald and PME). These operations are exposed through a modular, PyTorch-accessible API, facilitating rapid integration with existing simulation packages.

Performance benchmarks, conducted on an NVIDIA H100 80 GB GPU using ammonia clusters, demonstrate the speed of ALCHEMI Toolkit-Ops compared to models like MACE and TensorNet. Results, averaged over 20 runs, showcase accelerated performance for neighbor lists and DFT-D3 calculations. These tests are critical because they show ALCHEMI Toolkit-Ops enables fully parallelized performance and scalability for atomistic simulations, resolving limitations of hybrid workflows where models are GPU-accelerated but tooling remains CPU-bound.
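A timing harness that averages over repeated runs, as the benchmarks above do with 20 runs, might look like the following sketch. It is not NVIDIA's benchmark code; the helper name, the warm-up count, and the CPU workload used in the example are assumptions for illustration.

```python
import statistics
import time


def benchmark(fn, *args, runs=20, warmup=3):
    """Return (mean, stdev) wall-clock seconds of fn(*args) over `runs` calls."""
    for _ in range(warmup):  # warm-up calls are not timed
        fn(*args)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        # For GPU kernels, a device synchronization such as
        # torch.cuda.synchronize() would be needed here before reading the
        # clock, because kernel launches are asynchronous.
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)


# Stand-in CPU workload; a real benchmark would time the kernel under test.
mean_s, std_s = benchmark(sorted, list(range(100_000, 0, -1)))
print(f"mean {mean_s * 1e3:.3f} ms, stdev {std_s * 1e3:.3f} ms over 20 runs")
```

Reporting the spread alongside the mean helps show whether a measured speedup is larger than run-to-run noise.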

ALCHEMI Toolkit-Ops is being actively integrated with leading open-source packages to expand its reach and impact. Collaborations with TorchSim, MatGL, and AIMNet Central aim to enhance their capabilities. TorchSim will leverage optimized neighbor lists for high-throughput batched molecular dynamics, while MatGL will accelerate graph-based long-range interactions. AIMNet Central is using the toolkit to further enhance flexible long-range interaction models with NVIDIA-accelerated DFT-D3.

Quantum News
