Scientists are tackling the computational challenges of modelling complex 3D atomistic systems with a new approach to Equivariant Graph Neural Networks (EGNNs). Lin Huang, Chengxiang Huang, and Ziang Wang, all from IQuest Research, alongside colleagues Du, Wang, et al., present E2Former-V2, a scalable model that improves efficiency by integrating algebraic sparsity with hardware-aware execution. The research matters because it overcomes scalability bottlenecks in existing EGNNs, which often struggle with the computational cost of dense tensor products; the E2Former-V2 attention kernel achieves a 20× improvement in TFLOPS over standard implementations. Through innovations such as Equivariant Axis-Aligned Sparsification (EAAS) and On-the-Fly Equivariant Attention, the team demonstrates that large equivariant models can be trained efficiently on standard GPU platforms, paving the way for faster and more accurate molecular simulations.
Traditional EGNNs struggle with scalability because they explicitly construct geometric features and compute dense tensor products on every edge, which hinders their application to larger systems. The new research addresses these bottlenecks by integrating algebraic sparsity with hardware-aware execution. By eliminating the need to materialize edge tensors and maximizing SRAM utilization, the On-the-Fly Equivariant Attention kernel achieves a 20× improvement in TFLOPS compared to standard implementations. This optimisation is crucial: traditional EGNNs suffer from severe latency due to their edge-centric nature, whereas standard Transformers have already addressed similar issues through hardware-aligned execution. Extensive experiments conducted on the SPICE and OMol25 datasets demonstrate that E2Former-V2 maintains predictive performance comparable to existing methods while significantly accelerating inference.
The study reports that E2Former-V2 achieves O(|V|) activation memory, linear in the number of atoms, while preserving theoretical exactness. The design combines an SO(2) rotational basis for sparsification with the custom equivariant attention kernel, with interactions computed efficiently in a fused Triton kernel. As the paper's figures illustrate, E2Former-V2's performance advantage becomes increasingly pronounced as the number of atoms (N) grows, highlighting its efficacy on large-scale atomic structures previously considered computationally prohibitive. The work establishes that large equivariant transformers can be trained efficiently on widely accessible GPU platforms, opening new avenues for research in areas like drug discovery and materials science.
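To give a feel for why an axis-aligned, SO(2)-style basis leads to sparsity, the sketch below uses a general property of spherical harmonics rather than anything taken from the E2Former-V2 code: once an edge direction is rotated onto the z axis, only the m = 0 components of its angular features remain non-zero. The helper names (`real_sph_harm`, `rotation_to_z`, `apply`, `nonzero_keys`), the unnormalised l = 1, 2 formulas, and the example edge are all illustrative assumptions; whether EAAS exploits this property in exactly this way is not stated in the article.

```python
import math

def real_sph_harm(x, y, z):
    """Unnormalised real spherical harmonics (l = 1, 2) of a unit vector, keyed by (l, m)."""
    return {
        (1, -1): y, (1, 0): z, (1, 1): x,
        (2, -2): x * y, (2, -1): y * z, (2, 0): 3 * z * z - 1,
        (2, 1): x * z, (2, 2): x * x - y * y,
    }

def rotation_to_z(v):
    """Rotation matrix (Rodrigues form) that maps the unit vector v onto the z axis."""
    x, y, z = v
    if z < -1 + 1e-12:
        # v is antiparallel to z: rotate by pi about the x axis instead.
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    k = 1.0 / (1.0 + z)
    # R = I + K + K^2 / (1 + z), where K is the cross-product matrix of v x z_hat.
    return [
        [1 - k * x * x, -k * x * y, -x],
        [-k * x * y, 1 - k * y * y, -y],
        [x, y, 1 - k * (x * x + y * y)],
    ]

def apply(rot, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return tuple(sum(r * c for r, c in zip(row, v)) for row in rot)

def nonzero_keys(coeffs, tol=1e-9):
    """Return the (l, m) labels whose coefficient is non-zero up to a tolerance."""
    return sorted(key for key, val in coeffs.items() if abs(val) > tol)

# Hypothetical edge direction (already unit length).
edge = (1 / 3, 2 / 3, 2 / 3)

general = nonzero_keys(real_sph_harm(*edge))
aligned = nonzero_keys(real_sph_harm(*apply(rotation_to_z(edge), edge)))

print(len(general), "non-zero components in the global frame")
print(aligned, "remain in the edge-aligned frame")
```

In the aligned frame the only remaining freedom is a rotation about the edge axis, which is the SO(2) group; operations written in that frame can skip the components that are known to vanish.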
Furthermore, the research demonstrates a ∼6× speedup during the convolution stage through EAAS, and a 20× improvement in TFLOPS with the On-the-Fly Equivariant Attention kernel. Together, these techniques improve computational efficiency and reduce memory requirements, making it possible to simulate larger and more complex systems than previously feasible. The code for E2Former-V2 is publicly available, facilitating further research and development in equivariant machine learning for atomistic modeling, and is accessible at https://github.
The research team addressed limitations inherent in existing methods that rely on explicit construction of geometric features or dense tensor products calculated on every edge of a graph. The On-the-Fly Equivariant Attention kernel eliminates the need to materialize edge tensors, thereby reducing activation memory from O(|E|) to O(|V|), where |E| is the number of edges and |V| the number of atoms, and maximizing SRAM utilization. Experiments with this kernel showed a 20× improvement in TFLOPS compared to standard implementations. The team also quantified the impact of their design by comparing the latency of traditional EGNNs with that of standard attention mechanisms, which revealed severe latency in the former.
FlashAttention, a streaming, tile-based attention mechanism, was used as a benchmark, consistently outperforming traditional EGNNs across varying system sizes. The advantage of E2Former-V2, in turn, became increasingly pronounced as the number of atoms (N) grew, highlighting its efficacy in handling large-scale atomic structures. The work demonstrated that E2Former-V2 achieves O(|V|) activation memory while maintaining theoretical exactness, a result enabled by the combination of EAAS and the custom attention kernel. Extensive experiments conducted on the SPICE and OMol25 datasets confirmed that E2Former-V2 maintains comparable predictive performance while notably accelerating inference, supporting its potential for efficient training on widely accessible GPU platforms.
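The streaming, tile-based idea can be sketched in a few lines of plain Python. This is a simplified, non-equivariant illustration of the online-softmax trick used by FlashAttention-style kernels, not the E2Former-V2 Triton kernel: neighbours are visited tile by tile, and each node keeps only a running maximum, a running normaliser, and a running output vector, so no per-edge tensor is stored. The function names, the toy features, the neighbour lists, and the tile size are assumptions made for the example.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def attention_materialized(q, k, v, neighbors):
    """Reference version: stores a score and a weight for every edge before reducing.
    In a batched implementation these lists would be tensors with |E| entries."""
    dim = len(v[0])
    out = []
    for i, nbrs in enumerate(neighbors):
        scores = [dot(q[i], k[j]) for j in nbrs]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j][d] for w, j in zip(weights, nbrs)) / total
                    for d in range(dim)])
    return out

def attention_streaming(q, k, v, neighbors, tile=2):
    """Online softmax: per-node state is a running max, a normaliser, and an accumulator.
    Assumes every node has at least one neighbour."""
    dim = len(v[0])
    out = []
    for i, nbrs in enumerate(neighbors):
        run_max, denom, acc = -math.inf, 0.0, [0.0] * dim
        for start in range(0, len(nbrs), tile):
            block = nbrs[start:start + tile]
            scores = [dot(q[i], k[j]) for j in block]
            new_max = max(run_max, max(scores))
            # Rescale what has been accumulated so far to the new maximum;
            # exp(-inf) is 0.0, so the first tile starts from a clean state.
            scale = math.exp(run_max - new_max)
            denom *= scale
            acc = [a * scale for a in acc]
            for s, j in zip(scores, block):
                w = math.exp(s - new_max)
                denom += w
                acc = [a + w * vj for a, vj in zip(acc, v[j])]
            run_max = new_max
        out.append([a / denom for a in acc])
    return out

# Toy graph: 4 fully connected nodes with 2-dimensional features (hypothetical values).
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
k = [[0.5, -1.0], [2.0, 0.3], [-0.7, 1.5], [0.1, 0.1]]
v = [[1.0, 2.0], [3.0, -1.0], [0.0, 4.0], [-2.0, 0.5]]
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]

ref = attention_materialized(q, k, v, neighbors)
stream = attention_streaming(q, k, v, neighbors, tile=2)
max_diff = max(abs(a - b) for r, s in zip(ref, stream) for a, b in zip(r, s))
print("largest difference between the two versions:", max_diff)
```

Both functions compute the same softmax-weighted sum, so the streaming version is exact up to floating-point rounding; what changes is that its working state scales with the number of nodes rather than the number of edges.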
E2Former-V2 accelerates 3D atomistic modelling significantly
Scientists have developed E2Former-V2, a new scalable architecture for modelling 3D atomistic systems using equivariant graph neural networks. The architecture addresses critical scalability bottlenecks in existing methods, which often rely on explicit construction of geometric features or dense tensor products on every edge. Extensive experiments conducted on the SPICE and OMol25 datasets demonstrate that E2Former-V2 maintains predictive performance comparable to existing models while significantly accelerating inference, with its attention kernel achieving a 20× improvement in TFLOPS.
This work establishes that large equivariant neural networks can be efficiently trained using widely accessible GPU platforms, opening avenues for more complex and accurate simulations of molecular dynamics. The authors acknowledge a limitation in that the current implementation is tailored for GPU acceleration, and further work could explore broader hardware compatibility. Future research directions include extending the method to even larger systems and investigating its application to diverse scientific domains beyond molecular modelling.
👉 More information
🗞 E2Former-V2: On-the-Fly Equivariant Attention with Linear Activation Memory
🧠 ArXiv: https://arxiv.org/abs/2601.16622
