Scientists are increasingly reliant on neural emulators to approximate solutions to complex physical systems, but ensuring these models genuinely capture underlying symmetries remains a significant challenge. James Amarel, Robyn Miller, and Nicolas Hengartner, all from Los Alamos National Laboratory, together with colleagues including Migliori, Casleton, and Skurikhin, have developed a novel diagnostic that assesses how well neural networks internalise physical symmetries, going beyond simple tests of forward-pass behaviour. Their research introduces an ‘influence-based’ method that measures how parameter updates propagate between symmetry-related states, effectively probing the local geometry of the loss landscape. By applying this technique to fluid-flow emulators, the team demonstrate that ‘gradient coherence’ is crucial both for generalising across symmetry transformations and for selecting loss-landscape basins compatible with those symmetries, offering a powerful new way to evaluate surrogate models and confirm they have truly learned the physics.
Rather than relying on forward-pass equivariance tests alone, the approach directly examines the learning dynamics to determine whether the model is genuinely learning the underlying physics or merely memorizing patterns in the training data. The study centres on understanding how neural networks learn to represent physical systems governed by equations like the Navier-Stokes equations, which exhibit symmetries such as translations, rotations, and reflections. These symmetries define equivalence classes, or “orbits”, of solutions, meaning that physically equivalent states should be treated similarly by a well-trained emulator.
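To make the notion of an orbit concrete, the sketch below enumerates the discrete rotation/reflection orbit of a single 2D state snapshot. This is an illustrative toy, not code from the paper, and it treats the field as a scalar (a rotated velocity field would additionally need its vector components rotated):

```python
import torch

def d4_orbit(field: torch.Tensor) -> list[torch.Tensor]:
    """Orbit of a 2D scalar field under the dihedral group D4: the
    grid-compatible subgroup of the rotations and reflections that
    leave the Navier-Stokes equations invariant."""
    orbit = []
    for k in range(4):                                 # 0/90/180/270 degree rotations
        rotated = torch.rot90(field, k, dims=(-2, -1))
        orbit.append(rotated)
        orbit.append(torch.flip(rotated, dims=(-1,)))  # each composed with a reflection
    return orbit

states = d4_orbit(torch.randn(1, 128, 128))  # one snapshot, eight equivalent states
assert len(states) == 8                      # |D4| = 8
```

A well-trained emulator should treat all eight of these states consistently; the diagnostic below asks whether its training updates do so as well.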
The researchers hypothesized that a model truly internalizing the solution operator would propagate information seamlessly across these orbits, evidenced by aligned gradients of the loss function when evaluated on symmetry-related inputs. Such alignment indicates that training updates constructively influence one another, preventing physically equivalent configurations from decoupling, a key indicator of robust generalization. Measuring this cross-influence therefore provides a diagnostic that goes beyond standard forward-pass checks, revealing the extent to which training updates are physically consistent. This work contributes to three key areas of machine learning research: interpretability, generalization theory, and scientific machine learning.
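As a minimal sketch of the alignment idea (not the paper's implementation), one can compare the parameter gradient of the loss on an input with the gradient on a symmetry-transformed copy. The toy model, data, and the cosine-similarity measure below are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Toy stand-in for a fluid-flow emulator; any differentiable map will do.
emulator = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.GELU(),
                         nn.Conv2d(8, 1, 3, padding=1))

def loss_gradient(model: nn.Module, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Flattened parameter gradient of the per-example MSE loss."""
    loss = nn.functional.mse_loss(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

# A state/target pair and its symmetry-related partner (a 90-degree rotation).
x, y = torch.randn(1, 1, 32, 32), torch.randn(1, 1, 32, 32)
x_rot = torch.rot90(x, 1, dims=(-2, -1))
y_rot = torch.rot90(y, 1, dims=(-2, -1))

g = loss_gradient(emulator, x, y)
g_rot = loss_gradient(emulator, x_rot, y_rot)

# Alignment near +1 means an update driven by x also reduces the loss
# on its rotated counterpart -- the coherence the authors probe.
alignment = torch.nn.functional.cosine_similarity(g, g_rot, dim=0)
print(f"gradient alignment across the orbit: {alignment.item():+.3f}")
```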
By leveraging influence functions, the team advances interpretability methods, probing training dynamics rather than outputs alone. The research also frames symmetry learning as a problem of basin selection within the loss landscape, governed by orbit-wise gradient coherence, thereby informing generalization theory. Within scientific machine learning, it provides a principled diagnostic for evaluating whether neural emulators have genuinely learned the symmetries inherent in the solution operator. Together, these threads yield a concise framework for evaluating symmetry-consistent behaviour via both forward-pass consistency and probes of the learning dynamics, a powerful tool for assessing and improving the generalization capabilities of neural emulators.
A Symmetry-Aware Gradient Diagnostic for Neural Emulators
The study pioneered a method for quantifying a model’s ability to generalize across symmetry orbits by framing symmetry learning as a problem of basin selection within the loss landscape. The researchers engineered a symmetry-aware gradient diagnostic that leverages influence functions to probe training dynamics directly, extending previous explainability frameworks beyond analyses of forward-pass behaviour alone. The approach employs the Lie derivative of the cost function along gradient directions induced by individual test examples, expressed as $V^\mu = -\chi^{\mu\nu}\,\partial_\nu C_x$, to define a vector field representing the influence of each example on the loss. To broaden the scope, the analysis extended to models trained on velocity fields generated by the Navier-Stokes equations, using NS-BB, NS-Gauss, and NS-Sines initial conditions, which represent a qualitatively different feature space. Models were trained in distributed mode on two 40GB A100 GPUs using Lux.jl and Zygote.jl, with three seeds controlling initialization and dataset splits; results are reported with quantile-range bars to capture variability.
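The contraction in that formula has a simple first-order reading: a gradient step induced by example x changes the cost on another example z by approximately $-\eta\,\langle\nabla C_z, \nabla C_x\rangle$. The sketch below checks this numerically with $\chi^{\mu\nu}$ taken as the identity (plain gradient descent); the model, learning rate, and data are illustrative assumptions, not the paper's setup:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 16))
mse = nn.functional.mse_loss

def grads(x, y):
    """Parameter gradients of the cost C evaluated on one example."""
    return torch.autograd.grad(mse(model(x), y), list(model.parameters()))

x, yx = torch.randn(1, 16), torch.randn(1, 16)  # example inducing the update
z, yz = torch.randn(1, 16), torch.randn(1, 16)  # example being influenced

eta = 1e-3
gx, gz = grads(x, yx), grads(z, yz)

# First-order influence of x on z: -eta * <grad C_z, grad C_x>, i.e. the
# directional derivative of C_z along the update induced by x.
predicted = -eta * sum((a * b).sum() for a, b in zip(gz, gx))

loss_before = mse(model(z), yz).item()
with torch.no_grad():                 # take the gradient step induced by x
    for p, g in zip(model.parameters(), gx):
        p -= eta * g
loss_after = mse(model(z), yz).item()

print(f"predicted change in C_z: {predicted.item():+.3e}")
print(f"actual change in C_z:    {loss_after - loss_before:+.3e}")
```

When x and z are symmetry-related, a negative value on both lines indicates that the update helps the partner state, the coherent propagation the diagnostic is designed to detect.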
Gradient Coherence Reveals Symmetry Internalization in Emulators
The researchers developed a novel method for evaluating symmetry internalization in neural emulators of partial differential equation solution operators. Experiments revealed that the proposed orbit-wise gradient coherence is a local property of the trained model’s loss landscape, offering a new technique for determining whether surrogate models have internalized the symmetry properties of the known solution operator. Results demonstrate that violations of symmetry can manifest not only in representation space but also in the local geometry of the loss landscape, hindering a coherent update structure across symmetry-related inputs. In these experiments, each state snapshot comprised a 128 × 128 grid of mass density, Cartesian momentum density, and energy density.
The data show that, despite having fewer parameters, the ViTs consistently outperformed the UNets on test metrics, suggesting a more efficient learning process. Measurements confirm that the influence function, expressed as the Lie derivative of the cost along gradient directions, effectively quantifies whether a gradient update induced by one example decreases or increases the loss on another. When applied to symmetry-related inputs, it measures whether learning signals propagate coherently along symmetry orbits, with the neural tangent kernel serving as a Fisher-information analog. The researchers evaluated the influence function across six test mini-batches and standardized the resulting influence matrices by normalizing with respect to the empirical variance, so that unity marks the baseline for unstructured stochastic variability and deviations indicate influence beyond random noise.
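One plausible reading of that standardization step, sketched below under assumed conventions (the paper's exact normalization may differ): divide the entries of each influence matrix by their empirical standard deviation, so magnitudes near 1 correspond to noise-level coupling and larger magnitudes flag structured influence.

```python
import torch

def standardize(influence: torch.Tensor) -> torch.Tensor:
    """Rescale an influence matrix by the empirical standard deviation
    of its entries, so unity marks the level of unstructured stochastic
    variability. An assumed convention, not the paper's exact recipe."""
    return influence / influence.std()

# Mock influence matrices for six test mini-batches of 64 examples each,
# with noise plus an injected block of coherent cross-influence.
influence = 0.1 * torch.randn(6, 64, 64)
influence[:, :8, :8] += 0.5  # examples 0-7 strongly couple to one another

z = standardize(influence)
# Entries well beyond magnitude 1 indicate influence above random noise.
print(f"fraction of entries with |z| > 3: {(z.abs() > 3).float().mean().item():.3f}")
```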
Gradient Coherence Links Symmetry Learning to Generalization
The findings establish that test-time equivariance error is linked to how training dynamics distribute influence across symmetry orbits, revealing a distinction between models that genuinely learn shared structure and those that merely assemble local estimators. While symmetry-agnostic architectures can achieve low error, they may allocate influence unevenly, achieving accurate interpolation without internalising the underlying physics; high predictive accuracy, in other words, does not guarantee robustness under symmetry transformations. Conversely, manifestly equivariant layers, like the translation-equivariant convolutions in UNets, promote data efficiency and generalisation through uniform gradient coupling, though this can sometimes slow down optimisation. The authors acknowledge that their diagnostics characterise symmetry learning relative to the training distribution, meaning biases in data representation can influence the measured coherence.
They also note that transformers, while scalable, can converge rapidly with specialised gradients that sacrifice faithful symmetry representation. Future research should focus on developing approximate or relaxed symmetry mechanisms that balance the need for generalisation with the flexibility required for efficient optimisation, potentially combining the strengths of transformers and equivariant modelling. This work offers a valuable diagnostic for building trust in scientific machine learning systems, particularly where robustness under symmetry is paramount.
👉 More information
🗞 Loss Landscape Geometry and the Learning of Symmetries: Or, What Influence Functions Reveal About Robust Generalization
🧠 arXiv: https://arxiv.org/abs/2601.20172
