Diffusion Models Recover Geological Parameters, Cutting Carbon Capture Simulation Error Eleven-Fold with Only 25% of the Data

Scientists are tackling the challenge of accurately modelling subsurface fluid flow for Carbon Capture and Storage (CCS) in new research led by Xin Ju, Jiachen Yao from the California Institute of Technology, and Anima Anandkumar, also of the California Institute of Technology, working with Sally M. Benson from Stanford University, Gege Wen from Imperial College London, and colleagues. The study introduces Fun-DDPS, a generative framework that combines function-space diffusion models with differentiable neural operator surrogates to improve both forward and inverse modelling. It addresses the ill-posed nature of inverse problems in CCS, which are often hampered by limited observational data, and reports an eleven-fold reduction in relative error with sparse data compared to standard surrogate models. The authors also validate the method against an established statistical technique, Rejection Sampling.

Accurate characterisation of what happens underground is vital for safely and effectively storing carbon dioxide, but this process is hampered by limited observational data and the complex nature of geological formations.

The research introduces a generative approach that combines function-space diffusion models with differentiable neural operator surrogates. Fun-DDPS learns a realistic range of possible geological configurations, then efficiently predicts how fluids will flow through them, even with sparse data.

The core innovation lies in decoupling the learning of geological characteristics from the physics governing fluid flow. Methods that learn both jointly often struggle, producing unrealistic or inconsistent predictions, particularly when data is scarce. By training a diffusion model to capture the likely range of geological parameters and a separate neural operator to simulate the physics, Fun-DDPS achieves an eleven-fold reduction in forward modelling error when only 25% of the observational data is available: a relative error of 7.7%, against 86.9% for a standard surrogate model.

This capability is particularly important for large-scale CCS projects where gathering comprehensive data is expensive and challenging. Furthermore, this work provides the first rigorous validation of diffusion-based inverse solvers against a gold standard technique called Rejection Sampling, demonstrating high statistical accuracy with a Jensen-Shannon divergence below 0.06.

Importantly, the generated geological models are physically consistent and avoid the artificial distortions often seen in other approaches, while the method is four times more sample-efficient than Rejection Sampling. This advancement promises more reliable and efficient methods for monitoring and managing subsurface carbon storage, supporting wider deployment of this critical climate technology.

Diffusion modelling learns geological priors and accelerates subsurface flow simulations

A single-channel function-space diffusion model underpins the research. It is used to learn a prior distribution over the geological parameters, that is, the geomodel itself. The model is trained by progressively adding noise to training data until only noise remains, then learning to reverse this process so that it can generate new, realistic geomodel samples.
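As a rough illustration of the noising half of that process, the sketch below applies a generic variance-preserving noise schedule to a one-dimensional toy "geomodel". The linear schedule, the function names, and the use of independent Gaussian noise per cell are all assumptions made for illustration. The article does not say which schedule or noise type Fun-DDPS uses, and function-space formulations often use spatially correlated noise instead.

```python
import math
import random

def alpha_bar(t, num_steps, beta_min=1e-4, beta_max=0.02):
    """Fraction of signal variance left after t steps of a linear beta schedule (assumed)."""
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_min + (beta_max - beta_min) * (s - 1) / (num_steps - 1)
        prod *= 1.0 - beta
    return prod

def add_noise(field, t, num_steps, rng):
    """Sample x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps, with eps ~ N(0, 1) per cell."""
    a = alpha_bar(t, num_steps)
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in field]

rng = random.Random(0)
geomodel = [math.sin(2 * math.pi * i / 32) for i in range(32)]  # toy 1-D field
slightly_noisy = add_noise(geomodel, 10, 1000, rng)    # still close to the original
mostly_noise = add_noise(geomodel, 1000, 1000, rng)    # almost no signal left
```

A trained diffusion model learns to undo these steps one at a time, which is what lets it turn fresh noise into a plausible geomodel.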

Crucially, this approach avoids the limitations of traditional methods that struggle with high-dimensional, non-Gaussian subsurface parameters. Choosing a function-space diffusion model, rather than a pixel-based image generation technique, allows the researchers to model the continuous fields representing geological properties directly, preserving their inherent characteristics.

To efficiently guide the diffusion process during inverse modelling, a Local Neural Operator (LNO) surrogate was independently trained to approximate the forward physics governing carbon dioxide plume migration. Neural operators represent a class of deep learning models designed to map between function spaces, enabling rapid prediction of dynamic states given a geological parameter field.

The LNO was selected for its ability to capture local dependencies within the flow field, improving accuracy and computational efficiency. This decoupling of prior learning and forward modelling is a key methodological innovation, allowing the diffusion model to robustly recover missing information while the surrogate provides gradient-based guidance for data assimilation.

During inference, sparse observations of the dynamics field are compared with the LNO's predictions, and the gradients of that mismatch are backpropagated through the LNO and into the diffusion model. This steers the generation of geological parameters towards solutions consistent with the observed data. The entire framework was tested on synthetic CCS modelling datasets designed to mimic the complexities of real-world subsurface environments. The synthetic setting gives full control over the ground truth parameters and makes quantitative evaluation of the method's performance possible.
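The guidance step described above can be sketched in miniature. The example below is not the authors' implementation. It swaps the neural operator for a hypothetical three-cell smoothing function with a hand-derived gradient, and it shows only the data-misfit update, whereas a full sampler would interleave such updates with denoising steps. In practice the gradient would come from automatic differentiation through the trained surrogate.

```python
import math

def surrogate(m):
    """Toy stand-in for the neural operator: u_i = tanh(mean of m over a 3-cell window)."""
    n = len(m)
    u = []
    for i in range(n):
        window = [m[j] for j in (i - 1, i, i + 1) if 0 <= j < n]
        u.append(math.tanh(sum(window) / len(window)))
    return u

def misfit_and_grad(m, obs_idx, obs_val):
    """Return 0.5 * sum (u_i - d_i)^2 over observed cells and its gradient w.r.t. m."""
    n = len(m)
    u = surrogate(m)
    loss = 0.0
    grad = [0.0] * n
    for i, d in zip(obs_idx, obs_val):
        r = u[i] - d
        loss += 0.5 * r * r
        window = [j for j in (i - 1, i, i + 1) if 0 <= j < n]
        # Chain rule: d tanh(z)/dz = 1 - tanh(z)^2 and dz/dm_j = 1 / len(window).
        local = r * (1.0 - u[i] ** 2) / len(window)
        for j in window:
            grad[j] += local
    return loss, grad

def guided_step(m, obs_idx, obs_val, step_size=0.5):
    """Nudge the current sample towards agreement with the sparse observations."""
    _, grad = misfit_and_grad(m, obs_idx, obs_val)
    return [x - step_size * g for x, g in zip(m, grad)]

true_m = [0.1 * i for i in range(10)]
obs_idx = [1, 4, 8]                                  # sparse observation locations
obs_val = [surrogate(true_m)[i] for i in obs_idx]
m = [0.0] * 10                                       # initial guess
loss_before, _ = misfit_and_grad(m, obs_idx, obs_val)
for _ in range(200):
    m = guided_step(m, obs_idx, obs_val)
loss_after, _ = misfit_and_grad(m, obs_idx, obs_val)
```

In this toy setup cell 6 lies outside every observation window, so the data alone never moves it. That is the ill-posedness the learned prior is meant to resolve.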

Fun-DDPS demonstrates high accuracy and efficiency in limited-data forward modelling

Forward modelling with Fun-DDPS yielded a relative error of 7.7% when only 25% of the expected observational data was available. Standard surrogate models exhibited a relative error of 86.9% under the same conditions, so Fun-DDPS reduces the error roughly eleven-fold.
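The eleven-fold figure follows directly from the two reported errors. The short sketch below redoes that arithmetic and shows one common way to compute a relative error. The relative L2 norm is an assumption here, since the article does not state the exact metric used.

```python
def relative_l2_error(pred, truth):
    """||pred - truth||_2 / ||truth||_2, one common definition of relative error (assumed)."""
    num = sum((p - t) ** 2 for p, t in zip(pred, truth)) ** 0.5
    den = sum(t ** 2 for t in truth) ** 0.5
    return num / den

baseline_error = 0.869   # standard surrogate with 25% of the data (from the article)
fun_ddps_error = 0.077   # Fun-DDPS with 25% of the data (from the article)
improvement = baseline_error / fun_ddps_error   # about 11.3, reported as eleven-fold
```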

This substantial reduction in error confirms the system's ability to function effectively even with severely limited data, a critical capability where deterministic methods typically fail. The research further reports a Jensen-Shannon divergence of less than 0.06 between the posteriors produced by Fun-DDPS and Fun-DPS and the Rejection Sampling reference posterior, signifying a high degree of statistical accuracy in both approaches.
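For readers unfamiliar with the metric, the Jensen-Shannon divergence measures how far apart two probability distributions are, with zero meaning they are identical. The sketch below computes it for two histograms. The base-2 logarithm (which bounds the value at 1) and the binning are assumptions, as the article does not say which convention or discretisation the authors used.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence in bits; bins with p_i = 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base-2 log)."""
    mix = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

def histogram(samples, edges):
    """Normalised histogram over half-open bins [edges[b], edges[b + 1])."""
    counts = [0] * (len(edges) - 1)
    for s in samples:
        for b in range(len(counts)):
            if edges[b] <= s < edges[b + 1]:
                counts[b] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

edges = [0.0, 0.5, 1.0]
reference = histogram([0.1, 0.2, 0.7, 0.8], edges)   # e.g. reference posterior samples
approx = histogram([0.1, 0.6, 0.7, 0.9], edges)      # e.g. samples from an approximate solver
jsd = js_divergence(reference, approx)
```

Under this convention, a value below 0.06 sits very close to the "identical" end of the zero-to-one range.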

Notably, Fun-DDPS generated physically consistent realisations, avoiding the high-frequency artefacts frequently observed in joint-state baseline models. This consistency comes alongside a four-fold increase in sample efficiency compared to Rejection Sampling, so far fewer simulations are needed to characterise the posterior.

Both Fun-DDPS and the joint-state baseline, Fun-DPS, demonstrated comparable performance in approximating the true posterior distribution, as quantified by the low Jensen-Shannon divergence values. The decoupling of geological prior learning from forward physics approximation within Fun-DDPS appears to be key to generating realistic subsurface models.

The study validated diffusion-based inverse solvers against asymptotically exact Rejection Sampling posteriors, which the authors describe as a first for this type of research. This validation supports the reliability and accuracy of the diffusion-based approach in characterising subsurface geological heterogeneity. Reconstructing subsurface parameters accurately from sparse observational data is paramount for effective carbon capture and storage, and these results mark a significant step towards that goal.
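Rejection Sampling earns its "asymptotically exact" label because it only ever keeps prior samples in proportion to how well they explain the data. The sketch below shows the idea on a hypothetical one-parameter problem with a linear "simulator" and Gaussian observation noise. None of these choices come from the paper, where each proposal would require a full flow simulation.

```python
import math
import random

def rejection_sample_posterior(forward, observed, noise_std, prior_draw, n_proposals, rng):
    """Draw from the prior and keep each draw with probability equal to its likelihood,
    exp(-0.5 * ((forward(m) - observed) / noise_std)^2), which never exceeds 1."""
    accepted = []
    for _ in range(n_proposals):
        m = prior_draw(rng)
        accept_prob = math.exp(-0.5 * ((forward(m) - observed) / noise_std) ** 2)
        if rng.random() < accept_prob:
            accepted.append(m)
    return accepted

rng = random.Random(0)
n_proposals = 20000
posterior = rejection_sample_posterior(
    forward=lambda m: 2.0 * m,               # toy linear "simulator" (assumption)
    observed=1.0,
    noise_std=0.2,
    prior_draw=lambda r: r.gauss(0.0, 1.0),  # standard normal prior (assumption)
    n_proposals=n_proposals,
    rng=rng,
)
acceptance_rate = len(posterior) / n_proposals
posterior_mean = sum(posterior) / len(posterior)
```

Most proposals are discarded even in this tiny example, which is why the method serves well as a reference but becomes costly when every proposal needs a simulation.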

The Bigger Picture

The persistent challenge of accurately modelling subsurface environments has long hampered effective carbon capture and storage. For decades, the difficulty lay not in the physics itself, but in the sheer lack of data available to constrain those models. Geological formations are notoriously opaque, and direct observation is limited and expensive.

This work represents a significant step forward by embracing a fundamentally different approach: instead of striving for a single ‘correct’ answer, the researchers have developed a system that generates a distribution of plausible subsurface scenarios. This generative framework, Fun-DDPS, cleverly combines the power of diffusion models, previously prominent in image generation, with neural network surrogates to efficiently simulate fluid flow.

The substantial improvement in forward modelling accuracy, particularly with limited observational data, is noteworthy. It’s not simply about achieving lower error rates; it’s about building confidence in predictions when faced with inherent uncertainty. The validation against rejection sampling, a gold standard for probabilistic inference, lends further credibility to the approach.

However, the reliance on synthetic datasets remains a limitation. Real-world geological formations are far more complex and heterogeneous than anything currently simulated. The statistical hyperparameters used to generate these synthetic fields, while varied, may not fully capture the breadth of geological possibilities.

Furthermore, the computational cost, even with a powerful GPU, suggests scalability could be an issue for truly large-scale CCS projects.

Looking ahead, the most exciting prospect is the integration of this generative approach with real-world data streams such as seismic surveys, well logs, and even microseismic monitoring. Combining these observations with the diffusion prior could enable dynamic reservoir management, allowing operators to adapt injection strategies in real time and maximise storage capacity. The potential extends beyond CCS, too, with possible applications in groundwater modelling, geothermal energy exploration, and even predicting volcanic activity.

👉 More information
🗞 Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage
🧠 ArXiv: https://arxiv.org/abs/2602.12274

Rohail T.
