Scientists are increasingly focused on improving global streamflow forecasting to enhance both flood prediction and sustainable water resource management. Maria Luisa Taccari, Kenza Tazi, and Oisín M. Morrison, working with colleagues at the European Centre for Medium-Range Weather Forecasts (ECMWF) in Reading, UK, and Bonn, Germany, present AIFL (Artificial Intelligence for Floods), a deterministic LSTM-based model for global daily streamflow forecasting. The research addresses a critical gap in current data-driven models, which often lose skill when they move from historical data to real-time forecasts. AIFL uses a two-stage training strategy: pre-training on 40 years of ERA5-Land reanalysis, followed by fine-tuning on operational Integrated Forecasting System (IFS) forecasts. According to the authors, this bridges the reanalysis-to-forecast domain shift and establishes a transparent, reproducible baseline within the CARAVAN ecosystem, with high predictive skill and accuracy competitive with existing global systems.

AIFL delivers a single, consistent streamflow forecast for each of nearly 19,000 river basins worldwide, offering a transparent and readily available tool for water management and disaster preparedness.

Data-driven models often decline in performance when moved from research settings to real-world operational forecasts. AIFL, short for Artificial Intelligence for Floods, is a deterministic model whose training approach targets this problem. Trained on data from 18,588 river basins within the CARAVAN dataset, it aims to bridge the gap between historical simulations and the uncertainties inherent in live weather predictions.

Achieving accurate forecasts across vast geographical areas requires overcoming the limitations of both traditional, physics-based hydrological models and emerging machine learning techniques. Conventional models struggle with representing the complexities of water flow and depend on high-quality climate data, while many machine learning approaches lack the ability to accurately translate from past data to future predictions.

AIFL employs a two-stage training strategy. It first learns from 40 years of ERA5-Land reanalysis data to build a general representation of hydrological processes, and is then fine-tuned on operational IFS forecasts. Independent testing on data from 2021 to 2024 shows high predictive skill, with a median modified Kling-Gupta Efficiency (KGE’) of 0.66 and a median Nash-Sutcliffe Efficiency (NSE) of 0.53.

The model demonstrates a particular strength in identifying extreme events, offering a reliable baseline for the global hydrological community. Accurately predicting peak flows is essential for effective flood warning systems, allowing communities time to prepare and mitigate damage. By combining the strengths of both historical data analysis and real-time forecast adaptation, AIFL offers a streamlined and operationally sound approach to global streamflow prediction, potentially enhancing disaster preparedness and water management strategies worldwide, particularly as climate change intensifies hydrological extremes.

Pre-training with reanalysis and fine-tuning with operational forecasts improves streamflow prediction accuracy

A two-stage training strategy underpinned the development of AIFL, a deterministic LSTM-based model for global daily streamflow forecasting. The model was first pre-trained on 40 years of ERA5-Land reanalysis data, spanning 1980 to 2019, to capture fundamental hydrological processes. It was then fine-tuned on operational IFS forecasts so that it adapts to the inputs it receives in real time. The deterministic LSTM architecture is a move away from probabilistic approaches: the model generates a single, best-estimate streamflow forecast for each basin.
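The idea of pre-training on one input source and then fine-tuning on a biased one can be illustrated with a toy example. The sketch below is not the AIFL training code: it swaps the LSTM for a one-variable linear model fitted by gradient descent, and all data are synthetic. The bias applied to the "forecast" inputs, the learning rates, and the epoch counts are assumptions chosen only for illustration.

```python
import random

def train(w, b, xs, ys, lr, epochs):
    """Plain gradient descent on mean squared error for y ~ w*x + b."""
    n = len(xs)
    for _ in range(epochs):
        gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w, b = w - lr * gw, b - lr * gb
    return w, b

def mse(w, b, xs, ys):
    """Mean squared error of the linear model on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)

# Synthetic "true" precipitation and a made-up linear streamflow response.
truth = [rng.uniform(0, 10) for _ in range(200)]
flow = [0.6 * p + 1.0 for p in truth]

# Stage 1 inputs: reanalysis-like, close to the truth.
reanalysis = [p + rng.gauss(0, 0.1) for p in truth]
# Stage 2 inputs: forecast-like, with a hypothetical systematic bias.
forecast = [1.2 * p + 0.5 + rng.gauss(0, 0.1) for p in truth]

# Stage 1: pre-train on the reanalysis-like inputs.
w, b = train(0.0, 0.0, reanalysis, flow, lr=0.01, epochs=2000)
before = mse(w, b, forecast, flow)  # error when fed forecast inputs as-is

# Stage 2: fine-tune on forecast-like inputs, starting from stage-1 weights.
w, b = train(w, b, forecast, flow, lr=0.005, epochs=2000)
after = mse(w, b, forecast, flow)

print(f"MSE on forecast inputs before fine-tuning: {before:.3f}")
print(f"MSE on forecast inputs after fine-tuning:  {after:.3f}")
```

The pre-trained weights give larger errors when fed the biased inputs, and the second stage reduces that error by absorbing the bias into the weights. The same principle would apply to a much larger network, although the details of the real training setup are not described here.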

This methodology bridges the gap between historical reanalysis data and real-time forecast products. It also differs from many existing global hydrological models, which rely on complex, physically based frameworks. Instead of calibrating parameters against geophysical maps, the LSTM network learns directly from the relationships observed in the training data.

Forcing data, including meteorological variables, was ingested into the model alongside static attributes such as topography and land cover, allowing AIFL to implicitly capture their influence on streamflow without explicit parameterisation. This work represents the first global model trained end-to-end within the CARAVAN ecosystem, providing a streamlined and reproducible forcing pipeline. Prioritising operational robustness and the ability to detect extreme events reliably, the research team aimed to create a baseline model accessible to the wider hydrological community.
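One common way to feed static attributes to an LSTM alongside time-varying forcing is to append the same static vector to every time step. Whether AIFL does exactly this is an assumption; the sketch below only shows the general pattern, and the function name, variables, and values are hypothetical.

```python
from typing import List, Sequence

def build_lstm_inputs(
    forcing: Sequence[Sequence[float]],
    static: Sequence[float],
) -> List[List[float]]:
    """Append the basin's static attributes to each daily forcing vector."""
    return [list(step) + list(static) for step in forcing]

# Three days of hypothetical forcing: [precipitation mm/day, temperature degC]
forcing = [[0.0, 12.1], [8.4, 10.3], [2.1, 11.0]]
# Hypothetical static attributes: [mean elevation m, forest fraction]
static = [430.0, 0.35]

inputs = build_lstm_inputs(forcing, static)
for day, features in enumerate(inputs):
    print(day, features)
```

Each time step then carries both the weather of that day and the unchanging description of the basin, so a single network can be shared across many basins.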

Global Streamflow Prediction and Extreme Event Detection with AIFL

AIFL achieves a median modified Kling-Gupta Efficiency (KGE’) of 0.66 and a median Nash-Sutcliffe Efficiency (NSE) of 0.53 across an independent test period spanning 2021 to 2024, demonstrating substantial predictive skill in global daily streamflow forecasting. Both metrics equal 1 for a perfect simulation. KGE’ combines correlation, bias, and variability terms into a single score, while NSE measures how much of the observed variance the simulation explains relative to using the observed mean as the prediction.
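For readers unfamiliar with these scores, here is a minimal sketch of how they are commonly computed for one basin, using the standard definitions: NSE as one minus the ratio of squared error to observed variance, and the modified KGE’ built from the correlation r, the bias ratio of the means, and the ratio of the coefficients of variation. The function names and sample values are illustrative and are not taken from the paper.

```python
import math

def _mean(x):
    return sum(x) / len(x)

def _std(x):
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations."""
    m = _mean(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - m) ** 2 for o in obs)
    return 1.0 - num / den

def kge_prime(obs, sim):
    """Modified Kling-Gupta Efficiency from r, bias ratio and CV ratio."""
    mo, ms = _mean(obs), _mean(sim)
    so, ss = _std(obs), _std(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (so * ss)            # linear correlation
    beta = ms / mo                 # bias ratio of the means
    gamma = (ss / ms) / (so / mo)  # ratio of coefficients of variation
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

# Illustrative daily flows in m^3/s (made-up values)
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.2, 1.9, 3.3, 3.8]
print(round(nse(obs, sim), 3), round(kge_prime(obs, sim), 3))
```

A global median such as the reported 0.66 would then be the median of the per-basin scores across all evaluated basins.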

These scores were calculated across 18,588 river basins, highlighting the model’s broad applicability. Beyond overall accuracy, the authors report that AIFL is highly reliable at detecting extreme events, a critical capability for effective flood preparedness. Because it operates deterministically, the model provides a consistent and predictable baseline for hydrological forecasting.
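Extreme-event detection is often scored with a contingency table of threshold exceedances. The article does not say which detection metrics the authors used, so the sketch below is an assumption: it counts hits, misses, and false alarms against a flow threshold and derives the probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI). The threshold and flow values are made up.

```python
def contingency(obs, sim, threshold):
    """Count hits, misses and false alarms for flows at or above threshold."""
    hits = sum(1 for o, s in zip(obs, sim) if o >= threshold and s >= threshold)
    misses = sum(1 for o, s in zip(obs, sim) if o >= threshold and s < threshold)
    false_alarms = sum(1 for o, s in zip(obs, sim) if o < threshold and s >= threshold)
    return hits, misses, false_alarms

def detection_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI); a score is NaN when its denominator is zero."""
    nan = float("nan")
    pod = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else nan
    total = hits + misses + false_alarms
    csi = hits / total if total else nan
    return pod, far, csi

# Made-up daily flows and a hypothetical flood threshold of 5.0
obs = [1.0, 5.0, 9.0, 2.0, 8.0, 1.0]
sim = [1.0, 6.0, 4.0, 7.0, 9.0, 1.0]

h, m, f = contingency(obs, sim, threshold=5.0)
pod, far, csi = detection_scores(h, m, f)
print(h, m, f)
print(pod, far, csi)
```

In practice the threshold would usually be basin-specific, for example a flow with a given return period, rather than one fixed value.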

The two-stage training strategy successfully bridges the gap between historical reanalysis data and operational forecast products. Benchmarking indicates AIFL is competitive with existing state-of-the-art global systems, offering a transparent and reproducible forcing pipeline for easier validation and adaptation. AIFL’s LSTM-based approach provides a streamlined alternative for global streamflow prediction, rather than relying on complex parameterisation.

Bridging the gap between hydrological simulation and accurate flood prediction

Scientists are edging closer to reliable global flood forecasting, a goal long hampered by the difficulty of translating model performance on historical data into real-world accuracy. For years, hydrological models have excelled at simulating past events but struggled to predict future ones, a weakness that stems from discrepancies between the data used in training and the imperfect forecasts available in operation.

This work presents AIFL, which directly addresses this ‘reanalysis-to-forecast domain shift’ through a two-stage training process. Rather than simply fitting historical data, the developers first established a strong foundation in understanding hydrological processes, then adapted it to the specific biases present in current weather prediction systems.

Achieving predictive skill across an entire planet is a considerable undertaking, and the model’s performance isn’t uniform everywhere. The challenge of forecasting in data-scarce regions remains, as accuracy will naturally vary depending on local data quality and availability. The focus on a transparent and reproducible forcing pipeline is particularly valuable, allowing other researchers to scrutinise and improve upon the methodology.

The implications extend beyond improved forecasts, offering a pathway towards more dependable early warning systems and potentially reducing the devastating impact of floods on vulnerable communities. Unlike previous global models within the CARAVAN ecosystem, this one is trained end-to-end, offering a streamlined approach. The technique could also be applied to other environmental forecasting challenges, such as drought prediction or water resource management. The model provides a strong baseline, but future work should explore incorporating additional data sources, such as satellite imagery and ground-based sensors, to further refine its predictive capabilities.

👉 More information
🗞 AIFL: A Global Daily Streamflow Forecasting Model Using Deterministic LSTM Pre-trained on ERA5-Land and Fine-tuned on IFS
🧠 ArXiv: https://arxiv.org/abs/2602.16579

AI Model Boosts Global Flood Forecasting Accuracy

Rohail T.
