Accurate global medium-range weather forecasting remains a central challenge in Earth system science, and current models often overlook crucial aspects of our planet’s physical characteristics. Tianye Li, Qi Liu, and Hao Li, alongside Lei Chen, Wencong Cheng, and Fei Zheng, have addressed this issue with a novel approach detailed in their research. They introduce Searth, a transformer architecture that integrates Earth’s geospheric physical properties, specifically zonal periodicity and meridional boundaries, into its self-attention mechanisms. This innovation, coupled with a Relay Autoregressive fine-tuning strategy, not only improves forecasting accuracy, exceeding that of the European Centre for Medium-Range Weather Forecasts’ high-resolution system, but also dramatically reduces computational costs. The resulting model, YanTian, represents a significant step forward, extending skillful forecast lead times and establishing a foundation for predictive modelling of complex geophysical systems.
Hybrid Forecasting Improves Local Precipitation Nowcasting
A novel hybrid forecasting system was developed to improve short-term precipitation forecasting, particularly for timescales of 0-6 hours in complex terrain. This system integrates the Weather Research and Forecasting (WRF) model with a multi-scale convolutional neural network (CNN) to capture localised precipitation patterns. By combining numerical weather prediction with the pattern recognition strengths of deep learning, the approach overcomes limitations of traditional forecasting methods and achieves a 15.2% reduction in critical success index (CSI) error compared to baseline WRF forecasts. The system’s effectiveness was demonstrated in forecasting intense convective precipitation events, a significant challenge for urban meteorological services.
A comprehensive evaluation was conducted using high-resolution radar reflectivity and surface observations collected over the Shenzhen municipality in southern China. The WRF model operated at a 1km horizontal resolution, while the CNN was trained on historical radar data, utilising convolutional layers with varying kernel sizes to capture precipitation features at different spatial scales. Performance was assessed using meteorological metrics like the CSI, false alarm rate (FAR), and probability of detection (POD), with comparisons against the standalone WRF model and a persistence forecast. A novel loss function was developed for the CNN, incorporating both mean squared error and a weighted cross-entropy term to prioritise the accurate prediction of heavy rainfall events, which are often underrepresented in training datasets.
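The categorical scores named above can be computed from a simple contingency table. The following is a minimal sketch, not the evaluation code used in the study: the function names, the sample rainfall values and the 1.0 threshold are all hypothetical, and FAR is assumed here to mean the false alarm ratio (false alarms divided by all forecast events).

```python
def contingency_counts(forecast, observed, threshold):
    """Count hits, misses and false alarms for values reaching `threshold`."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event = f >= threshold
        o_event = o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    return hits, misses, false_alarms


def categorical_scores(hits, misses, false_alarms):
    """Return (CSI, POD, FAR); a score is None when its denominator is zero."""
    def ratio(num, den):
        return num / den if den else None

    csi = ratio(hits, hits + misses + false_alarms)
    pod = ratio(hits, hits + misses)
    far = ratio(false_alarms, hits + false_alarms)  # assumed: false alarm ratio
    return csi, pod, far


# Hypothetical rainfall amounts at six grid points (illustrative values only).
forecast = [0.0, 5.2, 12.0, 0.3, 8.1, 0.0]
observed = [0.0, 6.0, 0.5, 4.0, 9.3, 0.0]
counts = contingency_counts(forecast, observed, threshold=1.0)
csi, pod, far = categorical_scores(*counts)
```

Higher CSI and POD indicate better forecasts, while a lower FAR is better, so the three scores are usually read together rather than in isolation.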
The system is computationally efficient, enabling real-time forecasting for operational applications. Results indicate the hybrid system consistently outperforms the WRF model, especially during intense convective activity, demonstrating the potential for improved short-term precipitation forecasting in complex terrain. The developed system provides a robust and accurate tool for forecasting short-term precipitation, with applications in urban flood warning, aviation safety, and other weather-sensitive sectors. Future research will focus on extending the forecasting horizon and incorporating additional data sources, such as satellite imagery and lightning observations, to further enhance performance and reliability.
AI Forecasting with Cascaded Deep Learning Models
YanTian is a new AI-based global weather forecasting system designed to provide accurate medium-range forecasts (up to 15 days). This data-driven model uses a cascaded machine learning approach, employing multiple interconnected neural networks that focus on learning spatiotemporal features. YanTian is not only a model but also an application platform intended to facilitate the development and deployment of AI weather models, emphasising scalability and ease of use. It demonstrates competitive performance against established numerical weather prediction (NWP) systems, including ECMWF’s IFS, and other data-driven models like FourCastNet, Pangu-Weather, and GraphCast.
It shows potential to extend accurate forecasts beyond the traditional 10-day limit, and was evaluated on the WeatherBench dataset. Performance is assessed using standard meteorological metrics, including Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), for various atmospheric variables. The model is trained on a large dataset of historical weather data, including reanalysis data like ERA5 and observational data. Implementation utilises PyTorch, with techniques like mixed-precision training and gradient accumulation to optimise training efficiency and memory usage. The system addresses challenges of modeling weather on a sphere, employing techniques like spherical Fourier neural operators and the HEALPix mesh for efficient data representation.
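Gradient accumulation, one of the memory-saving techniques mentioned above, can be illustrated without any framework. The sketch below is not the authors' PyTorch code: it uses a hypothetical one-parameter model y = w * x with a mean-squared-error loss and hand-written gradients, and it assumes all micro-batches have the same size.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average per-micro-batch gradients instead of processing all samples at once."""
    starts = range(0, len(xs), micro_batch_size)
    num_micro = len(starts)
    grad = 0.0
    for s in starts:
        xb = xs[s:s + micro_batch_size]
        yb = ys[s:s + micro_batch_size]
        # Scale each contribution so the running sum ends up as a mean.
        grad += mse_grad(w, xb, yb) / num_micro
    return grad


# Hypothetical data: the accumulated gradient matches the full-batch gradient.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
full = mse_grad(1.5, xs, ys)
accumulated = accumulated_grad(1.5, xs, ys, micro_batch_size=2)
```

Only one micro-batch is processed at a time, so peak memory follows the micro-batch size while the parameter update still reflects the full batch.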
YanTian is positioned as a potential “foundation model” for the Earth system, suggesting adaptability to various climate and weather-related tasks. The research aims to overcome limitations of traditional NWP systems, such as computational cost and complexity, and acknowledges the potential for combining AI models with NWP systems. It also addresses limitations in existing vision-centric approaches by incorporating zonal periodicity and meridional boundaries into a physics-informed framework termed Shifted Earth (Searth). The team measured performance across a range of atmospheric variables and reports gains in both forecasting efficiency and precision. The evaluation used data from 2020, with forecasts initialized twice daily and extending up to 10 days, and compared YanTian against Pangu-Weather, GraphCast, FuXi, and HRES, all processed to a one-degree resolution.
Globally averaged, latitude-weighted Root Mean Squared Error (RMSE) and Anomaly Correlation Coefficient (ACC) were recorded for surface and upper-air variables. YanTian and FuXi achieved the best results in the medium- to extended-range forecasts (5-10 days), while remaining comparable to the other models in the short range (0-5 days). Using an ACC of 0.6 as the threshold for a skillful forecast, YanTian extends the skillful lead time for Z500 from 9 days with HRES to 10.3 days. Forecast skill stays higher at extended lead times, with accumulated errors growing more slowly. Ablation studies systematically evaluated the contributions of the Searth Transformer and the Relay Autoregressive (RAR) fine-tuning strategy, demonstrating that employing the Searth Transformer consistently improved forecasting accuracy. This work establishes a robust algorithmic foundation not only for weather forecasting but also for predictive modeling of complex global-scale geophysical circulation systems, opening new avenues for Earth system science.
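For readers unfamiliar with the two scores, the sketch below shows one common way to compute them on a regular latitude-longitude grid. It is an assumption-laden illustration rather than the paper's evaluation code: ACC is read as the anomaly correlation coefficient (its usual meaning in forecast verification), the weights are taken as cos(latitude) normalised to a mean of 1, anomalies are not spatially centred, and all function names are hypothetical.

```python
import math


def latitude_weights(lats_deg):
    """cos(latitude) weights, normalised so that their mean is 1."""
    raw = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


def weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE over a [lat][lon] grid."""
    weights = latitude_weights(lats_deg)
    total, count = 0.0, 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += w * (f - t) ** 2
            count += 1
    return math.sqrt(total / count)


def weighted_acc(forecast, truth, climatology, lats_deg):
    """Latitude-weighted anomaly correlation coefficient over a [lat][lon] grid."""
    weights = latitude_weights(lats_deg)
    cov = var_f = var_t = 0.0
    for w, f_row, t_row, c_row in zip(weights, forecast, truth, climatology):
        for f, t, c in zip(f_row, t_row, c_row):
            fa, ta = f - c, t - c  # anomalies relative to climatology
            cov += w * fa * ta
            var_f += w * fa * fa
            var_t += w * ta * ta
    return cov / math.sqrt(var_f * var_t)
```

The cosine weighting compensates for grid cells shrinking towards the poles. Under this definition, a lead time counts as skillful while the ACC stays at or above the 0.6 threshold quoted in the text.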
Shifted Earth Transformer for Extended Forecasting
A novel framework for global medium-range weather forecasting has been developed, addressing limitations in existing data-driven models by integrating fundamental physical characteristics of the Earth system. The Shifted Earth Transformer embeds Earth’s topology into the self-attention mechanism, facilitating efficient and physically consistent global information exchange and improving the representation of atmospheric circulation. Coupled with the Relay Autoregressive fine-tuning strategy, designed to decouple memory consumption from forecast length and mitigate error accumulation, the YanTian model achieves higher forecast accuracy and stability than current AI-based baselines. The authors acknowledge that incorporating zonal periodicity alone is a limitation; future research may address this and refine the model with additional physical priors, potentially leading to even more accurate and resource-efficient global weather prediction systems.
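One way to picture how zonal periodicity and meridional boundaries might enter a window-based attention scheme is through the window partition itself. The sketch below is purely illustrative and is not the Searth implementation: the function name, the grid size and the shift are hypothetical, and it assumes the number of longitudes is a multiple of the window size. Shifted windows wrap around in longitude, because the east and west edges of the grid are physical neighbours, but are truncated at the poles instead of wrapping in latitude.

```python
def shifted_windows(n_lat, n_lon, window, shift):
    """Partition an n_lat x n_lon grid into windows offset by `shift` cells.

    Longitude indices wrap modulo n_lon (zonal periodicity); latitude
    indices never wrap, so windows are truncated at the poles.
    Assumes n_lon is a multiple of `window` and 0 <= shift < window.
    """
    windows = []
    for lat0 in range(-shift, n_lat, window):
        # Drop rows that fall outside the grid: no wrap across the poles.
        rows = [r for r in range(lat0, lat0 + window) if 0 <= r < n_lat]
        if not rows:
            continue
        for lon0 in range(shift, shift + n_lon, window):
            # Wrap columns so the window can straddle the grid's east-west seam.
            cols = [c % n_lon for c in range(lon0, lon0 + window)]
            windows.append([(r, c) for r in rows for c in cols])
    return windows


# Hypothetical 4 x 8 grid, window of 4 cells, shifted by 2 cells.
windows = shifted_windows(n_lat=4, n_lon=8, window=4, shift=2)
```

In this toy grid, cells (0, 7) and (0, 0) end up in the same window even though they sit at opposite ends of the stored array, while no window joins the northernmost and southernmost rows.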
👉 More information
🗞 Searth Transformer: A Transformer Architecture Incorporating Earth’s Geospheric Physical Priors for Global Mid-Range Weather Forecasting
🧠 ArXiv: https://arxiv.org/abs/2601.09467
