Large-scale machine-learning methods have shown a surprising ability to forecast chaotic systems beyond typical predictability horizons. These methods, such as transformers and recurrent neural networks, outperform specialized methods grounded in dynamical systems theory, like reservoir computers and neural ordinary differential equations, especially when ample training data are available. In data-limited settings, however, physics-based hybrid methods retain an advantage due to their strong inductive biases. The study, conducted by William Gilpin (The University of Texas at Austin, Austin, Texas), also found that the Lyapunov exponent, a measure of chaos, does not correlate with the accuracy of different forecasting methods.
ML vs Physics-Based Methods
Overview of the Study
The study compares large domain-agnostic models, like Transformers and LSTMs, with physics-based methods, such as reservoir computers and neural ODEs, in forecasting chaos. The research found that with sufficient training history, domain-agnostic models outperform physics-based methods, a phenomenon known as the “bitter lesson”. However, when the training history is limited, the inductive biases of physics-based models prove superior. The study also found that the Lyapunov exponent, a measure of the rate of separation of infinitesimally close trajectories, does not correlate with the performance of different methods on various chaotic systems.
The research paper highlights the importance of the bias-variance tradeoff. If the system being forecast has strict constraints known from domain knowledge, incorporating them into the model saves computational resources, requires less data, and reduces the need for hyperparameter tuning. However, if nothing is known a priori about the time series and a large amount of data is available, new machine-learning models can perform exceptionally well when tuned carefully.
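A toy example, not from the study itself, can make this tradeoff concrete: with only a handful of noisy samples from a damped oscillator, a model that hard-codes the known functional form (one free parameter) generalizes far better than an unconstrained polynomial with many free parameters, which overfits the noise. All data and parameters below are illustrative.

```python
# Toy illustration (not from the paper) of the bias-variance point: in the
# data-limited regime, a physics-constrained model with one fitted parameter
# beats an unconstrained degree-7 polynomial that overfits.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
t_train = np.sort(rng.uniform(0, 5, size=8))          # only 8 noisy samples
y_train = np.exp(-0.5 * t_train) * np.cos(3 * t_train) + rng.normal(0, 0.05, 8)
t_test = np.linspace(0, 5, 200)
y_test = np.exp(-0.5 * t_test) * np.cos(3 * t_test)

# Physics-constrained model: the functional form is known, only the
# damping rate gamma is free.
model = lambda t, gamma: np.exp(-gamma * t) * np.cos(3 * t)
(gamma_fit,), _ = curve_fit(model, t_train, y_train, p0=[1.0])

# Domain-agnostic model: a flexible polynomial with many free parameters.
poly = np.polynomial.Polynomial.fit(t_train, y_train, deg=7)

print("constrained RMSE:", np.sqrt(np.mean((model(t_test, gamma_fit) - y_test) ** 2)))
print("polynomial RMSE: ", np.sqrt(np.mean((poly(t_test) - y_test) ** 2)))
```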
Chaos and Unpredictability
Traditionally, chaos and unpredictability are considered synonymous. However, large-scale machine learning methods have recently shown a surprising ability to forecast chaotic systems well beyond typical predictability horizons. The study performed a comparative analysis of 24 state-of-the-art forecasting methods on a database of 135 low-dimensional systems with 17 forecast metrics. The results showed that large-scale, domain-agnostic forecasting methods consistently produce accurate predictions up to two dozen Lyapunov times, far beyond the reach of classical methods.
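Forecast horizons here are measured in Lyapunov times, the characteristic timescale 1/λ over which nearby trajectories separate by a factor of e. A minimal sketch of this bookkeeping is below; the helper name and threshold are hypothetical, not taken from the paper's benchmark code.

```python
# Illustrative helper (not from the paper): express a forecast horizon in
# Lyapunov times. The "valid time" is when the normalized error first
# exceeds a tolerance; multiplying by the largest Lyapunov exponent lambda
# (i.e., dividing by the Lyapunov time 1/lambda) gives a system-independent
# horizon such as "accurate for ~20 Lyapunov times".
import numpy as np

def valid_time_in_lyapunov_units(truth, forecast, dt, lyap_exponent, tol=0.5):
    """truth, forecast: arrays of shape (T, d); dt: sampling interval;
    lyap_exponent: largest Lyapunov exponent; tol: error threshold."""
    err = np.linalg.norm(truth - forecast, axis=1) / np.linalg.norm(truth, axis=1).mean()
    crossed = np.nonzero(err > tol)[0]
    t_valid = (crossed[0] if crossed.size else len(err)) * dt
    return t_valid * lyap_exponent  # horizon measured in Lyapunov times
```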
The Butterfly Effect and Chaos
Chaos traditionally implies the butterfly effect, where a small change in a system grows exponentially over time, complicating efforts to reliably forecast the system’s long-term evolution. This represents a longstanding problem at the interface of physics and computer science. Recent successes in statistical forecasting have motivated revisiting this problem, providing compelling examples of data-driven prediction of diverse systems such as cellular signaling pathways, hourly precipitation forecasts, active nematics, and tokamak plasma disruptions.
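The exponential growth behind the butterfly effect, δ(t) ≈ δ₀e^{λt}, is easy to reproduce numerically. The illustrative sketch below integrates two copies of the Lorenz system from initial conditions 10⁻⁹ apart and recovers the largest Lyapunov exponent (roughly 0.9 for the standard parameters) from the slope of the log separation.

```python
# Minimal sketch of the butterfly effect: two Lorenz trajectories starting
# 1e-9 apart diverge exponentially at a rate set by the largest Lyapunov
# exponent (~0.9 for these standard parameters).
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

t_eval = np.linspace(0, 25, 2500)
x0 = np.array([1.0, 1.0, 1.0])
sol_a = solve_ivp(lorenz, (0, 25), x0, t_eval=t_eval, rtol=1e-9)
sol_b = solve_ivp(lorenz, (0, 25), x0 + np.array([1e-9, 0.0, 0.0]),
                  t_eval=t_eval, rtol=1e-9)

# The separation grows roughly as delta_0 * exp(lambda * t) until it
# saturates at the attractor's diameter; fit the exponential regime.
delta = np.linalg.norm(sol_a.y - sol_b.y, axis=0)
slope = np.polyfit(t_eval[100:1200], np.log(delta[100:1200]), 1)[0]
print(f"Estimated largest Lyapunov exponent: {slope:.2f}")  # roughly 0.9
```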

Physics-Based Models vs Domain-Agnostic Models
There is little consensus on whether the practical success of emerging forecasting methods stems from fundamental advances in representing and parametrizing chaos, or simply from the availability of larger datasets, model capacities, and computational resources. Recent fundamental advances in representing chaos include works demonstrating that chaotic systems appear more linear when lifted to higher-dimensional representations. These works partly explain the recent emergence of reservoir computers as strong forecasting methods for dynamical systems.
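A reservoir computer makes the "lift to a higher-dimensional representation" idea concrete: inputs drive a large, fixed random recurrent network, and only a linear readout on the lifted state is trained. Below is a minimal echo state network sketch with illustrative sizes and hyperparameters, not the configurations benchmarked in the study; real reservoir computers need careful tuning of the spectral radius and regularization.

```python
# Minimal echo state network sketch (illustrative). The input sequence is
# lifted into a high-dimensional random recurrent state, and only a linear
# readout is trained, via ridge regression, to predict the next step.
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 500, 3

# Fixed random reservoir; rescale so the spectral radius is below 1,
# which keeps the driven dynamics stable (the "echo state" property).
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))

def run_reservoir(inputs):
    """Drive the reservoir with an input sequence of shape (T, n_in)."""
    states = np.zeros((len(inputs), n_res))
    h = np.zeros(n_res)
    for t, u in enumerate(inputs):
        h = np.tanh(W @ h + W_in @ u)
        states[t] = h
    return states

def train_readout(inputs, targets, ridge=1e-6):
    """Fit the linear readout mapping lifted states to next observations."""
    H = run_reservoir(inputs)
    return np.linalg.solve(H.T @ H + ridge * np.eye(n_res), H.T @ targets)
```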
The study found that when sufficient training data are available, large, domain-agnostic forecasting models outperform physics-based models at both short and long forecasting horizons. However, when limitations are imposed on computational resources or data availability, models with inductive biases, particularly reservoir computers, perform more strongly. The study suggests that scale and dataset availability, rather than intrinsic dynamical properties, limit the current ability of large models to forecast chaos.
“I found that large domain-agnostic models (Transformers, LSTM, etc) forecast chaos really far into the future (~20 Lyapunov times). With enough training history, they outperform physics methods (reservoir computers, neural ODE, etc). That model scale + large amounts of data outperform domain-specific inductive biases is known as the “bitter lesson”, and it’s a theme of recent ML works.”
William Gilpin
Department of Physics, The University of Texas at Austin, Austin, Texas
“However, if we restrict the training history, inductive biases of physics-based models win out. Bigger models also take longer to train, but do better overall. Weirdly, the Lyapunov exponent doesn’t correlate empirically with how well different methods perform on different chaotic systems, especially over longer forecasting horizons.”
“The biggest takeaway of my paper is that the bias-variance tradeoff rules. If you’re forecasting a system where you have strict constraints from domain knowledge (Hamiltonian, etc), you can put them into the model—you’ll save yourself compute, need less data, & have less hyperparameter tuning. But if you know nothing a priori about your time series, and you have a lot of data, new ML models work really well if you tune them carefully. >10 Lyapunov time forecasts would have seemed crazy a few decades ago.”
“Chaos and unpredictability are traditionally synonymous, yet large-scale machine-learning methods recently have demonstrated a surprising ability to forecast chaotic systems well beyond typical predictability horizons.”
“However, there is little consensus whether the practical success of emerging forecasting methods stems from fundamental advances in representing and parametrizing chaos, or simply from the availability of larger datasets, model capacities, and computational resources.”
“When sufficient training data are available, we find that large, domain-agnostic forecasting models outperform physics-based models at both short and long forecasting horizons. However, when limitations are imposed on computational resources or data availability, models with inductive biases, particularly reservoir computers, perform more strongly.”
“We find that invariant properties of the underlying dynamical systems only weakly correlate with the ability of the best-performing forecast models to forecast them, suggesting that scale and dataset availability, rather than intrinsic dynamical properties, limit the current ability of large models to forecast chaos.”
Quick Summary
Large-scale machine-learning methods have shown a surprising ability to forecast chaotic systems well beyond typical predictability horizons, outperforming physics-based models when sufficient training data are available. However, when data are limited, physics-based hybrid methods, which incorporate domain knowledge, retain a comparative advantage due to their strong inductive biases.
- Large domain-agnostic models like Transformers and LSTMs can predict chaotic systems far into the future, outperforming physics-based methods like reservoir computers and neural ODEs when given enough training history. This is known as the “bitter lesson”.
- However, if training history is limited, physics-based models with inductive biases perform better. Larger models also take longer to train but perform better overall.
- The Lyapunov exponent, a measure of the rate at which information about the initial conditions of a system is lost, does not correlate with how well different methods perform on different chaotic systems, especially over longer forecasting horizons.
- The main takeaway is the importance of the bias-variance tradeoff. If you’re forecasting a system where you have strict constraints from domain knowledge, you can incorporate them into the model to save compute, need less data, and have less hyperparameter tuning.
- If you know nothing about your time series and have a lot of data, new machine-learning models work well if you tune them carefully.
- The study found that large-scale, domain-agnostic forecasting methods consistently produce accurate predictions up to two dozen Lyapunov times, beyond the reach of classical methods.
- In data-limited settings, physics-based hybrid methods retain a comparative advantage due to their strong inductive biases.
- The study was conducted by William Gilpin and published by the American Physical Society.