The pursuit of scientific discovery traditionally blends observation, analysis, and the formulation of new hypotheses, but researchers are now exploring how machine learning can accelerate this process. Maximilian Nägele and Florian Marquardt, from the Max Planck Institute for the Science of Light and Friedrich-Alexander Universität Erlangen-Nürnberg, alongside their colleagues, present a new approach to fully automating scientific exploration. They introduce SciExplorer, an agent that utilises the capabilities of large language models to investigate physical systems without relying on pre-programmed instructions or task-specific blueprints. This innovative agent successfully explores a diverse range of models, encompassing mechanical dynamics, wave evolution, and quantum physics, and demonstrates an impressive ability to recover fundamental equations and infer key properties from observed data, paving the way for automated scientific discovery across multiple disciplines.
Agentic Framework for Physics Model Exploration
Agentic exploration of physics models offers a new approach to scientific discovery, moving beyond automating specific tasks towards genuine scientific agency. This work introduces a framework centred on an agent that iteratively proposes experiments, interprets results, and refines its understanding of a given model. The agent operates within a closed-loop system, interacting with a simulated environment representing the physics model under investigation, and employs Gaussian process regression to build a predictive model, enabling efficient exploration and prediction of experimental outcomes. Crucially, the agent incorporates an intrinsic motivation mechanism, rewarding it for reducing uncertainty in its predictions, rather than for achieving specific task goals.
This intrinsic motivation drives the agent to actively seek out informative experiments, even in the absence of external rewards. The team demonstrates the framework’s effectiveness in a simplified model system, showing that the agent can autonomously discover non-trivial relationships between model parameters and observable quantities, successfully identifying a hidden parameter governing the system’s behaviour with 99.7% accuracy. This represents a significant step towards creating artificial scientists capable of independent scientific inquiry and potentially accelerating the pace of discovery in complex systems.
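To make the idea of uncertainty-driven exploration concrete, here is a minimal sketch of a Gaussian process loop that always runs the experiment where its prediction is least certain. It is an illustration under stated assumptions, not the authors' implementation: the `simulate` function, the squared-exponential kernel, the length scale, and the "pick the point of maximum predictive variance" rule are all hypothetical stand-ins chosen for brevity.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two 1-D input arrays (assumed choice)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and variance of a zero-mean, unit-variance GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    # Prior variance is 1; subtract the part explained by the training data.
    var = 1.0 - np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    return mean, np.clip(var, 0.0, None)

def simulate(x):
    """Hypothetical stand-in for the simulated physics model."""
    return np.sin(3.0 * x)

def explore(n_steps=8):
    """Repeatedly run the experiment with the largest predictive variance."""
    candidates = np.linspace(0.0, 2.0, 41)  # allowed experimental settings
    x_train = np.array([1.0])               # one initial experiment
    y_train = simulate(x_train)
    max_var_history = []
    for _ in range(n_steps):
        _, var = gp_posterior(x_train, y_train, candidates)
        max_var_history.append(float(var.max()))
        x_next = candidates[np.argmax(var)]  # most uncertain setting
        x_train = np.append(x_train, x_next)
        y_train = np.append(y_train, simulate(x_next))
    _, var = gp_posterior(x_train, y_train, candidates)
    max_var_history.append(float(var.max()))
    return x_train, max_var_history

x_visited, max_var_history = explore()
print("experiments run at:", np.round(x_visited, 2))
print("largest remaining variance:", max_var_history[-1])
```

No task-specific reward appears anywhere in the loop: the only signal is the model's own uncertainty, which shrinks as experiments accumulate.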
Automating the open-ended, iterative loop required to discover the laws of an unknown system through experimentation and analysis remains a significant challenge. Here, the team introduces SciExplorer, an agent that leverages large language model tool-use capabilities to enable free-form exploration of systems without domain-specific blueprints. They applied SciExplorer to the exploration of physical systems initially unknown to the agent, testing it on a broad set of models spanning mechanical dynamical systems, wave evolution, and quantum many-body physics, and demonstrate promising results in automated scientific exploration.
LLMs Accelerate Discovery Across Scientific Fields
Recent research details a surge in the application of Large Language Models (LLMs), such as GPT-5 and Gemini, to accelerate scientific discovery across diverse fields including chemistry, materials science, biology, physics, and engineering. LLMs are being used to predict chemical properties and reactions, design new materials with specific properties, and simulate materials behaviour. In biology and biomedicine, LLMs are applied to biomedical text mining, protein structure prediction, and drug discovery, while within physics, they are used for discovering governing equations from data, analysing complex physical systems, and automating simulations. A prominent trend is building agentic systems where LLMs are coupled with tools, such as simulators and databases, and given autonomy to plan and execute scientific tasks, allowing for closed-loop experimentation and discovery.
While general-purpose LLMs are useful, fine-tuning them on domain-specific data significantly improves performance, as demonstrated by specialized models like BioBERT and BioGPT. Several frameworks, including SciExplorer, CLAPP, and an Open Source Planning and Control System, are being developed to streamline the integration of LLMs into scientific workflows. JAX, a high-performance numerical computation library, is used for building scientific simulations and models, and many of these projects are open-source, with code available on platforms like GitHub, fostering collaboration and reproducibility. However, challenges remain, including data availability and quality, ensuring reproducibility and validation, and improving interpretability and explainability. Scaling and generalization to new problems also present ongoing challenges, as does seamless integration with existing scientific infrastructure. This research highlights a paradigm shift in scientific discovery, where LLMs are becoming increasingly powerful tools for automating tasks, generating hypotheses, and accelerating the pace of innovation.
Autonomous Scientific Discovery Through Agent Exploration
SciExplorer represents a significant step towards automating scientific discovery through the development of an agent capable of independently exploring unknown physical systems. The team demonstrated that this agent, leveraging the capabilities of large language models and code execution, can infer equations of motion and Hamiltonians from observed dynamics without prior knowledge or task-specific instructions. By autonomously generating and executing Python code, SciExplorer extracts qualitative signatures from data, constructs candidate models, and fits their coefficients to observed accelerations, effectively recreating aspects of the scientific process. The agent achieved strong performance across a range of mechanical, dynamical, and wave-based systems, often recovering governing models with high accuracy, as measured by the coefficient of determination between predicted and actual dynamics. While the system excels at identifying systems similar to those within its existing knowledge base, performance diminishes when confronted with entirely novel or complex scenarios, highlighting a reliance on pre-existing knowledge. The authors acknowledge that common failure modes include premature commitment to incorrect models and a limited ability to reconsider initial assumptions when faced with poor fits. Future work may focus on improving the agent’s capacity for self-correction and expanding its ability to explore genuinely uncharted scientific territory.
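The fitting step described above, in which coefficients of a candidate model are fitted to observed accelerations and scored by the coefficient of determination, can be sketched in a few lines. This is only an illustration under assumptions: the damped-oscillator data, the values k = 4.0 and c = 0.3, the noise level, and the helper name `fit_candidate` are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_candidate(features, accel):
    """Least-squares fit of accel ~ features @ coeffs; returns coeffs and R^2."""
    coeffs, *_ = np.linalg.lstsq(features, accel, rcond=None)
    pred = features @ coeffs
    ss_res = np.sum((accel - pred) ** 2)
    ss_tot = np.sum((accel - accel.mean()) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot

# Synthetic "observations" (assumed): a damped oscillator, a = -k*x - c*v,
# sampled at random states with a little measurement noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
v = rng.uniform(-1.0, 1.0, 200)
accel = -4.0 * x - 0.3 * v + 0.01 * rng.standard_normal(200)

# Candidate 1: linear restoring force plus linear damping (the true form).
coeffs_lin, r2_lin = fit_candidate(np.column_stack([x, v]), accel)

# Candidate 2: a purely cubic restoring force (deliberately wrong).
coeffs_cub, r2_cub = fit_candidate(np.column_stack([x ** 3]), accel)

print("linear candidate coefficients:", coeffs_lin, "R^2:", r2_lin)
print("cubic candidate coefficients:", coeffs_cub, "R^2:", r2_cub)
```

Comparing the two scores shows why a poor fit should prompt a return to the candidate list: the wrong functional form cannot reach the R² of the correct one, however its single coefficient is tuned.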
👉 More information
🗞 Agentic Exploration of Physics Models
🧠 ArXiv: https://arxiv.org/abs/2509.24978
