Pacific Northwest National Laboratory develops AI cheminformatics tool CACTUS

Researchers at Pacific Northwest National Laboratory have developed an artificial intelligence agent called CACTUS, which integrates cheminformatics tools with large language models to enable molecular discovery. Led by Chief Data Scientist Neeraj Kumar, the team created an adaptable tool for researchers engaged in chemistry and molecular discovery. CACTUS can assist researchers in designing new molecules by predicting molecular properties, assessing drug-likeness, and identifying potential off-target effects.

According to Andrew McNaughton, first author of the research paper published in ACS Omega, large language models alone do not have the knowledge or reasoning skills to answer complex questions, whereas CACTUS can interpret a question and select the best tool to provide an accurate answer. Key contributors to the work include Rohith Varikoti and Carter Knutson. The development of CACTUS has the potential to accelerate scientific advancement and unlock new frontiers in the exploration of novel therapeutic candidates, catalysts, and materials.

Introduction to CACTUS: A Chemistry Agent for Autonomous Science

The development of CACTUS, or the Chemistry Agent Connecting Tool Usage to Science, represents a notable advancement in the field of cheminformatics. Researchers at Pacific Northwest National Laboratory (PNNL) have created an AI-powered agent that integrates large language models with domain-specific tools to facilitate molecular discovery. This innovation has the potential to accelerate scientific progress and unlock new frontiers in the exploration of novel therapeutic candidates, catalysts, and materials. By combining the strengths of open-source language models with cheminformatics tools, CACTUS provides an adaptable platform for researchers engaged in chemistry and molecular discovery.

The initial concept behind CACTUS was a large language model-based cheminformatics assistant that could answer questions about molecules, such as their hydrogen bonding capabilities or toxicity. The team ensured that CACTUS could run on consumer-grade hardware as well as supercomputers, making it accessible to researchers with limited computational resources. This accelerates science by democratizing access to computational chemistry and modeling tools. By embracing open-source models and tools, agents like CACTUS make scientific discovery accessible to a broader range of researchers.

CACTUS acts as an agent that interfaces with existing computational chemistry tools to provide answers to user queries. It interprets questions, determines the most suitable tool to answer them, formats the question into the correct input for the tool, and then provides the tool’s output back to the user. This approach allows CACTUS to leverage the strengths of both large language models and domain-specific tools, providing a more comprehensive and accurate response to user queries. The development of CACTUS is supported by the I3T investment under the Laboratory Directed Research and Development program at PNNL, as well as the Exascale Computing project under the Department of Energy, Office of Science, Advanced Scientific Computing Research program.
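The four steps described above (interpret the question, pick a tool, format the input, return the output) can be illustrated with a small Python sketch. This is not the CACTUS implementation: the keyword matcher stands in for the language model's tool choice, and the two tools, the `Tool` class, and the "molecule is the last word" convention are all hypothetical names and assumptions made so the example runs without any chemistry library.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    name: str
    keywords: tuple               # words that route a question to this tool
    run: Callable[[str], str]     # takes a SMILES string, returns a text answer


def ring_closures(smiles: str) -> str:
    # Toy estimate: in simple SMILES each ring uses one pair of closure digits.
    digits = sum(ch.isdigit() for ch in smiles)
    return f"{digits // 2} ring closure(s)"


def aromatic_atoms(smiles: str) -> str:
    # Toy estimate: lowercase c, n, o, s mark aromatic atoms in simple SMILES.
    count = sum(ch in "cnos" for ch in smiles)
    return f"{count} aromatic atom(s)"


# Hypothetical tool registry; a real agent would wrap cheminformatics libraries.
TOOLS = [
    Tool("ring_closures", ("ring", "cycle"), ring_closures),
    Tool("aromatic_atoms", ("aromatic",), aromatic_atoms),
]


def select_tool(question: str) -> Optional[Tool]:
    # Stand-in for the language model's decision: plain keyword matching.
    lowered = question.lower()
    for tool in TOOLS:
        if any(word in lowered for word in tool.keywords):
            return tool
    return None


def extract_smiles(question: str) -> str:
    # Assumed convention for this sketch: the molecule is the last word.
    return question.rstrip("?.! ").split()[-1]


def answer(question: str) -> str:
    tool = select_tool(question)            # interpret and choose a tool
    if tool is None:
        return "No suitable tool found."
    smiles = extract_smiles(question)       # format the tool input
    result = tool.run(smiles)               # call the tool
    return f"{tool.name}({smiles}) -> {result}"   # hand the output back


if __name__ == "__main__":
    print(answer("How many rings are in c1ccccc1?"))
    # prints: ring_closures(c1ccccc1) -> 1 ring closure(s)
```

The point of the design is the separation of roles: the selector only decides which tool applies, and the numbers come from the tool itself, which is why an agent of this kind can be more reliable than a language model answering from memory.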

The potential applications of CACTUS are diverse and far-reaching. In drug discovery, CACTUS can aid researchers by predicting molecular properties, assessing drug-likeness, and identifying potential off-target effects. This can accelerate the identification of promising compounds, reducing the time and cost associated with traditional drug discovery methods. By integrating CACTUS with automated experimentation platforms, researchers can design and prioritize experiments, analyze results, and iteratively refine their hypotheses, leading to more efficient and targeted exploration of chemical space.

Enhanced Reasoning with Chemistry Tools

CACTUS’s ability to integrate large language models with domain-specific tools enables enhanced reasoning capabilities in chemistry. The agent can interpret user queries and determine the most suitable tool to answer them, providing a more comprehensive and accurate response. This approach allows CACTUS to leverage the strengths of both large language models and cheminformatics tools, making it a powerful platform for molecular discovery. The use of open-source language models and tools also ensures that CACTUS is accessible to researchers with limited computational resources, democratizing access to computational chemistry and modeling tools.

The development of CACTUS has been supported by the Department of Energy, Office of Science, Advanced Scientific Computing Research program, as well as the I3T investment under the Laboratory Directed Research and Development program at PNNL. The initial concept of integrating large language models with cheminformatics tools received support from the Exascale Computing project, highlighting the potential of CACTUS to accelerate scientific progress in chemistry and materials science. By providing a platform for researchers to design and prioritize experiments, analyze results, and iteratively refine their hypotheses, CACTUS has the potential to transform the discovery process in drug development and materials science.

Integrating CACTUS with automated experimentation platforms could enable real-time, data-driven decisions, streamlining the discovery process and paving the way for fully autonomous laboratories. This approach can accelerate the identification of promising compounds, reducing the time and cost associated with traditional drug discovery methods. By leveraging the strengths of both large language models and domain-specific tools, CACTUS lets researchers explore vast chemical spaces and predict the performance of novel compounds.

Demonstrating the Utility of CACTUS

The PNNL team has provided examples of how CACTUS can enhance both drug discovery and materials research. In drug discovery, CACTUS aids researchers by predicting molecular properties, assessing drug-likeness, and identifying potential off-target effects. This can accelerate the identification of promising compounds, reducing the time and cost associated with traditional drug discovery methods. By integrating CACTUS with automated experimentation platforms, researchers can design and prioritize experiments, analyze results, and iteratively refine their hypotheses, leading to more efficient and targeted exploration of chemical space.
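As a rough illustration of what a drug-likeness assessment can look like, the Python sketch below applies Lipinski's rule of five, one widely used heuristic. The article does not say which criteria CACTUS applies, so the choice of rule here is an assumption. The `Descriptors` class and function names are hypothetical, the descriptor values are approximate and entered by hand rather than computed from a structure, and the "at most one violation" threshold is a common convention exposed as a parameter.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    mol_weight: float        # molecular weight in g/mol
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> List[str]:
    # Each entry pairs a readable label with whether that limit is exceeded.
    checks = [
        ("molecular weight > 500", d.mol_weight > 500),
        ("logP > 5", d.log_p > 5),
        ("H-bond donors > 5", d.h_bond_donors > 5),
        ("H-bond acceptors > 10", d.h_bond_acceptors > 10),
    ]
    return [label for label, violated in checks if violated]


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    # Assumed convention: tolerate at most `max_violations` broken limits.
    return len(lipinski_violations(d)) <= max_violations


# Approximate, hand-entered values for aspirin (illustrative only).
aspirin = Descriptors(mol_weight=180.16, log_p=1.2,
                      h_bond_donors=1, h_bond_acceptors=4)

# A made-up large, lipophilic molecule that breaks every limit.
hypothetical = Descriptors(mol_weight=820.0, log_p=6.3,
                           h_bond_donors=7, h_bond_acceptors=12)

if __name__ == "__main__":
    print(is_drug_like(aspirin), lipinski_violations(aspirin))
    print(is_drug_like(hypothetical), len(lipinski_violations(hypothetical)))
```

In practice the descriptor values would come from a cheminformatics tool rather than being typed in, which is the kind of call an agent like CACTUS would delegate.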

In materials science, CACTUS can facilitate the discovery of new materials with desired properties by exploring vast chemical spaces and predicting the performance of novel compounds. Integrating CACTUS with automated experimentation platforms will ultimately allow the agent to make data-driven decisions in real time, opening up new possibilities for autonomous discovery. The potential applications of CACTUS are diverse and far-reaching, with implications for fields such as energy, environment, and healthcare.

The development of CACTUS is ongoing, with plans to connect the agent with additional tools to create a complete pipeline focused on small molecule discovery. By building on the capabilities of CACTUS, researchers can accelerate the discovery process in drug development and materials science, paving the way for fully autonomous laboratories. Integration with automated experimentation platforms is expected to enable real-time, data-driven decisions and streamline the discovery process.

Future Directions for CACTUS

The future of autonomous molecular discovery in drug development and materials science is poised for significant transformation with the development of CACTUS. By integrating large language models with domain-specific tools, CACTUS provides a powerful platform for molecular discovery, enabling researchers to explore vast chemical spaces and predict the performance of novel compounds. The potential applications of CACTUS are diverse and far-reaching, with implications for fields such as energy, environment, and healthcare.

The potential of CACTUS to accelerate scientific progress in chemistry and materials science is significant. By providing a platform for researchers to design and prioritize experiments, analyze results, and iteratively refine their hypotheses, CACTUS can accelerate the identification of promising compounds, reducing the time and cost associated with traditional drug discovery methods. The development of CACTUS is supported by the I3T investment under the Laboratory Directed Research and Development program at PNNL, as well as the Exascale Computing project under the Department of Energy, Office of Science, Advanced Scientific Computing Research program.

Conclusion

CACTUS represents a notable advancement in cheminformatics, providing a powerful platform for molecular discovery. By integrating large language models with domain-specific tools, CACTUS enables enhanced reasoning capabilities in chemistry, facilitating the discovery of new materials and therapeutic candidates. Its potential applications are diverse and far-reaching, with implications for fields such as energy, environment, and healthcare. As development continues, CACTUS is likely to have a significant impact on chemistry, accelerating scientific progress and transforming the discovery process in drug development and materials science.

Quantum News