Researchers developed ATAG, a framework that uses attack graphs to assess security risks in multi-agent systems powered by large language models. ATAG incorporates a new vulnerability database and models complex, multi-step attacks, including prompt injection and sensitive data disclosure, across interconnected agents.
The increasing prevalence of multi-agent systems (MASs), where autonomous entities powered by large language models (LLMs) collaborate to achieve complex goals, presents novel security challenges. Assessing vulnerabilities in these interconnected systems requires methods capable of modelling both traditional cyber threats and those specific to LLMs, such as prompt injection and data leakage. Researchers at Ben-Gurion University of the Negev – Parth Atulbhai Gandhi, Akansha Shukla, David Tayouri, Beni Ifland, Yuval Elovici, Rami Puzis, and Asaf Shabtai – address this need in their paper, “ATAG: AI-Agent Application Threat Assessment with Attack Graphs”. They present a new framework, ATAG, which extends existing attack graph methodologies to systematically analyse and visualise potential attack paths within AI-agent applications, accompanied by a newly created Large Language Model Vulnerability Database (LVD) to standardise documentation of LLM weaknesses.
Assessing and Enhancing Security in Multi-Agent AI Systems
Multi-agent AI systems (MAAS), comprising interconnected autonomous agents often powered by large language models (LLMs), present novel security challenges. Traditional security assessment methodologies struggle to address the unique vulnerabilities introduced by these complex, interacting systems. This research introduces ATAG, a framework designed to systematically assess security risks within MAAS by extending established attack graph methodologies to specifically model LLM vulnerabilities and potential exploitation pathways.
ATAG begins by modelling the MAAS architecture: the agents, their interactions (including data flows and external connections), and the access control mechanisms, expressed as Hierarchical Access Control Lists (HACLs). HACLs refine access permissions beyond simple read/write/execute, allowing granular control based on user roles and data sensitivity. A central component is the LLM Vulnerability Database (LVD), which catalogues known LLM vulnerabilities, such as prompt injection (manipulating the LLM through crafted inputs), data exfiltration (unauthorised extraction of data), and context ignoring (failure to adhere to established guidelines), and categorises them by severity and potential impact on system integrity.
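To make the modelling step concrete, here is a minimal Python sketch of the kind of inputs such a framework might consume; the class names, fields, and the two-agent application are illustrative assumptions, not the paper's actual data model.

```python
# A minimal sketch of a MAAS model in the spirit of ATAG's inputs; all
# class names and fields here are illustrative, not the paper's API.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    llm_backed: bool                     # whether the agent is LLM-powered
    tools: list = field(default_factory=list)

@dataclass
class Interaction:
    source: str                          # sending agent
    target: str                          # receiving agent
    channel: str                         # e.g. "task_delegation"

@dataclass
class HaclEntry:
    subject: str                         # agent or user role
    resource: str                        # data store, tool, or external service
    permission: str                      # e.g. "read", "write", "invoke"

# Toy application: a user-facing assistant delegates summarisation to a
# second agent that can read a sensitive document store.
agents = [
    Agent("assistant", llm_backed=True, tools=["web_search"]),
    Agent("summariser", llm_backed=True, tools=["doc_store"]),
]
interactions = [Interaction("assistant", "summariser", "task_delegation")]
hacl = [
    HaclEntry("assistant", "user_input", "read"),
    HaclEntry("summariser", "doc_store", "read"),
]
```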
The LVD directly informs the generation of attack graphs, which depict the paths an adversary could take to compromise the system and so give a comprehensive view of its weaknesses. Nodes in the graph represent system components or vulnerability states, while edges represent the actions an attacker might take to move between them.
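As a rough illustration of how LVD entries can drive this edge generation, the sketch below matches each vulnerability's precondition and consequence against the agents of the toy application; the precondition/consequence encoding and the matching rule are simplified assumptions, far cruder than ATAG's actual generation logic.

```python
# Deriving attack-graph edges from simplified LVD entries: each entry
# pairs a precondition on an agent with the consequence of exploiting
# the vulnerability on that agent.
lvd = [
    # (vulnerability, precondition on an agent, consequence)
    ("prompt_injection", "reads_untrusted_input", "attacker_controls_output"),
    ("data_exfiltration", "attacker_controls_output", "sensitive_data_disclosed"),
]

edges = []  # (from_state, action, to_state)
for agent in ("assistant", "summariser"):
    for vuln, pre, post in lvd:
        edges.append((f"{pre}({agent})", vuln, f"{post}({agent})"))

# Interactions propagate attacker influence between agents: once the
# assistant's output is attacker-controlled, the summariser receives
# untrusted input through the delegation channel.
edges.append(("attacker_controls_output(assistant)",
              "task_delegation",
              "reads_untrusted_input(summariser)"))
```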
The authors demonstrate ATAG’s efficacy by applying it to two multi-agent applications, showing that the framework can model complex, multi-step attacks that exploit LLM vulnerabilities across interconnected agents. These case studies illustrate how ATAG identifies vulnerabilities and maps potential attack vectors, giving security professionals actionable insights. For example, ATAG can visualise how a compromised agent could use prompt injection to manipulate another agent’s behaviour, leading to a cascading failure across the system.
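That cascading scenario can be traced mechanically once the graph exists. Below is a minimal depth-first search over an edge list mirroring the previous sketch; it illustrates path enumeration in general, not the framework's own algorithm.

```python
from collections import defaultdict

# Edge list mirroring the previous sketch: (from_state, action, to_state).
edges = [
    ("reads_untrusted_input(assistant)", "prompt_injection",
     "attacker_controls_output(assistant)"),
    ("attacker_controls_output(assistant)", "task_delegation",
     "reads_untrusted_input(summariser)"),
    ("reads_untrusted_input(summariser)", "prompt_injection",
     "attacker_controls_output(summariser)"),
    ("attacker_controls_output(summariser)", "data_exfiltration",
     "sensitive_data_disclosed(summariser)"),
]

adj = defaultdict(list)
for src, action, dst in edges:
    adj[src].append((action, dst))

def attack_paths(state, goal, path=()):
    """Yield sequences of (action, next_state) edges from state to goal."""
    if state == goal:
        yield path
        return
    for action, nxt in adj[state]:
        if all(nxt != seen for _, seen in path):  # avoid revisiting states
            yield from attack_paths(nxt, goal, path + ((action, nxt),))

for p in attack_paths("reads_untrusted_input(assistant)",
                      "sensitive_data_disclosed(summariser)"):
    print(" -> ".join(action for action, _ in p))
# prints: prompt_injection -> task_delegation -> prompt_injection -> data_exfiltration
```

The printed path is exactly the cascading failure described above: injection into the first agent propagates through task delegation, is re-injected into the second agent, and ends in data disclosure.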
This research contributes a valuable methodology for securing MAAS that moves beyond traditional security assessment approaches. By combining a data-driven vulnerability catalogue (the LVD) with automated attack graph generation, ATAG offers a systematic means of identifying and mitigating risks before they are exploited. Its ability to model LLM-specific vulnerabilities distinguishes it from traditional security assessment tools and positions it as a step towards more resilient and secure AI systems.
A key component of this work is the LVD itself, which standardises how LLM weaknesses are documented so that researchers and practitioners can describe them in a common format. Keeping the database up to date is crucial, as new vulnerabilities are continually discovered.
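To make the idea of standardised documentation concrete, here is one hypothetical shape for an LVD record; the identifier format and field names are assumptions for illustration, not the database's actual schema.

```python
# A hypothetical LVD record; identifier format and field names are
# assumptions for illustration, not the database's actual schema.
from dataclasses import dataclass

@dataclass
class LvdEntry:
    identifier: str           # hypothetical ID, e.g. "LVD-0001"
    name: str
    description: str
    severity: str             # e.g. "low" | "medium" | "high" | "critical"
    preconditions: list       # what an attacker needs before exploiting
    consequences: list        # postconditions enabling further attack steps

prompt_injection = LvdEntry(
    identifier="LVD-0001",
    name="prompt injection",
    description="Crafted input causes the LLM to follow attacker "
                "instructions instead of its intended guidelines.",
    severity="high",
    preconditions=["agent processes untrusted input"],
    consequences=["attacker influences the agent's output"],
)
```

Structuring records this way lets preconditions and consequences feed directly into the edge-generation step sketched earlier.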
The research highlights the limitations of traditional security assessment approaches when applied to MAAS and demonstrates how ATAG enables proactive identification and mitigation of AI-agent threats, fostering trust in AI systems. Scalability to large systems remains a key area for future development.
👉 More information
🗞 ATAG: AI-Agent Application Threat Assessment with Attack Graphs
🧠 DOI: https://doi.org/10.48550/arXiv.2506.02859
