The increasing prevalence of large language model (LLM) systems, powering everything from chatbots to autonomous agents, presents a novel and largely unaddressed security challenge. Ben Nassi from Tel Aviv University, alongside Bruce Schneier and Oleg Brodt of Ben-Gurion University of the Negev, detail how attacks on these systems are evolving beyond simple “prompt injection” and are beginning to resemble traditional malware campaigns. Their research introduces the concept of “promptware”, a distinct class of malware, and proposes a five-step “kill chain” model to analyse these threats, encompassing initial access, privilege escalation, persistence, lateral movement and objective execution. This framework demonstrates a systematic pattern to LLM attacks, offering security professionals a crucial methodology for threat modelling and a shared understanding of this rapidly developing landscape. By identifying these parallels to established malware, the authors highlight the urgent need to adapt existing cybersecurity practices to effectively defend against promptware.

Promptware Attacks and the LLM Kill Chain

This research introduces “promptware” as a distinct class of malware targeting large language model (LLM)-based systems, demonstrating that attacks increasingly resemble traditional, multi-step malware campaigns. Researchers moved beyond the simplistic view of “prompt injection” by establishing a five-step kill chain model mirroring traditional cybersecurity practices. This innovative approach enabled a systematic analysis of LLM-based attacks, revealing their multi-stage nature and underlying patterns, and providing analytical clarity and a systematic approach to risk assessment. The research defines Initial Access as encompassing both direct and indirect prompt injection techniques, where malicious payloads enter the LLM’s context window through various modalities.

Subsequent analysis focuses on Privilege Escalation, achieved through jailbreaking methods that bypass safety training to unlock restricted capabilities within the LLM. Scientists then detail Persistence, where attackers establish a durable foothold by corrupting long-term memory components, ensuring continued access and control. Further investigation reveals Lateral Movement, where the attack propagates across users, devices, or connected services, utilizing both on-device and off-device techniques. Finally, the study examines Actions on Objective, representing the attacker’s ultimate goal, ranging from remote code execution and data exfiltration to ransomware deployment.

This detailed kill chain draws heavily from the established Cyber Kill Chain, adapting its structure to the unique characteristics of LLM-based systems. By applying this framework, the work demonstrates that promptware attacks consistently follow systematic, multi-stage sequences, facilitating structured analysis and improved security measures. The team measured the potential for data exfiltration, identifying a “Lethal Trifecta” of conditions , access to sensitive data, exposure to untrusted content, and external communication ability , that, when met, allows for effective payload delivery. Data shows that compromised LLM-powered email assistants can exfiltrate correspondence, while agents with smart home access can directly affect the physical environment.

Researchers recorded instances of persistent exfiltration channels activating on every subsequent inference within ChatGPT, and cross-service data movement from internal repositories to attacker-controlled destinations. These findings highlight the significant risk posed by LLM applications with broad access to data and capabilities. Results demonstrate that promptware can escalate to physical and financial impacts, with experiments showing the ability to control smart home devices through exploitation of integrations with systems like Google Assistant. A trivial attack saw an attacker manipulate a ChatGPT chatbot to agree to sell a vehicle for one dollar, proving the potential for real economic consequences.

More significantly, an attack on a cryptocurrency trading agent resulted in the transfer of approximately 55 ETH to an attacker’s wallet, achieved without exploiting the underlying blockchain. The breakthrough delivers evidence of remote code execution, with research demonstrating exploitation of AI IDE shell tools to achieve this outcome. The study confirms that the severity of actions on objective is directly linked to tool access, permission scope, and automation level. Scientists established that fully autonomous agents, operating without oversight, represent the highest risk configuration, while human-in-the-loop designs offer a degree of mitigation.

This work provides a structured methodology for threat modelling and a common vocabulary for security practitioners to address this rapidly evolving threat landscape. The authors acknowledge that current defences like guardrails and alignment training are susceptible to sophisticated attacks. They present the kill chain as a tool for structured thinking, not a rigid classification, and recognise that attackers may adapt techniques to bypass or combine stages. Future work could explore the development of specific defensive measures targeting each stage of the kill chain, and investigate how the framework can be applied to emerging LLM architectures and attack vectors. The research calls for a shift in how we approach LLM security, moving beyond simple input validation to a more comprehensive, layered defense strategy.

Promptware Kill Chain for LLM Attacks

The study pioneers a new framework for understanding attacks on large language models (LLMs), proposing that these threats constitute a distinct class of malware termed “promptware”. Researchers moved beyond the simplistic view of “prompt injection” by establishing a five-step kill chain model mirroring traditional malware campaigns. This innovative approach enabled a systematic analysis of LLM-based attacks, revealing their multi-stage nature and underlying patterns. The team meticulously mapped recent attacks onto this kill chain, demonstrating the analogous structure between promptware and conventional malware.

Promptware Kill Chain in Large Language Models

Scientists have identified a distinct class of malware, termed “promptware”, targeting large language model (LLM)-based systems and established a five-step “kill chain” to analyse these emerging threats. The research details a systematic progression mirroring traditional malware campaigns, beginning with Initial Access via prompt injection, followed by Privilege Escalation, often achieved through jailbreaking the LLM. Experiments revealed that promptware attacks consistently move through stages of Persistence, establishing footholds through memory and retrieval poisoning, before attempting Lateral Movement across systems and users. This comprehensive analysis demonstrates that LLM-based attacks are not isolated incidents but rather coordinated sequences of malicious actions.

👉 More information
🗞 The Promptware Kill Chain: How Prompt Injections Gradually Evolved Into a Multi-Step Malware
🧠 ArXiv: https://arxiv.org/abs/2601.09625

Tags:

Data Exfiltration kill chain large language model lateral movement memory poisoning privilege escalation prompt injection promptware retrieval poisoning Threat Modeling

Promptware Kill Chain Advances Security Analysis of Multi-Step Malware Attacks on Large Language Models

Promptware Attacks and the LLM Kill Chain

Promptware Kill Chain for LLM Attacks

Promptware Kill Chain in Large Language Models

Rohail T.

Latest Posts by Rohail T.:

Protected: Models Achieve Reliable Accuracy and Exploit Atomic Interactions Efficiently

Protected: Quantum Computing Tackles Fluid Dynamics with a New, Flexible Algorithm

Protected: Silicon Unlocks Potential for Long-Distance Quantum Communication Networks