Anthropic’s Claude Mythos Preview identifies vulnerabilities in code and chains them into complete cyberattack sequences, a capability that prompted a cautious rollout via Project Glasswing rather than broad public access. Researchers at Anthropic collaborated with benchmark developers to rigorously test the model, running Mythos Preview on an updated version of SCONE-bench alongside the newly created ExploitBench and ExploitGym. “Mythos Preview is capable of finding complex vulnerabilities, but what concerned us most in our internal testing was that Mythos Preview could both turn vulnerabilities into exploit primitives, and combine those primitives together into complete end-to-end attack chains,” stated Newton Cheng, Keane Lucas, Winnie Xiao, Nicholas Carlini, and Milad Nasr. This advancement suggests a significant lowering of the skill level needed to create exploits, potentially democratizing access to powerful cyberattack tools and increasing associated risks.

Mythos Preview Demonstrates Advanced Exploit Development Capabilities

Internal testing revealed the model’s capacity to locate vulnerabilities and, crucially, to turn them into exploit primitives and combine those primitives into complete, end-to-end attack chains, a level of sophistication previously unseen in large language models. Anthropic sought to quantify these capabilities rigorously, but existing benchmarks proved inadequate, leading to the development and adoption of new evaluation standards. Researchers ran Mythos Preview on an updated version of SCONE-bench and collaborated with the teams behind ExploitBench and ExploitGym to assess its performance. ExploitBench, created by Seunghyun Lee and Professor David Brumley of Carnegie Mellon University and Bugcrowd, moves beyond simple vulnerability identification and focuses on the ability to construct complete exploits. Unlike prior benchmarks that merely confirmed a bug’s reproducibility, ExploitBench requires models to achieve concrete exploit capabilities such as arbitrary code execution (ACE).

The benchmark dissects exploit development into 16 distinct capabilities, verified programmatically, and utilizes a V8 benchmark based on 41 patched vulnerabilities in the widely-used V8 JavaScript and WebAssembly engine. Testing is conducted against security defenses, such as the V8 sandbox, with the highest scoring tier representing complete control over a browser tab. The results demonstrate a clear performance advantage for Mythos Preview. “Consistent with our previous findings on Mozilla Firefox, all language models can reach or trigger the given vulnerabilities, but only models since Claude Opus 4.6 make any progress in developing primitives inside the V8 sandbox,” Anthropic reports. Combining Baseline and Nudged variants, Mythos Preview achieved ACE on 21 out of 41 CVEs, while the next best model only managed 2, and that required a proprietary scaffold.
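To make the headline comparison concrete, the short Python sketch below works through the arithmetic behind “21 out of 41” versus “2 out of 41”. It assumes that “combining Baseline and Nudged variants” means a CVE counts as solved when either variant reaches ACE; the article does not spell out the rule, so treat that as an assumption. The function names and the toy identifiers are hypothetical and are not benchmark data.

```python
from typing import Iterable, Set


def combined_solved(baseline: Iterable[str], nudged: Iterable[str]) -> Set[str]:
    """Assumed combination rule: a CVE is solved if either variant reached ACE."""
    return set(baseline) | set(nudged)


def success_rate(solved: int, total: int) -> float:
    """Fraction of benchmark CVEs on which ACE was achieved."""
    if total <= 0:
        raise ValueError("total must be positive")
    return solved / total


# Toy placeholder identifiers (hypothetical, not real results).
toy_baseline = {"bug-a", "bug-b"}
toy_nudged = {"bug-b", "bug-c"}
toy_combined = combined_solved(toy_baseline, toy_nudged)

# Counts as reported in the article: 21 of 41 versus 2 of 41.
mythos_rate = success_rate(21, 41)
next_best_rate = success_rate(2, 41)
print(f"{mythos_rate:.1%} vs {next_best_rate:.1%}")  # 51.2% vs 4.9%
```

On the reported counts, Mythos Preview reaches ACE on just over half of the CVEs, against roughly one in twenty for the next best model.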

Qualitative analysis further highlights the model’s skill; in one instance, Mythos Preview generated a near-deterministic exploit for vulnerability CVE-, where existing exploits were probabilistic, a critical advantage for commercially valuable exploits. Lee noted that he had discussed the possibility of this exploit plan with the original author of the 1-day v8CTF exploit, but they quickly dismissed it due to the complexity of the approach.

ExploitBench Measures End-to-End Exploit Primitives in V8

The capabilities of large language models are now being rigorously assessed not just for creative tasks, but for their potential in cybersecurity, specifically exploit development. Earlier evaluations focused on confirming the reproducibility of known vulnerabilities, but a new wave of benchmarks is demanding more: the creation of complete, functional exploits from scratch. This shift in evaluation criteria reflects growing concern that increasingly sophisticated models could dramatically lower the barrier to entry for malicious actors. Anthropic, the company behind the Claude model family, has been assessing these capabilities, and its latest model, Mythos Preview, is proving to be a pivotal test case. Unlike previous tests, ExploitBench measures a language model’s ability to construct exploits, moving beyond simply identifying a bug to demonstrating its practical exploitation. The benchmark dissects the exploit development process into 16 distinct capabilities, categorized across five tiers.

This framework is applied to 41 patched vulnerabilities in the V8 JavaScript and WebAssembly engine, a critical component powering applications like Chrome, Node.js, and Electron. A key aspect of the testing involves challenging models against V8’s security defenses, including its sandbox, which isolates webpage code to prevent broader system compromise. Anthropic collaborated with the researchers who produced these benchmarks to measure Mythos Preview’s performance. It also ran the model on an updated version of SCONE-bench, a smart contract exploitation benchmark developed in collaboration with MATS and the Anthropic Fellows Program, and on the newly developed ExploitGym. Of one V8 exploit, Lee said that Mythos “executed this cleanly and flawlessly without any publicly available information on this specific exploit technique.” Results like these prompted the company to release Mythos Preview through Project Glasswing, a controlled rollout designed to mitigate potential misuse, rather than through a general release.

Whereas the strongest models from February of this year could only barely develop exploits in simulated scenarios with most defense measures disabled, Mythos Preview is able to construct full end-to-end exploits on the world’s most widely-used software.

V8 Engine Benchmarking: Bypassing Sandbox Defenses for ACE

Anthropic’s Claude Mythos Preview is changing which capabilities vulnerability research benchmarks need to measure, and it raises concerns about how accessible sophisticated exploit development is becoming. Initial evaluations focused on confirming whether a model could reproduce known vulnerabilities, but the focus has rapidly shifted to testing a model’s ability to construct complete attack chains. ExploitBench applies this test to V8, the JavaScript and WebAssembly engine that powers Chrome, Node.js, and Electron. A key element of the framework is testing against security defenses; the V8 sandbox is designed to isolate JavaScript objects and prevent vulnerabilities from escalating into broader system compromises. The benchmark dissects exploit development into 16 distinct capabilities, categorized into five tiers, culminating in arbitrary code execution (ACE), the highest level of control. Researchers noted that escaping the V8 sandbox, going from T3 to T2, is a significant challenge, and Mythos Preview is the only tested model that reliably achieves this in more than half of the tested environments.
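The “over half of tested environments” claim can be read as a simple rate over per-environment results, sketched below in Python. The sketch assumes that lower tier numbers mean more control, which is inferred only from the phrase “going from T3 to T2”; the article does not define the individual tiers. The `Tier` enum, the helper names, and the toy data are hypothetical.

```python
from enum import IntEnum
from typing import Sequence


class Tier(IntEnum):
    """Five tiers; assumed ordering: a smaller number means more control."""
    T5 = 5
    T4 = 4
    T3 = 3  # assumed: still inside the V8 sandbox
    T2 = 2  # assumed: sandbox escaped
    T1 = 1


def escaped_sandbox(best: Tier) -> bool:
    """True if the best tier reached in an environment is T2 or beyond."""
    return best <= Tier.T2


def escape_rate(best_tiers: Sequence[Tier]) -> float:
    """Fraction of environments in which the sandbox was escaped."""
    if not best_tiers:
        raise ValueError("need at least one environment")
    return sum(escaped_sandbox(t) for t in best_tiers) / len(best_tiers)


# Toy data (hypothetical): best tier reached in four environments.
toy = [Tier.T1, Tier.T2, Tier.T3, Tier.T5]
print(escape_rate(toy))  # 0.5
```

Under this reading, a model clears the bar described in the article when `escape_rate` exceeds 0.5 across the tested environments.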

ExploitGym Evaluates Exploitation Across Diverse Software Targets

The concerns stem from internal testing, which revealed Mythos Preview’s capacity not only to identify vulnerabilities but to construct complete attack chains from them, a capability that previously required significant expertise. ExploitBench specifically assesses a language model’s ability to write complete, end-to-end exploits, decomposing the process into 16 distinct capabilities verified programmatically, and it targets V8, the engine behind software such as Node.js and VS Code. Complementing ExploitBench, ExploitGym expands the scope of evaluation to a wider range of software targets, encompassing 898 patched vulnerabilities across OSS-Fuzz, V8, and the Linux kernel. Developed by researchers at UC Berkeley, the Max Planck Institute for Security and Privacy, UC Santa Barbara, and Arizona State University, ExploitGym tasks models with developing exploits that achieve unauthorized code execution and retrieve a dynamically generated flag.
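The “dynamically generated flag” check is a common pattern in capture-the-flag style grading, and a minimal Python sketch of the idea follows. It is not ExploitGym’s actual harness: the flag format, the function names, and the substring check are all assumptions made for illustration.

```python
import secrets


def new_flag(prefix: str = "FLAG") -> str:
    """Create a fresh, unguessable flag for a single evaluation run."""
    return f"{prefix}{{{secrets.token_hex(16)}}}"


def exploit_succeeded(expected_flag: str, exploit_output: str) -> bool:
    """Count the run as a success only if the output contains this run's flag."""
    return expected_flag in exploit_output


# Hypothetical run: the grader plants the flag where only code execution
# on the target could read it, then inspects what the exploit returned.
flag = new_flag()
print(exploit_succeeded(flag, f"leaked: {flag}\n"))   # True
print(exploit_succeeded(flag, "leaked: FLAG{guess}"))  # False
```

Because the flag changes on every run, a model cannot pass by repeating memorized output; it has to read the value from the target, which is what makes the check programmatic evidence of code execution.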

Mythos Preview Achieves Stability and Precision in CVE Exploits

The assumption that developing functional exploits requires years of specialized knowledge is increasingly challenged by advances in artificial intelligence; recent results from Anthropic demonstrate a significant shift in the landscape of vulnerability exploitation. Mythos Preview’s ability to chain vulnerabilities together represents a substantial leap beyond simply confirming a bug’s existence. Anthropic sought rigorous, quantitative benchmarks to assess the model’s performance rather than relying on qualitative evaluations alone. Recognizing the limitations of existing tools, the team ran Mythos Preview on an updated version of SCONE-bench, designed for smart contract exploitation, and collaborated with the researchers who produced ExploitBench and ExploitGym. Across all three benchmarks, Mythos Preview consistently outperformed other evaluated models. Anthropic researchers anticipate that the increasing accessibility of Mythos-level capabilities will significantly lower the barrier to entry for exploit development, potentially putting powerful cyberattack tools within wider reach and increasing the associated risks.

“I have privately discussed the possibility of precisely this exploit plan with the original author of the 1-day v8CTF exploit, which we quickly dismissed due to the complexity of the approach. Mythos executed this cleanly and flawlessly without any publicly available information on this specific exploit technique,” Lee said.

Source: https://red.anthropic.com/2026/exploit-evals/


Ivy Delaney
