Vera CPU Delivers 50% Faster Per-Core Performance for AI

NVIDIA Vice President of Hyperscale and High-Performance Computing Ian Buck delivered the first Vera CPUs to Anthropic, OpenAI, and SpaceXAI on Friday, marking a shift in the company’s strategy and a new focus on CPU design for artificial intelligence. Unlike traditional approaches prioritizing increased core density, the Vera CPU packs 88 custom NVIDIA-designed Olympus cores and boasts 50% faster per-core performance, addressing the growing demands of “agentic AI.” “Agentic AI is creating a new opportunity in the AI factory — as models move from answering to acting, Vera is purpose-built to keep that work moving at scale,” Buck said. This new class of CPU is designed to handle the increasing CPU workload associated with AI agents, tool calls, orchestration, and data retrieval, complementing GPU acceleration.

NVIDIA Vera CPU: Addressing Demands of Agentic AI

NVIDIA’s new Vera CPU features 88 custom-designed Olympus cores, a deliberate shift away from simply increasing core counts and toward an architecture optimized for the unique demands of agentic AI systems. This isn’t about brute force; it’s about a fundamental rethinking of how CPUs handle the emerging workloads that underpin increasingly autonomous AI. The impetus behind Vera stems from the realization that agentic AI, where models transition from answering questions to acting on them, places unprecedented strain on CPU resources. Traditional CPU designs, focused on maximizing core density, struggle with the concurrent, real-time demands of these agent-driven processes. Vera addresses this with a claimed 50% faster per-core performance, allowing for quicker completion of tasks and improved overall efficiency.

Oracle Cloud Infrastructure (OCI) is already preparing to deploy hundreds of thousands of the new CPUs, recognizing the potential for sustained performance at scale. “Vera’s architecture is purpose-built for high-throughput reasoning workloads, delivering the efficiency, density and footprint OCI needs to power the next generation of enterprise AI,” explained Karan Batta, who leads overall product management at OCI. The CPU’s ability to rapidly generate code, a critical function for AI agents, is a key differentiator. Buck continued, explaining that when AI models are posed a question, the answer often isn’t prepped and ready to go; the models actually have to generate some Python code to arrive at the correct answer. This capability, coupled with its integration into NVIDIA’s broader ecosystem including the Rubin GPU, positions Vera as a foundational component of the next wave of AI infrastructure.

Core Olympus Architecture & Performance Specifications

The shift towards agentic artificial intelligence is driving demand for specialized hardware, and NVIDIA’s Vera CPU represents a departure from conventional designs prioritizing core density. Packing 88 custom NVIDIA-designed Olympus cores into a single processor, Vera aims to address the increasing computational load placed on CPUs by AI agents, tasks like tool calls, orchestration, and data retrieval that complement, rather than replace, GPU acceleration. This isn’t simply about adding more cores; it’s about optimizing the architecture for the specific demands of a new class of AI workload. NVIDIA claims a 50% faster per-core performance for Vera, a figure intended to alleviate bottlenecks created by the constant, concurrent demands of agentic systems. Every agentic sandbox and every long-context retrieval operation requires CPU work, according to NVIDIA. The CPU’s 1.2 terabytes per second of memory bandwidth further supports this goal, ensuring rapid data access for real-time processing under sustained load.

This emphasis on efficiency is critical, as “scaling compute is an important accelerant for the growth of models,” explains James Bradbury, Anthropic’s head of compute, who received one of the first Vera CPUs. Buck detailed how Vera excels at generating Python code, a crucial function for AI agents responding to complex queries.

Agentic AI is creating a new CPU moment in the AI factory – as models move from answering to acting, Vera is purpose-built to keep that work moving at scale.

Vera CPU Deployment: Initial Labs & Oracle Cloud Infrastructure

Buck delivered the first CPUs personally to Anthropic in San Francisco, OpenAI in Mission Bay, and SpaceXAI in Palo Alto before extending the rollout to Oracle Cloud Infrastructure in Santa Clara on Monday. This delivery method underscores NVIDIA’s commitment to these early adopters and their agentic AI workloads. SpaceXAI’s evaluation will focus on reinforcement learning and agent-based simulation pipelines, with Elon Musk specifically questioning the CPU’s core count, memory layout, and cooling system. Buck explained that agentic AI often requires on-the-fly code generation.

OCI plans to deploy hundreds of thousands of NVIDIA Vera CPUs beginning in because agentic AI demands sustained performance at massive scale.

Stay current. See today’s quantum computing news on Quantum Zeitgeist for the latest breakthroughs in qubits, hardware, algorithms, and industry deals.
The Quant

The Quant

The Quant possesses over two decades of experience in start-up ventures and financial arenas, brings a unique and insightful perspective to the quantum computing sector. This extensive background combines the agility and innovation typical of start-up environments with the rigor and analytical depth required in finance. Such a blend of skills is particularly valuable in understanding and navigating the complex, rapidly evolving landscape of quantum computing and quantum technology marketplaces. The quantum technology marketplace is burgeoning, with immense growth potential. This expansion is not just limited to the technology itself but extends to a wide array of applications in different industries, including finance, healthcare, logistics, and more.

Latest Posts by The Quant: