NVIDIA announces “Vera” CPU for agent-based AI
At GTC 2026, NVIDIA announced “Vera,” a new CPU designed for agent-based AI. It delivers twice the power efficiency and 50% higher speed than conventional CPUs, and its architecture assumes tight coupling with the GPU.
Why do we need a dedicated CPU for AI?
While the GPU plays the leading role in AI inference workloads, the CPU plays supporting roles such as orchestration, data preprocessing, memory management, and agent state management. General-purpose CPUs are not optimized for these tasks and can easily become a bottleneck.
In agent-based AI in particular (systems that autonomously decompose and execute tasks), multiple model calls and external API interactions run continuously. Here, overall performance is determined by the throughput of the orchestration processing that keeps the GPU from sitting idle. It is in this context that CEO Jensen Huang described it as “a turning point in the era of AI reasoning and action.”
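As a hypothetical sketch of why orchestration throughput matters (our own illustration, not NVIDIA code; all durations are stand-ins): if CPU-side preprocessing of request N+1 overlaps GPU inference of request N, the GPU is never left waiting on orchestration work.

```python
import asyncio

# Illustrative stand-ins: each stage takes 50 ms of simulated work.

async def cpu_preprocess(request: str) -> str:
    await asyncio.sleep(0.05)              # tokenization, state lookup, etc.
    return f"prep({request})"

async def gpu_inference(prepped: str) -> str:
    await asyncio.sleep(0.05)              # model forward pass on the GPU
    return f"result({prepped})"

async def serve(requests: list[str]) -> list[str]:
    """Pipeline: while the GPU runs request N, the CPU prepares N+1."""
    results = []
    prepped = await cpu_preprocess(requests[0])
    for i in range(len(requests)):
        infer = asyncio.create_task(gpu_inference(prepped))
        if i + 1 < len(requests):
            prepped = await cpu_preprocess(requests[i + 1])
        results.append(await infer)
    return results
```

With this overlap, N requests take roughly (N + 1) stage-times instead of 2N; a slow CPU stage would break the overlap and stall the GPU, which is the bottleneck the article describes.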
Architecture details
Core configuration
The Vera CPU is equipped with 88 custom-designed “Olympus” cores. Each core can execute two tasks simultaneously using a technology called Spatial Multithreading: when one task's thread stalls on a memory access, the core immediately switches to the other task so that its execution units do not sit idle.
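The scheduling idea can be sketched as follows (a conceptual illustration of latency hiding, not NVIDIA's implementation): a core holds two task slots and hands execution to the other slot whenever the active task issues a memory load.

```python
# Each task is a list of ops: "compute" keeps the core, "load" issues a
# memory request and yields the core to the other slot while it waits.

def run_core(tasks: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Interleave two task slots; a "load" op stalls its task and switches."""
    queues = {name: list(ops) for name, ops in tasks.items()}
    order = list(queues)                  # the two slot names
    timeline, i = [], 0
    while any(queues.values()):
        name = order[i % 2]
        if not queues[name]:              # this slot has finished
            i += 1
            continue
        op = queues[name].pop(0)
        timeline.append((name, op))
        if op == "load":                  # memory stall: switch slots
            i += 1
    return timeline
```

The timeline shows task B's compute filling the cycles task A spends waiting on memory, which is the effect the core design aims for.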
Memory Bandwidth
A comparison with a conventional CPU:

| Specification | Conventional CPU | Vera CPU |
|---|---|---|
| Memory standard | DDR5 | LPDDR5X |
| Bandwidth | ~600 GB/s | 1.2 TB/s |
| Power consumption | Baseline | Approx. 50% reduction |
LPDDR5X is a low-power memory standard that originated in mobile devices, scaled up here to balance bandwidth with power efficiency. Agent processing frequently exchanges large amounts of context data between the CPU and GPU, so this bandwidth improvement translates directly into a performance difference.
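A back-of-envelope calculation makes the table concrete. The context size below is an illustrative assumption, not an NVIDIA figure; the bandwidths are the ones from the table above, and latency and protocol overhead are ignored.

```python
def transfer_ms(gigabytes: float, bandwidth_gb_s: float) -> float:
    """Ideal transfer time in milliseconds at a given bandwidth."""
    return gigabytes / bandwidth_gb_s * 1000

context_gb = 30  # assumed size of a large agent-state / KV-cache snapshot

conventional_ms = transfer_ms(context_gb, 600)   # ~600 GB/s DDR5  -> 50 ms
vera_ms = transfer_ms(context_gb, 1200)          # 1.2 TB/s LPDDR5X -> 25 ms
```

Per exchange the saving is tens of milliseconds; since agent loops perform such exchanges continuously, the doubled bandwidth compounds across every step.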
GPU connection
NVLink-C2C is used to connect to the GPU, ensuring a bandwidth of 1.8TB/s. Unlike connections via PCIe, data transfer between the CPU and GPU is less likely to become a bottleneck.
```mermaid
flowchart LR
    A[Vera CPU<br/>88x Olympus Cores<br/>1.2TB/s LPDDR5X] -->|NVLink-C2C<br/>1.8TB/s| B[NVIDIA GPU<br/>e.g. H200/B200]
    A --> C[Agent<br/>orchestration]
    A --> D[Data pre-/<br/>post-processing]
    B --> E[Model inference<br/>and training]
```
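To see why the PCIe comparison matters, here is a rough calculation. The ~64 GB/s figure is a commonly cited approximate per-direction rate for a PCIe 5.0 x16 link (an assumption on our part, not from the article); the 1.8 TB/s figure is from the text above.

```python
PCIE5_X16_GB_S = 64       # approximate PCIe 5.0 x16, per direction (assumed)
NVLINK_C2C_GB_S = 1800    # NVLink-C2C bandwidth from the article

def transfer_ms(gigabytes: float, bandwidth_gb_s: float) -> float:
    """Ideal transfer time in milliseconds, ignoring latency and overhead."""
    return gigabytes / bandwidth_gb_s * 1000

payload_gb = 9.0  # illustrative CPU-to-GPU context handoff

pcie_ms = transfer_ms(payload_gb, PCIE5_X16_GB_S)     # ~140.6 ms
nvlink_ms = transfer_ms(payload_gb, NVLINK_C2C_GB_S)  # 5.0 ms
```

Under these assumptions NVLink-C2C is roughly 28x faster than PCIe 5.0 x16, which is why CPU-GPU transfers are far less likely to bottleneck.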
Rack configuration and large-scale deployment
Vera is designed not only for standalone performance but also for rack-scale deployment.
The new Vera CPU rack integrates 256 liquid-cooled Vera CPUs and can support up to 22,500 simultaneous CPU environments. The deployment of agent-based AI in data centers assumes a scenario where thousands of agents operate in parallel, and operation at this scale is a prerequisite.
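The rack-scale figures are internally consistent, as a quick check shows: 256 CPUs with 88 cores each gives 22,528 cores, matching the "up to 22,500 simultaneous environments" claim at roughly one environment per physical core (our inference from the stated numbers).

```python
cpus_per_rack = 256
cores_per_cpu = 88

total_cores = cpus_per_rack * cores_per_cpu    # 22,528 cores per rack
envs_per_cpu = 22_500 / cpus_per_rack          # ~87.9 environments per CPU
```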
Adoption partners
At the time of the announcement, a wide range of companies had announced plans to adopt it.
| Category | Companies |
|---|---|
| Hyperscalers | Alibaba, Meta, Oracle Cloud Infrastructure, CoreWeave |
| System manufacturers | Dell, HPE, Lenovo, Supermicro |
| Cloud | ByteDance, Cloudflare, Crusoe |
Cloudflare’s inclusion on this list is highly notable. The company is actively building agent-type AI infrastructure such as edge AI inference and MCP servers, which are compatible with Vera CPU use cases.
Main use cases
NVIDIA's intended use cases for the Vera CPU:
- Coding assistants (Copilot-style backends)
- Agentic reasoning (execution of multi-step tasks)
- Reinforcement learning (orchestration of training loops)
- Data processing pipelines
- Multi-agent orchestration
Availability
Vera CPUs will be available through adoption partners in the second half of 2026.