
NVIDIA announces “Vera” CPU for agent-based AI


At GTC 2026, NVIDIA announced “Vera,” a new CPU designed for agent-based AI. The company claims twice the power efficiency and 50% higher performance than conventional CPUs, with an architecture built around tight coupling with the GPU.

Why do we need a dedicated CPU for AI?

While the GPU plays the leading role in AI inference workloads, the CPU plays supporting roles such as orchestration, data preprocessing, memory management, and agent state management. General-purpose CPUs are not optimized for these tasks and can easily become a bottleneck.

In particular, in agent-based AI (systems that autonomously decompose and execute tasks), multiple model calls and external API interactions run continuously. In this regime, overall performance is determined by the throughput of the orchestration processing that keeps the GPU from sitting idle. It is in this context that CEO Jensen Huang described the chip as “a turning point in the era of AI reasoning and action.”
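To make the CPU's role concrete, here is a minimal sketch of an agent loop. `call_model` and `call_tool` are hypothetical stubs (not an NVIDIA API) standing in for GPU inference and external tool calls; the point is that CPU-side orchestration sits on the critical path between every pair of GPU calls.

```python
def call_model(context):
    """Stub for the GPU-bound step: decide the next action from context."""
    if len(context) >= 3:
        return {"type": "finish", "answer": context[-1]}
    return {"type": "tool", "name": "search", "query": context[-1]}

def call_tool(action):
    """Stub for the CPU-bound step: external API call and result parsing."""
    return f"result_of_{action['query']}"

def run_agent(task, max_steps=5):
    """Agent loop: GPU inference and CPU orchestration strictly alternate,
    so slow CPU-side work directly lowers GPU utilization."""
    context = [task]
    for _ in range(max_steps):
        action = call_model(context)        # GPU-bound inference
        if action["type"] == "finish":
            return action["answer"]
        context.append(call_tool(action))   # CPU-bound orchestration
    return None
```

With thousands of such loops running in parallel, the orchestration half of each iteration is exactly the work Vera is meant to accelerate.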

Architecture details

Core configuration

The Vera CPU is equipped with 88 custom-designed “Olympus” cores. Each core can execute two tasks simultaneously using a technology NVIDIA calls Spatial Multithreading: when one task's thread stalls waiting on memory, the core immediately switches to the other task so that its execution units are not left idle.
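The latency-hiding idea can be illustrated with a toy scheduler. This is purely an illustrative model, not the Olympus core's actual mechanism: each task is a list of `"C"` (one compute cycle) and `"S"` (issue a memory request, then wait) steps, and the core switches to the other task instead of idling during a stall.

```python
def run_core(tasks, stall_cycles=4):
    """Simulate one core with two task slots. Returns (total_cycles, compute_cycles).
    On a memory stall the core switches tasks rather than idling."""
    ready_at = [0] * len(tasks)        # cycle at which each task can run again
    queues = [list(t) for t in tasks]
    cycle, busy = 0, 0
    while any(queues):
        ran = False
        for i, q in enumerate(queues):
            if q and ready_at[i] <= cycle:
                op = q.pop(0)
                if op == "C":
                    busy += 1          # useful compute cycle
                else:
                    ready_at[i] = cycle + stall_cycles  # task waits on memory
                cycle += 1
                ran = True
                break
        if not ran:
            cycle += 1                 # all tasks stalled: core truly idles
    return cycle, busy

print(run_core([["C", "S", "C"]]))                      # one task: (6, 2)
print(run_core([["C", "S", "C"], ["C", "S", "C"]]))     # two tasks: (8, 4)
```

Run back-to-back, the two tasks would take 12 cycles; interleaved they finish in 8, because one task's compute overlaps the other's memory wait.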

Memory Bandwidth

The comparison with a conventional CPU is as follows.

| Specification | Conventional CPU | Vera CPU |
| --- | --- | --- |
| Memory standard | DDR5 | LPDDR5X |
| Bandwidth | ~600 GB/s | 1.2 TB/s |
| Power consumption | Baseline | ~50% reduction |

LPDDR5X is a low-power memory standard that originated in mobile devices, scaled up here to balance bandwidth and power efficiency. In agent processing, large volumes of context data are exchanged frequently between the CPU and the GPU, so this bandwidth improvement translates directly into a performance difference.
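A back-of-envelope calculation shows what the bandwidth figures in the table mean in practice (the 2 GB context payload is an arbitrary assumption for illustration):

```python
def transfer_ms(gigabytes: float, bandwidth_gb_s: float) -> float:
    """Time to move a payload at a given bandwidth, in milliseconds."""
    return gigabytes / bandwidth_gb_s * 1000.0

# Hypothetical 2 GB batch of agent context data:
print(transfer_ms(2, 600))    # conventional, ~600 GB/s -> ~3.33 ms
print(transfer_ms(2, 1200))   # Vera LPDDR5X, 1.2 TB/s  -> ~1.67 ms
```

Halving every such transfer matters when it happens on each step of thousands of concurrent agent loops.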

GPU connection

NVLink-C2C is used to connect to the GPU, ensuring a bandwidth of 1.8TB/s. Unlike connections via PCIe, data transfer between the CPU and GPU is less likely to become a bottleneck.
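For scale, the NVLink-C2C figure can be compared against PCIe. The PCIe number below is an approximate, commonly cited value for a 5.0 x16 link (about 64 GB/s per direction, ~128 GB/s aggregate), not a figure from the announcement:

```python
nvlink_c2c_gb_s = 1800   # 1.8 TB/s, from the announcement
pcie5_x16_gb_s = 128     # approx. aggregate for PCIe 5.0 x16 (assumed)
print(round(nvlink_c2c_gb_s / pcie5_x16_gb_s, 1))  # -> 14.1x
```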

```mermaid
flowchart LR
    A[Vera CPU<br/>88x Olympus Cores<br/>1.2TB/s LPDDR5X] -->|NVLink-C2C<br/>1.8TB/s| B[NVIDIA GPU<br/>H200/B200 etc.]
    A --> C[Agent<br/>orchestration]
    A --> D[Data preprocessing<br/>and postprocessing]
    B --> E[Model inference<br/>and training]
```

Rack configuration and large-scale deployment

Vera is designed not only for standalone performance but also for rack-scale deployment.

The new Vera CPU rack integrates 256 liquid-cooled Vera CPUs and can support up to 22,500 simultaneous CPU environments. Deploying agent-based AI in data centers assumes scenarios where thousands of agents operate in parallel, and operation at this scale is a prerequisite.
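The 22,500 figure lines up almost exactly with one environment per core, which suggests (our inference, not an NVIDIA statement) that each Olympus core hosts one isolated CPU environment:

```python
cpus_per_rack = 256
cores_per_cpu = 88
print(cpus_per_rack * cores_per_cpu)  # -> 22528, i.e. roughly 22,500
```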

Adopters

At the time of the announcement, a wide range of companies had announced plans to adopt the chip.

| Category | Companies |
| --- | --- |
| Hyperscalers | Alibaba, Meta, Oracle Cloud Infrastructure, CoreWeave |
| System manufacturers | Dell, HPE, Lenovo, Supermicro |
| Cloud | ByteDance, Cloudflare, Crusoe |

Cloudflare’s inclusion on this list is notable. The company is actively building agent-based AI infrastructure such as edge AI inference and MCP servers, which aligns well with the Vera CPU’s intended use cases.

Main use cases

NVIDIA cites the following intended use cases for the Vera CPU:

- Coding assistants (Copilot-style backends)
- Agentic reasoning (execution of multi-step tasks)
- Reinforcement learning (orchestration of training loops)
- Data processing pipelines
- Multi-agent orchestration

Availability

Vera CPUs will be available through adoption partners in the second half of 2026.