NVIDIA announces “Vera” CPU for agent-based AI
At GTC 2026, NVIDIA announced “Vera,” a new CPU designed for agent-based AI. It delivers twice the power efficiency and 50% higher speed than conventional CPUs, and its architecture assumes tight coupling with the GPU.
Why do we need a dedicated CPU for AI?
While the GPU plays the leading role in AI inference workloads, the CPU plays supporting roles such as orchestration, data preprocessing, memory management, and agent state management. General-purpose CPUs are not optimized for these tasks and can easily become a bottleneck.
In agent-based AI in particular (systems that autonomously decompose and execute tasks), multiple model calls and external API interactions run continuously. Here, overall performance is determined by the throughput of the orchestration processing that keeps the GPU from sitting idle. It is in this context that CEO Jensen Huang described it as “a turning point in the era of AI reasoning and action.”
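As a hypothetical sketch of why orchestration throughput matters (our own illustration, not NVIDIA code; all durations are stand-ins): if CPU-side preprocessing of request N+1 overlaps GPU inference of request N, the GPU is never left waiting on orchestration work.

```python
import asyncio

# Illustrative stand-ins: each stage takes 50 ms of simulated work.

async def cpu_preprocess(request: str) -> str:
    await asyncio.sleep(0.05)              # tokenization, state lookup, etc.
    return f"prep({request})"

async def gpu_inference(prepped: str) -> str:
    await asyncio.sleep(0.05)              # model forward pass on the GPU
    return f"result({prepped})"

async def serve(requests: list[str]) -> list[str]:
    """Pipeline: while the GPU runs request N, the CPU prepares N+1."""
    results = []
    prepped = await cpu_preprocess(requests[0])
    for i in range(len(requests)):
        infer = asyncio.create_task(gpu_inference(prepped))
        if i + 1 < len(requests):
            prepped = await cpu_preprocess(requests[i + 1])
        results.append(await infer)
    return results
```

With this overlap, N requests take roughly (N + 1) stage-times instead of 2N; a slow CPU stage would break the overlap and stall the GPU, which is the bottleneck the article describes.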
Architecture details
Core configuration
The Vera CPU is equipped with 88 custom-designed “Olympus” cores. Each core can execute two tasks simultaneously using a technology called Spatial Multithreading: when one task's thread stalls on a memory access, the core immediately switches to the other task so that its execution units do not sit idle.
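The scheduling idea can be sketched as follows (a conceptual illustration of latency hiding, not NVIDIA's implementation): a core holds two task slots and hands execution to the other slot whenever the active task issues a memory load.

```python
# Each task is a list of ops: "compute" keeps the core, "load" issues a
# memory request and yields the core to the other slot while it waits.

def run_core(tasks: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Interleave two task slots; a "load" op stalls its task and switches."""
    queues = {name: list(ops) for name, ops in tasks.items()}
    order = list(queues)                  # the two slot names
    timeline, i = [], 0
    while any(queues.values()):
        name = order[i % 2]
        if not queues[name]:              # this slot has finished
            i += 1
            continue
        op = queues[name].pop(0)
        timeline.append((name, op))
        if op == "load":                  # memory stall: switch slots
            i += 1
    return timeline
```

The timeline shows task B's compute filling the cycles task A spends waiting on memory, which is the effect the core design aims for.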
Memory Bandwidth
A comparison with a conventional CPU:

| Specification | Conventional CPU | Vera CPU |
|---|---|---|
| Memory standard | DDR5 | LPDDR5X |
| Bandwidth | ~600 GB/s | 1.2 TB/s |
| Power consumption | Baseline | Approx. 50% reduction |
LPDDR5X is a low-power memory standard that originated in mobile devices, scaled up here to balance bandwidth with power efficiency. Agent processing frequently exchanges large amounts of context data between the CPU and GPU, so this bandwidth improvement translates directly into a performance difference.
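A back-of-envelope calculation makes the table concrete. The context size below is an illustrative assumption, not an NVIDIA figure; the bandwidths are the ones from the table above, and latency and protocol overhead are ignored.

```python
def transfer_ms(gigabytes: float, bandwidth_gb_s: float) -> float:
    """Ideal transfer time in milliseconds at a given bandwidth."""
    return gigabytes / bandwidth_gb_s * 1000

context_gb = 30  # assumed size of a large agent-state / KV-cache snapshot

conventional_ms = transfer_ms(context_gb, 600)   # ~600 GB/s DDR5  -> 50 ms
vera_ms = transfer_ms(context_gb, 1200)          # 1.2 TB/s LPDDR5X -> 25 ms
```

Per exchange the saving is tens of milliseconds; since agent loops perform such exchanges continuously, the doubled bandwidth compounds across every step.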
GPU connection
NVLink-C2C is used to connect to the GPU, ensuring a bandwidth of 1.8TB/s. Unlike connections via PCIe, data transfer between the CPU and GPU is less likely to become a bottleneck.
```mermaid
flowchart LR
    A[Vera CPU<br/>88x Olympus Cores<br/>1.2TB/s LPDDR5X] -->|NVLink-C2C<br/>1.8TB/s| B[NVIDIA GPU<br/>e.g. H200/B200]
    A --> C[Agent<br/>orchestration]
    A --> D[Data pre-/<br/>post-processing]
    B --> E[Model inference<br/>and training]
```
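To see why the PCIe comparison matters, here is a rough calculation. The ~64 GB/s figure is a commonly cited approximate per-direction rate for a PCIe 5.0 x16 link (an assumption on our part, not from the article); the 1.8 TB/s figure is from the text above.

```python
PCIE5_X16_GB_S = 64       # approximate PCIe 5.0 x16, per direction (assumed)
NVLINK_C2C_GB_S = 1800    # NVLink-C2C bandwidth from the article

def transfer_ms(gigabytes: float, bandwidth_gb_s: float) -> float:
    """Ideal transfer time in milliseconds, ignoring latency and overhead."""
    return gigabytes / bandwidth_gb_s * 1000

payload_gb = 9.0  # illustrative CPU-to-GPU context handoff

pcie_ms = transfer_ms(payload_gb, PCIE5_X16_GB_S)     # ~140.6 ms
nvlink_ms = transfer_ms(payload_gb, NVLINK_C2C_GB_S)  # 5.0 ms
```

Under these assumptions NVLink-C2C is roughly 28x faster than PCIe 5.0 x16, which is why CPU-GPU transfers are far less likely to bottleneck.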
Rack configuration and large-scale deployment
Vera is designed not only for standalone performance but also for rack-scale deployment.
The new Vera CPU rack integrates 256 liquid-cooled Vera CPUs and can support up to 22,500 simultaneous CPU environments. The deployment of agent-based AI in data centers assumes a scenario where thousands of agents operate in parallel, and operation at this scale is a prerequisite.
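The rack-scale figures are internally consistent, as a quick check shows: 256 CPUs with 88 cores each gives 22,528 cores, matching the "up to 22,500 simultaneous environments" claim at roughly one environment per physical core (our inference from the stated numbers).

```python
cpus_per_rack = 256
cores_per_cpu = 88

total_cores = cpus_per_rack * cores_per_cpu    # 22,528 cores per rack
envs_per_cpu = 22_500 / cpus_per_rack          # ~87.9 environments per CPU
```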
Adoption partners
At the time of the announcement, a wide range of companies had announced plans to adopt it.
| Category | Companies |
|---|---|
| Hyperscalers | Alibaba, Meta, Oracle Cloud Infrastructure, CoreWeave |
| System manufacturers | Dell, HPE, Lenovo, Supermicro |
| Cloud | ByteDance, Cloudflare, Crusoe |
Cloudflare’s inclusion on this list is highly notable. The company is actively building agent-type AI infrastructure such as edge AI inference and MCP servers, which are compatible with Vera CPU use cases.
Main use cases
NVIDIA's intended use cases for the Vera CPU:
- Coding assistants (Copilot-style backends)
- Agentic reasoning (execution of multi-step tasks)
- Reinforcement learning (orchestration of training loops)
- Data processing pipelines
- Multi-agent orchestration
Availability
Vera CPUs will be available through adoption partners in the second half of 2026.