
Cloudflare Agents Week 2026: Project Think, Browser Run Overhaul, and Workflows v2

Ikesan

Three announcements from Day 3 (April 15) of Cloudflare Agents Week 2026.

| Announcement | Summary |
| --- | --- |
| Project Think | An agent framework layer on top of the Agents SDK |
| Browser Run | Browser Rendering renamed, with major AI agent features added |
| Workflows v2 | Control plane redesign, 11x concurrency |

For Day 1 and Day 2 announcements, see Sandboxes GA, Durable Object Facets, and the Unified CLI and Mesh and Enterprise MCP Reference Architecture.

Project Think

Cloudflare announced “Project Think,” a next-generation Agents SDK preview. Where the existing agents package was a collection of lightweight primitives, Project Think aims to be a “batteries-included” platform that handles thinking, acting, and persistence end-to-end.

Three Waves of Agents

The announcement organizes the evolution of AI agents into three waves.

| Wave | Generation | Characteristics |
| --- | --- | --- |
| 1 | Chatbots | Stateless, reactive |
| 2 | Coding agents | Stateful, tool-using. Claude Code, Cursor, etc. |
| 3 | Agents as infrastructure | Fault-tolerant, distributed, serverless |

The proliferation of coding agents has created a design gap between “chatbots that call an LLM once” and “agents that run autonomously for tens of minutes.” A single LLM call takes 30 seconds; a multi-turn agent loop runs for minutes to tens of minutes. What happens if the process crashes in the middle? Where does state get saved? How do you run multiple subtasks in parallel? Project Think is a framework for these Wave 3 challenges.

Fibers (Fault-Tolerant Execution)

A mechanism for “functions that can resume from where they left off after a crash.” Calling runFiber() records the function’s invocation info in SQLite before execution begins. During execution, ctx.stash() saves checkpoints, and onFiberRecovered fires on restart after a crash.

async startResearch(topic: string) {
  void this.runFiber("research", async (ctx) => {
    const findings = [];
    for (let i = 0; i < 10; i++) {
      const result = await this.callLLM(`Research step ${i}: ${topic}`);
      findings.push(result);
      ctx.stash({ findings, step: i, topic }); // Save checkpoint
      this.broadcast({ type: "progress", step: i });
    }
    return { findings };
  });
}

async onFiberRecovered(ctx) {
  if (ctx.name === "research" && ctx.snapshot) {
    const { topic } = ctx.snapshot;
    await this.startResearch(topic); // Resume with previous topic
  }
}

While a Fiber is running, the SDK automatically keeps the agent alive (equivalent to keepAlive()). For long-running tasks like video generation or CI pipelines, an agent can persist a job ID, hibernate, and wake up on callback.
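The crash/resume mechanics can be illustrated without the SDK. The sketch below is a framework-free simulation of the checkpoint idea — `store` stands in for the SQLite-backed stash, and all names are illustrative rather than Agents SDK APIs:

```typescript
// Minimal simulation of Fiber-style checkpoint/resume.
// `store` stands in for the SQLite stash; names are illustrative.
type Snapshot = { findings: string[]; step: number };

function runResearch(
  store: Map<string, Snapshot>,
  doStep: (i: number) => string,
  crashAt?: number
): string[] {
  // Resume from the last checkpoint if one exists.
  const snap = store.get("research") ?? { findings: [], step: 0 };
  for (let i = snap.step; i < 5; i++) {
    if (i === crashAt) throw new Error("simulated crash");
    snap.findings.push(doStep(i));
    snap.step = i + 1;
    // Stash a checkpoint after every completed step.
    store.set("research", { step: snap.step, findings: [...snap.findings] });
  }
  return snap.findings;
}
```

A first run that "crashes" at step 3 leaves steps 0–2 checkpointed; a rerun continues from step 3 instead of re-executing the whole loop — the property `ctx.stash()` and `onFiberRecovered` provide in the real SDK.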

Sub-Agents

A mechanism for co-locating child Durable Objects with the parent agent, each with independent SQLite and typed RPC. Built on top of Durable Object Facets.

export class Orchestrator extends Agent {
  async handleTask(task: string) {
    const researcher = await this.subAgent(ResearchAgent, "research");
    const reviewer = await this.subAgent(ReviewAgent, "review");

    const [research, review] = await Promise.all([
      researcher.search(task),
      reviewer.analyze(task)
    ]);

    return this.synthesize(research, review);
  }
}

Each sub-agent has its own conversation tree and tool set. RPC supports streaming callbacks, so child agent output can be streamed to the parent in real time.

Session API (Conversation Persistence)

An experimental API that stores conversation history as a tree structure. Messages have parent-child relationships (parent_id), enabling the following non-destructive operations.

| Operation | Description |
| --- | --- |
| Forking | Try a different approach from mid-conversation. The original conversation remains intact |
| Compaction | Replace old messages with summaries (without deleting them) to save context |
| Full-text search | SQLite FTS5 enables searching across the entire conversation history |

import { Session, SessionManager } from "agents/experimental/memory/session";

export class MyAgent extends Agent {
  sessions = SessionManager.create(this);

  async onStart() {
    const session = this.sessions.create("main");
    // messageId: the ID of an earlier message to branch from
    const forked = this.sessions.fork(session.id, messageId, "alternative-approach");
  }
}

The search_context tool powered by FTS5 lets an agent search its own past conversations.
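The tree model behind forking is simple to sketch. The following is an illustrative data structure — not the actual Session API — showing how `parent_id` links make forks non-destructive:

```typescript
// Illustrative message tree with parent links and non-destructive forking.
interface Msg { id: number; parentId: number | null; text: string }

class MessageTree {
  private msgs = new Map<number, Msg>();
  private nextId = 1;

  // Appending under any earlier message creates a new branch;
  // nothing is mutated or deleted.
  append(parentId: number | null, text: string): number {
    const id = this.nextId++;
    this.msgs.set(id, { id, parentId, text });
    return id;
  }

  // Walk parent links to rebuild one branch of the conversation.
  branch(leafId: number): string[] {
    const out: string[] = [];
    for (let m = this.msgs.get(leafId); m; m = m.parentId ? this.msgs.get(m.parentId) : undefined) {
      out.unshift(m.text);
    }
    return out;
  }
}
```

Forking is just appending a second child under the same parent: both leaves reconstruct their own branch, and the original stays intact.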

Sandbox Code Execution (Execution Ladder)

Code execution capabilities available to agents are defined in five tiers.

| Tier | Name | Capabilities | Technology |
| --- | --- | --- | --- |
| 0 | Workspace | File read/write, grep, diff | @cloudflare/shell + SQLite + R2 |
| 1 | Dynamic Worker | Sandboxed JS execution (no network) | V8 isolate, @cloudflare/codemode |
| 2 | npm Runtime | npm packages resolved at runtime | @cloudflare/worker-bundler + esbuild |
| 3 | Headless Browser | Navigate, click, screenshot | Cloudflare Browser Run |
| 4 | Sandbox | Full OS (git, compilers, test runners) | Cloudflare Sandbox |

The design principle is explicit: “Tier 0 alone should be practically useful, and each tier adds capabilities incrementally.” Dynamic Worker (Tier 1) V8 isolates start roughly 100x faster than containers.

With traditional tool calling, an agent calls one tool per endpoint sequentially. Writing a complete program and running it in a Dynamic Worker instead can drastically reduce token consumption. Cloudflare claims “99.9% reduction.”

Self-Authored Extensions

A mechanism where agents write their own tools in TypeScript, declare permissions in a JSON manifest, and bundle them into a Dynamic Worker.

{
  "name": "github",
  "description": "GitHub integration: PRs, issues, repos",
  "tools": ["create_pr", "list_issues", "review_pr"],
  "permissions": {
    "network": ["api.github.com"],
    "workspace": "read-write"
  }
}

The default is globalOutbound: null (no network connectivity), with permissions granted explicitly per resource. Cloudflare describes this as “structural security through a capability model rather than behavioral constraints.” Extensions persist in Durable Object storage and survive hibernation.
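The deny-by-default network model reduces to an allow-list check against the manifest. The helper below is an illustrative sketch of the idea, not the SDK's enforcement code:

```typescript
// Capability-model sketch: outbound network is denied unless the
// extension's manifest explicitly lists the host. Illustrative only.
interface Manifest { permissions: { network?: string[] } }

function canFetch(manifest: Manifest, url: string): boolean {
  const allowed = manifest.permissions.network ?? []; // default: no network
  const host = new URL(url).hostname;
  return allowed.includes(host);
}
```

With the github manifest above, a request to `api.github.com` passes while any other host is structurally unreachable — no prompt-level "please don't" required.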

The Think Base Class

The centerpiece of Project Think is the Think class from the @cloudflare/think package. It’s an opinionated harness that manages the entire chat lifecycle — streaming, persistence, tool execution, and stream resumption. The minimal implementation only requires overriding getModel().

import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";

export class MyAgent extends Think<Env> {
  getModel() {
    return createWorkersAI({ binding: this.env.AI })(
      "@cf/moonshotai/kimi-k2.5"
    );
  }
}

Overridable methods and hooks:

| Method | Purpose |
| --- | --- |
| getModel() | Specify the model to use |
| getSystemPrompt() | Define the system prompt |
| getTools() | Declare available tools |
| maxSteps | Set the agent loop limit |
| configureSession() | Configure memory, compaction, and search |

Lifecycle hook execution order:

beforeTurn()
  → streamText()
    → beforeToolCall()
    → afterToolCall()
  → onStepFinish()
→ onChatResponse()
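The ordering can be exercised with a minimal driver. This is not Think's internals — just a sketch that calls hypothetical hook callbacks in the documented sequence:

```typescript
// Illustrative driver for the documented hook order (not Think internals).
type Hooks = Partial<Record<
  "beforeTurn" | "beforeToolCall" | "afterToolCall" | "onStepFinish" | "onChatResponse",
  () => void>>;

function runTurn(hooks: Hooks, toolCalls: number): void {
  hooks.beforeTurn?.();
  // streamText() drives the tool loop: one before/after pair per tool call.
  for (let i = 0; i < toolCalls; i++) {
    hooks.beforeToolCall?.();
    hooks.afterToolCall?.();
  }
  hooks.onStepFinish?.();
  hooks.onChatResponse?.();
}
```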

Context Blocks (Persistent Memory)

Structured sections within the system prompt that the model can read and write across hibernation cycles. A token occupancy indicator (MEMORY (42%, 462/1100 tokens)) is displayed, and the model actively updates content via the set_context tool.

configureSession(session: Session) {
  return session
    .withContext("soul", {
      provider: {
        get: async () => "You are a helpful coding assistant."
      }
    })
    .withContext("memory", {
      description: "Important facts learned during conversation.",
      maxTokens: 2000
    })
    .withCachedPrompt();
}
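The occupancy readout is straightforward arithmetic. A small helper — ours, not the SDK's — reproduces the indicator format:

```typescript
// Derive a readout like "MEMORY (42%, 462/1100 tokens)".
// The format mirrors the article; the helper itself is illustrative.
function occupancy(label: string, used: number, max: number): string {
  const pct = Math.round((used / max) * 100);
  return `${label} (${pct}%, ${used}/${max} tokens)`;
}
```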

Cost Model Shift

Using Durable Objects as the unit of agent identity and persistence changes the scaling cost structure.

| Metric | VM/Container | Durable Objects |
| --- | --- | --- |
| Idle cost | Always fully charged | Zero (hibernation) |
| 10,000 agents at 1% utilization | 10,000 instances | ~100 instances |

Cloudflare states that “one agent per task” and “one agent per customer” models become cost-realistic. When migrating from existing AIChatAgent to Think, no client-side code changes are required.
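The "~100 instances" figure is just expected concurrency: with hibernation you provision for active agents, not for the total agent count. A one-line sketch of the arithmetic:

```typescript
// With hibernation, cost tracks active agents, not total agents:
// 10,000 agents at 1% utilization ≈ 100 concurrently active instances.
function expectedActive(agents: number, utilization: number): number {
  return Math.round(agents * utilization);
}
```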

Installation

npm install @cloudflare/think agents ai @cloudflare/shell zod workers-ai-provider
npx wrangler deploy

The API is stable but considered an evolving preview. Thousands of production agents are already running on the existing Agents SDK.

Browser Run

“Browser Rendering” has been renamed to “Browser Run,” with a batch of features released under the assumption that AI agents will be the primary browser users. The rename is more than cosmetic: Live View, Human in the Loop, direct CDP endpoint exposure, WebMCP support, session recording, and a 4x increase in concurrency all shipped at once.

For existing Browser Rendering users, existing APIs remain unchanged — new features are additive.

Live View

A feature for observing an agent’s running browser session in real time. Page content, DOM structure, console logs, and network requests are all visible.

The difficulty of debugging “why did the agent fail?” is a long-standing problem in browser automation. Headless browsers like Lightpanda don’t even allow visual inspection. Live View enables real-time diagnostics while the browser session is still alive.

Human in the Loop

A workflow where the agent hands control to a human when it encounters a login page or unexpected form, and resumes after the human completes the interaction. A practical workaround for scenarios where the agent can’t hold credentials, such as MFA or internal authentication.

A future update will add the ability for agents to send a “request assistance” signal. For now, the handoff is done manually through Live View.

Direct CDP Endpoint Exposure

Chrome DevTools Protocol (CDP) is a low-level protocol for controlling browsers externally. Puppeteer and Playwright are high-level wrappers built on top of it. Chrome DevTools MCP also sits on CDP.

Browser Run’s direct CDP endpoint exposure means existing CDP scripts can run on Cloudflare with a one-line change.

// Before (self-hosted Chrome)
const browser = await puppeteer.connect({
  browserWSEndpoint: 'ws://localhost:9222/devtools/browser'
});

// After (Browser Run)
const browser = await puppeteer.connect({
  browserWSEndpoint: 'wss://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/browser-rendering/devtools/browser'
});

Eliminating the need to manage Chromium version updates and security patches is another significant benefit.

WebMCP Support

WebMCP is a new browser API proposed by Google’s Chrome team that allows websites to structurally expose tools for AI agents.

Instead of taking screenshots and asking a Vision model to “find the button,” an agent can retrieve the list of actions a site provides and call them directly.

// Get the list of tools the site exposes
const tools = await navigator.modelContextTesting.listTools();

// Execute directly
await navigator.modelContextTesting.executeTool("searchFlights", {
  origin: "NRT", destination: "SFO", date: "2026-05-01"
});

To use WebMCP with Browser Run, add lab=true to the /devtools/browser request to use the experimental pool (Chrome beta runtime). Once WebMCP gains adoption, browser automation can shift from the fragile “parse pages and click buttons” approach to “call site-provided APIs with type safety.”
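The article only says to "add lab=true to the /devtools/browser request"; attaching it as a query parameter on the WebSocket endpoint, as below, is our assumption about the exact wiring:

```typescript
// Build the Browser Run devtools endpoint, optionally targeting the
// experimental (Chrome beta) pool. Passing lab=true as a query parameter
// is an assumption; the article does not specify the exact mechanism.
function devtoolsEndpoint(accountId: string, lab = false): string {
  const url = new URL(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/browser-rendering/devtools/browser`
  );
  if (lab) url.searchParams.set("lab", "true");
  return url.toString().replace(/^https:/, "wss:");
}
```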

Session Recording and Scaling

A feature to record entire browser sessions as JSON was also added. DOM changes, keyboard/mouse events, and page navigations are all captured and can be replayed later with rrweb-player. Useful for both debugging and compliance.

Scaling also changed significantly.

| Metric | Before | After |
| --- | --- | --- |
| Concurrent browsers (default limit) | 30 | 120 (4x) |
| Quick Actions rate limit | Not specified | 10 requests/sec |
| Cold start | Present | Global access to warm instances |

A /crawl endpoint is also available, returning an entire site’s content as HTML, Markdown, or structured JSON from a single API call. Available on both Workers Free and Paid plans.

Workflows v2

Workflows was originally designed for low-frequency flows like user registration or order processing. The rise of agents broke that assumption. A single agent session can spawn dozens of workflow instances, and multiple agents running in parallel can generate thousands of instances in seconds. A shift from human speed to machine speed.

V1 Bottleneck

In V1, a single Account Durable Object (DO) managed all workflow instance metadata for the entire account. DOs are single-threaded, and when thousands of requests concentrate in a few seconds in high-throughput environments, it becomes a bottleneck — the classic hot spot problem.

Additionally, instances could be queued without verifying that the Engine DO actually existed, leading to inconsistent states.

SousChef and Gatekeeper

V2 introduces two new components.

SousChef is a DO that acts as a lieutenant to the Account DO. It handles per-workflow instance metadata and lifecycle management, distributing the request load away from the Account DO. As a side effect, high load on one workflow no longer affects others.

Gatekeeper is a DO that distributes concurrency slots across all SousChefs. It operates in batch cycles every second, with just one JSRPC call per cycle. Slot allocation follows max-min fairness, preventing any single workflow from monopolizing slots.
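Max-min fairness is the classic water-filling allocation: small demands are satisfied in full, and the remaining capacity is split evenly among workflows that want more. An illustrative implementation (not Gatekeeper's actual code):

```typescript
// Water-filling max-min fair allocation: repeatedly give each pending
// workflow an equal share of remaining capacity; demands below the share
// are fully satisfied, and no workflow can monopolize slots.
function maxMinFair(capacity: number, demands: number[]): number[] {
  const alloc = demands.map(() => 0);
  let remaining = capacity;
  let pending = demands.map((_, i) => i);
  while (remaining > 0 && pending.length > 0) {
    const share = remaining / pending.length;
    const satisfied = pending.filter((i) => demands[i] - alloc[i] <= share);
    if (satisfied.length === 0) {
      // Everyone wants more than an equal share: split evenly and stop.
      for (const i of pending) alloc[i] += share;
      remaining = 0;
    } else {
      for (const i of satisfied) {
        remaining -= demands[i] - alloc[i];
        alloc[i] = demands[i];
      }
      pending = pending.filter((i) => alloc[i] < demands[i]);
    }
  }
  return alloc;
}
```

For example, 10 slots across demands of [2, 8, 8] yields [2, 4, 4]: the small workflow gets everything it asked for, and the two heavy ones split the rest evenly.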

Instance Creation Flow

sequenceDiagram
    participant C as Client
    participant CP as Control Plane
    participant E as Engine DO
    participant SC as SousChef
    participant GK as Gatekeeper

    C->>CP: Instance creation request
    CP->>CP: Check control plane version
    CP->>CP: Check cached workflow info
    CP->>E: Save minimal metadata
    E->>E: Set Alarm
    E-->>SC: Background metadata sync
    SC-->>GK: Report concurrency slots
    CP-->>C: Return instance ID

The hot path is narrowed to a direct write to the Engine DO, with SousChef and Gatekeeper synchronization happening in the background. Backed by DO alarms, background tasks that fail are retried with at-least-once semantics. The design lightens the hot path without sacrificing reliability.
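The at-least-once behavior reduces to "retry until acknowledged". A framework-free sketch of the idea (in Workflows the re-trigger is a DO alarm; here it is a plain loop):

```typescript
// At-least-once background sync: keep retrying until the sync reports
// success, so a transient failure delays the metadata sync rather than
// losing it. Returns the number of attempts taken.
function syncWithRetry(attempt: () => boolean, maxTries: number): number {
  for (let tries = 1; tries <= maxTries; tries++) {
    if (attempt()) return tries;
  }
  throw new Error("gave up after maxTries attempts");
}
```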

Scaling Numbers

| Metric | V1 | V2 |
| --- | --- | --- |
| Concurrent instances | 4,500 | 50,000 |
| Instance creation rate | 100/10 sec | 300/sec (per account) |
| Queued instances | 1M/workflow | 2M/workflow |

Concurrency increased roughly 11x, and creation rate roughly 30x in burst terms.

Live Migration

To migrate existing users with zero impact, V1's Account DOs were repurposed as SousChefs. By keeping the SQL table structure identical, large-scale data migration was avoided — only code paths needed switching.

  1. Ship a release that makes the legacy Account DO ("Account Old") SousChef-compatible
  2. Enable V2 per account in stages
  3. Route new instances to new SousChefs, and auto-convert existing Account DOs
  4. Sunset Account Old after a retention period

Millions of instances were migrated with zero downtime. Cloudflare’s blog described it as “changing tires on a moving car.”

Control Plane Pattern

This redesign is also an instance of Cloudflare applying the “control plane / data plane separation pattern” — a pattern they’ve documented as an official reference architecture — to Workflows itself.

If Durable Object Facets represent data plane evolution (“assign independent SQLite as a DO for each piece of AI-generated code”), Workflows v2 tackles the control plane scaling problem head-on.


Across three days of announcements, Cloudflare redesigned the entire stack around the assumption that agents will run as always-on infrastructure. Days 1 and 2 covered the execution environment and networking; Day 3 was about the framework and scaling that sits on top.