OpenAI Codex Moves from Message-Based Credits to Token-Based Pricing
On April 2, 2026, OpenAI revised the pricing structure for Codex. For ChatGPT Business and Enterprise customers, the old “N credits per message” system was retired in favor of credit billing tied to API token consumption. Details are available on the official rate card.
At the same time, ChatGPT Business seat pricing was dropped from $25 to $20, and a new Codex-only seat was introduced. The Codex-only seat is fully usage-based with no rate limits — you pay credits for exactly what you use. New workspaces also get promotional credits of $100 per user, up to $500 (expires April 30).
The Old System Was Structurally Identical to GitHub Copilot’s “Premium Requests”
Looking back at Codex’s old pricing, it was nearly identical in structure to GitHub Copilot’s current billing model.
Copilot meters AI model usage through “premium requests.” Each plan has a monthly cap, and each model has a “multiplier.” More powerful models consume more of your quota per request.
| Copilot Plan | Monthly Price | Premium Requests/Month |
|---|---|---|
| Free | $0 | 50 |
| Pro | $10 | 300 |
| Pro+ | $39 | 1,500 |
| Business | $19/user | 300/user |
| Enterprise | $39/user | 1,000/user |
Model-specific multipliers look like this:
| Model | Multiplier |
|---|---|
| GPT-4.1, GPT-4o, GPT-5 mini | 0 (free, unlimited) |
| Claude Haiku 4.5, Gemini 3 Flash, GPT-5.4 mini | 0.33 |
| Claude Sonnet 4.6, Gemini 3.1 Pro, GPT-5.4 | 1 |
| Claude Opus 4.6 | 3 |
| Claude Opus 4.6 (Fast Mode) | 30 |
With Copilot Pro’s 300-request monthly quota, you can make 300 calls with Claude Sonnet 4.6 (1x), 100 with Claude Opus 4.6 (3x), or unlimited with GPT-4.1 (0x). Going over the cap incurs $0.04 per additional request.
graph TD
A[Subscription<br/>Fixed monthly fee] --> B[Premium Requests<br/>N requests/month]
B --> C[GPT-4.1<br/>0x = free]
B --> D[Claude Sonnet<br/>1x]
B --> E[Claude Opus<br/>3x]
B --> F[Over quota<br/>\$0.04/request]
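The quota arithmetic above can be sketched in a few lines. The plan price, quota, multipliers, and $0.04 overage rate are the figures from the tables above; the helper function itself is illustrative:

```python
# Sketch of Copilot-style premium-request accounting.
# Multipliers and overage rate are taken from the tables above.
MULTIPLIERS = {
    "GPT-4.1": 0.0,                       # 0x: free, unlimited
    "Claude Haiku 4.5": 0.33,
    "Claude Sonnet 4.6": 1.0,
    "Claude Opus 4.6": 3.0,
    "Claude Opus 4.6 (Fast Mode)": 30.0,
}

OVERAGE_PER_REQUEST = 0.04  # dollars per request beyond the quota

def monthly_cost(plan_price, quota, requests):
    """requests: list of (model, count). Returns (requests used, total $)."""
    used = sum(MULTIPLIERS[m] * n for m, n in requests)
    over = max(0.0, used - quota)
    return used, plan_price + over * OVERAGE_PER_REQUEST

# Pro plan: $10/month, 300 premium requests
used, total = monthly_cost(10, 300, [("Claude Sonnet 4.6", 200),
                                     ("Claude Opus 4.6", 50)])
print(used, total)  # 350 requests used -> 50 over quota -> $10 + $2 = $12
```

Note how 50 Opus calls burn as much quota as 150 Sonnet calls; the multiplier, not the call count, drives consumption.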
Codex’s old system followed the same pattern — a credit quota bundled with the monthly subscription, with consumption varying by model and task type.
| Codex (Old) | Local Task | Cloud Task | Code Review |
|---|---|---|---|
| GPT-5.4 | ~7 credits | ~34 credits | ~34 credits |
| GPT-5.3-Codex | ~5 credits | ~25 credits | ~25 credits |
| GPT-5.1-Codex-Mini | ~1 credit | N/A | N/A |
The three-layer structure of “fixed monthly fee + model-based usage quota + overage charges” was exactly the same for both Copilot and old Codex. What Copilot calls “premium requests × multiplier,” Codex called “messages × estimated credits.”
Why Per-Message Billing Broke Down
The system broke down for Codex because coding task sizes vary too wildly.
A 10-line bug fix and a full-repo refactor consume orders-of-magnitude different token counts. Under message-based billing, both counted as “1 message” — users running lots of small tasks overpaid, while users throwing large tasks at Codex underpaid. The cloud task multiplier (estimated at 5–7x the local rate) was also approximate, making it hard to predict actual costs.
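A back-of-the-envelope sketch of that distortion, using the old ~5-credit local rate for GPT-5.3-Codex from the table below; the token counts are invented for illustration:

```python
# Two tasks that each counted as "1 message" under the old system,
# with very different token footprints (token counts are illustrative).
FLAT_CREDITS_PER_MESSAGE = 5  # old GPT-5.3-Codex local-task rate

tasks = {
    "10-line bug fix":    {"input": 3_000,   "output": 500},
    "full-repo refactor": {"input": 400_000, "output": 60_000},
}

for name, t in tasks.items():
    total_tokens = t["input"] + t["output"]
    # effective price per 1M tokens under flat per-message billing
    per_million = FLAT_CREDITS_PER_MESSAGE / total_tokens * 1_000_000
    print(f"{name}: {total_tokens:,} tokens -> "
          f"{per_million:.1f} credits per 1M tokens")
```

With these (made-up) numbers, the small task pays an effective rate over 100x higher per token than the large one, which is the cross-subsidy the article describes.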
GitHub Copilot’s premium requests have the same structural weakness. In Copilot’s agent mode (Coding Agent), one session counts as “1 premium request × multiplier,” but tool calls the agent makes autonomously within that session aren’t counted. A session that finishes in 10 steps costs the same as one that takes 100 steps. As usage grows and diversifies, this granularity gap distorts the cost structure.
Once Codex hit 2 million weekly users, OpenAI could no longer ignore the distortion.
The New Rate Card
Under the new system, credit consumption is defined per million tokens for each model.
| Model | Input | Cached Input | Output |
|---|---|---|---|
| GPT-5.4 | 62.50 | 6.250 | 375 |
| GPT-5.3-Codex | 43.75 | 4.375 | 350 |
| GPT-5.4-Mini | 18.75 | 1.875 | 113 |
| GPT-5.1-Codex-mini | 6.25 | 0.625 | 50 |
Units are credits per 1M tokens. Fast Mode doubles the credit consumption.
The cached input discount stands out: it's 1/10 the regular input rate. Codex repeatedly sends context for the same repository, so cache hit rates tend to be high. Prompt caching reuses, server-side, the portions of the input (system prompt, repo context) that overlap with the previous request, and it kicks in more the longer you work in the same project. When the discount applies, costs can drop significantly compared to the message-based era.
On the other hand, GPT-5.4’s output tokens cost 375 credits per million, which is steep. Tasks that demand heavy code generation will see costs spike. It’s a design where your mileage varies by usage pattern.
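A minimal sketch of per-turn credit cost under the new rate card, using the rates from the table above and the stated 2x Fast Mode multiplier; the turn sizes and cache-hit figures are illustrative:

```python
# Credit cost of one Codex turn under the new rate card
# (credits per 1M tokens, values from the table above).
RATES = {  # (input, cached input, output)
    "GPT-5.4":            (62.50, 6.250, 375),
    "GPT-5.3-Codex":      (43.75, 4.375, 350),
    "GPT-5.4-Mini":       (18.75, 1.875, 113),
    "GPT-5.1-Codex-mini": (6.25,  0.625, 50),
}

def turn_credits(model, input_toks, cached_toks, output_toks, fast=False):
    """cached_toks is the cached portion of input_toks."""
    inp, cached, out = RATES[model]
    fresh = input_toks - cached_toks
    credits = (fresh * inp + cached_toks * cached + output_toks * out) / 1e6
    return credits * (2 if fast else 1)  # Fast Mode doubles consumption

# 200k-token context with a 90% cache hit, 8k tokens of generated code
cold = turn_credits("GPT-5.4", 200_000, 0, 8_000)
warm = turn_credits("GPT-5.4", 200_000, 180_000, 8_000)
print(f"{cold:.2f} vs {warm:.2f} credits")  # 15.50 vs 5.38 credits
```

In this sketch the 90% cache hit cuts the turn from 15.50 to 5.38 credits, which is why cache behavior matters more than the headline input rate for long sessions on one repo.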
Comparison with Direct API Usage
Codex can be used via ChatGPT authentication or directly with an API key. With an API key, billing is in dollars per token rather than credits.
| Model | API Input ($/1M) | API Cached Input ($/1M) | API Output ($/1M) |
|---|---|---|---|
| gpt-5.1-codex-mini | $0.25 | $0.025 | $2.00 |
| gpt-5.3-codex | $1.75 | $0.175 | $14.00 |
| gpt-5.4 | $2.50 | $0.25 | $15.00 |
API key usage is simpler in terms of rates, but you can’t use the quota bundled with ChatGPT plans (Plus $20, Pro $200). For teams, going through Business makes sense for centralized management, but individual developers with predictable usage might prefer direct API access.
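The same kind of arithmetic works for direct API billing, using the dollar rates from the table above; the monthly workload figures are invented for illustration:

```python
# Dollar cost of a workload under direct API billing
# ($ per 1M tokens, values from the table above).
API_RATES = {  # (input, cached input, output)
    "gpt-5.1-codex-mini": (0.25, 0.025, 2.00),
    "gpt-5.3-codex":      (1.75, 0.175, 14.00),
    "gpt-5.4":            (2.50, 0.250, 15.00),
}

def api_cost(model, input_toks, cached_toks, output_toks):
    inp, cached, out = API_RATES[model]
    fresh = input_toks - cached_toks
    return (fresh * inp + cached_toks * cached + output_toks * out) / 1e6

# A hypothetical month of heavy use: 50M input tokens (80% cached), 5M output
cost = api_cost("gpt-5.3-codex", 50_000_000, 40_000_000, 5_000_000)
print(f"${cost:.2f}")  # $94.50
```

Running numbers like these against your own usage is how an individual developer would decide between a ChatGPT plan's bundled quota and a pay-as-you-go API key.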
Per-Plan Quotas (Legacy)
Plus, Pro, and existing Enterprise/Edu customers remain on the legacy message-based system. No timeline has been announced for migrating them to the new rate card.
| Plan | GPT-5.4 Messages | GPT-5.3-Codex Messages | Window |
|---|---|---|---|
| Plus | 33–168 | 45–225 | 5 hours |
| Pro | 223–1,120 | 300–1,500 | 5 hours (6x Plus) |
| Business | 15–60 | 20–90 | 5 hours |
The ranges exist because consumption varies by task type (local/cloud). Enterprise/Edu have no fixed rate limits.
Business shows the lowest message counts because those figures are a legacy snapshot: Business has already moved to token-based billing under the new system, and the legacy quota doesn't apply to new customers.
Codex and Copilot: Diverging Billing Models
OpenAI and GitHub (Microsoft) have a tight partnership. Copilot was originally built on the first-generation OpenAI Codex model (a GPT-3-family code generation model, distinct from the current coding agent). GPT-5-family models still power Copilot’s backend. Both products had adopted the same “message/request-based credit” structure for billing.
That changed with this pricing update.
graph TD
A[Shared Origin<br/>Message-based credits] --> B[OpenAI Codex<br/>April 2026 revision]
A --> C[GitHub Copilot<br/>As of April 2026]
B --> D[Per-token billing<br/>Proportional to resources consumed]
C --> E[Premium requests retained<br/>Monthly quota + multiplier]
The divergence reflects differences in the products themselves.
Codex operates as an autonomous coding agent in the cloud, working across entire repositories. A single session might consume tens of thousands of tokens or just a few hundred. This variance is what made per-message billing untenable.
Copilot’s core use case is inline code completion and chat in the IDE. Token consumption per interaction doesn’t swing as wildly as Codex. The premium request system works well enough, and “300 requests/month” or “1,500 requests/month” gives users a clearer picture of their costs.
That said, Copilot keeps adding agent features — Coding Agent, parallel agents via /fleet, and more. As agents that run autonomously for extended periods become more common, Copilot will face the same problem. A move to token-based billing is a real possibility.
Three-Way Billing Comparison
A few days later on April 4, Anthropic took the opposite approach — locking third-party harnesses out of Claude Code subscriptions. They restricted subscription-based access and pushed third parties toward API usage-based billing. Lining up all three reveals distinct approaches to billing granularity and direction.
| | OpenAI Codex | GitHub Copilot | Anthropic Claude Code |
|---|---|---|---|
| Team | Per-token billing | Premium requests (monthly quota + $0.04 overage) | Subscription flat rate + usage-based for third parties only |
| Individual | Message quota (legacy) | Premium request quota | Subscription flat rate |
| Billing Granularity | Per token | Request × multiplier | Subscription or API token |
| Third Parties | Welcomed | Via Copilot Extensions | Outside subscription, API usage-based |
| Seat Price Trend | $25 → $20 (reduced) | Unchanged | Unchanged (Max $200) |
OpenAI has the finest granularity at the token level, Copilot sits in the middle with request × multiplier, and Anthropic has the coarsest with its subscription-plus-API dual structure. Finer granularity means more cost transparency, but makes monthly bills harder to predict.