OpenAI Codex Moves from Message-Based Credits to Token-Based Pricing
On April 2, 2026, OpenAI revised the pricing structure for Codex. For ChatGPT Business and Enterprise customers, the old “N credits per message” system was retired in favor of credit billing tied to API token consumption. Details are available on the official rate card.
At the same time, ChatGPT Business seat pricing was dropped from $25 to $20, and a new Codex-only seat was introduced. The Codex-only seat is fully usage-based with no rate limits — you pay credits for exactly what you use. New workspaces also get promotional credits of $100 per user, up to $500 (expires April 30).
The Old System Was Structurally Identical to GitHub Copilot’s “Premium Requests”
Looking back at Codex’s old pricing, it was nearly identical in structure to GitHub Copilot’s current billing model.
Copilot meters AI model usage through “premium requests.” Each plan has a monthly cap, and each model has a “multiplier.” More powerful models consume more of your quota per request.
| Copilot Plan | Monthly Price | Premium Requests/Month |
|---|---|---|
| Free | $0 | 50 |
| Pro | $10 | 300 |
| Pro+ | $39 | 1,500 |
| Business | $19/user | 300/user |
| Enterprise | $39/user | 1,000/user |
Model-specific multipliers look like this:
| Model | Multiplier |
|---|---|
| GPT-4.1, GPT-4o, GPT-5 mini | 0 (free, unlimited) |
| Claude Haiku 4.5, Gemini 3 Flash, GPT-5.4 mini | 0.33 |
| Claude Sonnet 4.6, Gemini 3.1 Pro, GPT-5.4 | 1 |
| Claude Opus 4.6 | 3 |
| Claude Opus 4.6 (Fast Mode) | 30 |
With Copilot Pro’s 300-request monthly quota, you can make 300 calls with Claude Sonnet 4.6 (1x), 100 with Claude Opus 4.6 (3x), or unlimited with GPT-4.1 (0x). Going over the cap incurs $0.04 per additional request.
graph TD
A[Subscription<br/>Fixed monthly fee] --> B[Premium Requests<br/>N requests/month]
B --> C[GPT-4.1<br/>0x = free]
B --> D[Claude Sonnet<br/>1x]
B --> E[Claude Opus<br/>3x]
B --> F[Over quota<br/>\$0.04/request]
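The quota arithmetic above can be sketched in a few lines. The plan price, quota, multipliers, and $0.04 overage rate are the figures from the tables above; the helper function itself is illustrative:

```python
# Sketch of Copilot-style premium-request accounting.
# Multipliers and overage rate are taken from the tables above.
MULTIPLIERS = {
    "GPT-4.1": 0.0,                       # 0x: free, unlimited
    "Claude Haiku 4.5": 0.33,
    "Claude Sonnet 4.6": 1.0,
    "Claude Opus 4.6": 3.0,
    "Claude Opus 4.6 (Fast Mode)": 30.0,
}

OVERAGE_PER_REQUEST = 0.04  # dollars per request beyond the quota

def monthly_cost(plan_price, quota, requests):
    """requests: list of (model, count). Returns (requests used, total $)."""
    used = sum(MULTIPLIERS[m] * n for m, n in requests)
    over = max(0.0, used - quota)
    return used, plan_price + over * OVERAGE_PER_REQUEST

# Pro plan: $10/month, 300 premium requests
used, total = monthly_cost(10, 300, [("Claude Sonnet 4.6", 200),
                                     ("Claude Opus 4.6", 50)])
print(used, total)  # 350 requests used -> 50 over quota -> $10 + $2 = $12
```

Note how 50 Opus calls burn as much quota as 150 Sonnet calls; the multiplier, not the call count, drives consumption.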
Codex’s old system followed the same pattern — a credit quota bundled with the monthly subscription, with consumption varying by model and task type.
| Codex (Old) | Local Task | Cloud Task | Code Review |
|---|---|---|---|
| GPT-5.4 | ~7 credits | ~34 credits | ~34 credits |
| GPT-5.3-Codex | ~5 credits | ~25 credits | ~25 credits |
| GPT-5.1-Codex-Mini | ~1 credit | N/A | N/A |
The three-layer structure of “fixed monthly fee + model-based usage quota + overage charges” was exactly the same for both Copilot and old Codex. What Copilot calls “premium requests × multiplier,” Codex called “messages × estimated credits.”
Why Per-Message Billing Broke Down
The system broke down for Codex because coding task sizes vary too wildly.
A 10-line bug fix and a full-repo refactor consume orders-of-magnitude different token counts. Under message-based billing, both counted as “1 message” — users running lots of small tasks overpaid, while users throwing large tasks at Codex underpaid. The cloud task multiplier (estimated at 5–7x the local rate) was also approximate, making it hard to predict actual costs.
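A back-of-the-envelope sketch of that distortion, using the old ~5-credit local rate for GPT-5.3-Codex from the table below; the token counts are invented for illustration:

```python
# Two tasks that each counted as "1 message" under the old system,
# with very different token footprints (token counts are illustrative).
FLAT_CREDITS_PER_MESSAGE = 5  # old GPT-5.3-Codex local-task rate

tasks = {
    "10-line bug fix":    {"input": 3_000,   "output": 500},
    "full-repo refactor": {"input": 400_000, "output": 60_000},
}

for name, t in tasks.items():
    total_tokens = t["input"] + t["output"]
    # effective price per 1M tokens under flat per-message billing
    per_million = FLAT_CREDITS_PER_MESSAGE / total_tokens * 1_000_000
    print(f"{name}: {total_tokens:,} tokens -> "
          f"{per_million:.1f} credits per 1M tokens")
```

With these (made-up) numbers, the small task pays an effective rate over 100x higher per token than the large one, which is the cross-subsidy the article describes.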
GitHub Copilot’s premium requests have the same structural weakness. In Copilot’s agent mode (Coding Agent), one session counts as “1 premium request × multiplier,” but tool calls the agent makes autonomously within that session aren’t counted. A session that finishes in 10 steps costs the same as one that takes 100 steps. As usage grows and diversifies, this granularity gap distorts the cost structure.
Once Codex hit 2 million weekly users, OpenAI could no longer ignore the distortion.
The New Rate Card
Under the new system, credit consumption is defined per million tokens for each model.
| Model | Input | Cached Input | Output |
|---|---|---|---|
| GPT-5.4 | 62.50 | 6.250 | 375 |
| GPT-5.3-Codex | 43.75 | 4.375 | 350 |
| GPT-5.4-Mini | 18.75 | 1.875 | 113 |
| GPT-5.1-Codex-mini | 6.25 | 0.625 | 50 |
Units are credits per 1M tokens. Fast Mode doubles the credit consumption.
The cached input discount stands out: it's 1/10 the regular input rate. Codex repeatedly sends context for the same repository, so cache hit rates tend to be high. Prompt caching reuses, server-side, the portions of the input (system prompt, repo context) that overlap with the previous request, and it kicks in more the longer you work in the same project. When the discount applies, costs can drop significantly compared to the message-based era.
On the other hand, GPT-5.4’s output tokens cost 375 credits per million, which is steep. Tasks that demand heavy code generation will see costs spike. It’s a design where your mileage varies by usage pattern.
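A minimal sketch of per-turn credit cost under the new rate card, using the rates from the table above and the stated 2x Fast Mode multiplier; the turn sizes and cache-hit figures are illustrative:

```python
# Credit cost of one Codex turn under the new rate card
# (credits per 1M tokens, values from the table above).
RATES = {  # (input, cached input, output)
    "GPT-5.4":            (62.50, 6.250, 375),
    "GPT-5.3-Codex":      (43.75, 4.375, 350),
    "GPT-5.4-Mini":       (18.75, 1.875, 113),
    "GPT-5.1-Codex-mini": (6.25,  0.625, 50),
}

def turn_credits(model, input_toks, cached_toks, output_toks, fast=False):
    """cached_toks is the cached portion of input_toks."""
    inp, cached, out = RATES[model]
    fresh = input_toks - cached_toks
    credits = (fresh * inp + cached_toks * cached + output_toks * out) / 1e6
    return credits * (2 if fast else 1)  # Fast Mode doubles consumption

# 200k-token context with a 90% cache hit, 8k tokens of generated code
cold = turn_credits("GPT-5.4", 200_000, 0, 8_000)
warm = turn_credits("GPT-5.4", 200_000, 180_000, 8_000)
print(f"{cold:.2f} vs {warm:.2f} credits")  # 15.50 vs 5.38 credits
```

In this sketch the 90% cache hit cuts the turn from 15.50 to 5.38 credits, which is why cache behavior matters more than the headline input rate for long sessions on one repo.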
Comparison with Direct API Usage
Codex can be used via ChatGPT authentication or directly with an API key. With an API key, billing is in dollars per token rather than credits.
| Model | API Input ($/1M) | API Cached Input ($/1M) | API Output ($/1M) |
|---|---|---|---|
| gpt-5.1-codex-mini | $0.25 | $0.025 | $2.00 |
| gpt-5.3-codex | $1.75 | $0.175 | $14.00 |
| gpt-5.4 | $2.50 | $0.25 | $15.00 |
API key usage is simpler in terms of rates, but you can’t use the quota bundled with ChatGPT plans (Plus $20, Pro $200). For teams, going through Business makes sense for centralized management, but individual developers with predictable usage might prefer direct API access.
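The same kind of arithmetic works for direct API billing, using the dollar rates from the table above; the monthly workload figures are invented for illustration:

```python
# Dollar cost of a workload under direct API billing
# ($ per 1M tokens, values from the table above).
API_RATES = {  # (input, cached input, output)
    "gpt-5.1-codex-mini": (0.25, 0.025, 2.00),
    "gpt-5.3-codex":      (1.75, 0.175, 14.00),
    "gpt-5.4":            (2.50, 0.250, 15.00),
}

def api_cost(model, input_toks, cached_toks, output_toks):
    inp, cached, out = API_RATES[model]
    fresh = input_toks - cached_toks
    return (fresh * inp + cached_toks * cached + output_toks * out) / 1e6

# A hypothetical month of heavy use: 50M input tokens (80% cached), 5M output
cost = api_cost("gpt-5.3-codex", 50_000_000, 40_000_000, 5_000_000)
print(f"${cost:.2f}")  # $94.50
```

Running numbers like these against your own usage is how an individual developer would decide between a ChatGPT plan's bundled quota and a pay-as-you-go API key.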
Per-Plan Quotas (Legacy)
Plus, Pro, and existing Enterprise/Edu customers remain on the legacy message-based system. No timeline has been announced for migrating them to the new rate card.
| Plan | GPT-5.4 Messages | GPT-5.3-Codex Messages | Window |
|---|---|---|---|
| Plus | 33–168 | 45–225 | 5 hours |
| Pro | 223–1,120 | 300–1,500 | 5 hours (6x Plus) |
| Business | 15–60 | 20–90 | 5 hours |
The ranges exist because consumption varies by task type (local/cloud). Enterprise/Edu have no fixed rate limits.
Business shows the lowest message counts because those figures are a legacy snapshot: Business has already moved to token-based billing under the new system, and the legacy quota doesn't apply to new customers.
Codex and Copilot: Diverging Billing Models
OpenAI and GitHub (Microsoft) have a tight partnership. Copilot was originally built on the first-generation OpenAI Codex model (a GPT-3-family code generation model, distinct from the current coding agent). GPT-5-family models still power Copilot’s backend. Both products had adopted the same “message/request-based credit” structure for billing.
That changed with this pricing update.
graph TD
A[Shared Origin<br/>Message-based credits] --> B[OpenAI Codex<br/>April 2026 revision]
A --> C[GitHub Copilot<br/>As of April 2026]
B --> D[Per-token billing<br/>Proportional to resources consumed]
C --> E[Premium requests retained<br/>Monthly quota + multiplier]
The divergence reflects differences in the products themselves.
Codex operates as an autonomous coding agent in the cloud, working across entire repositories. A single session might consume tens of thousands of tokens or just a few hundred. This variance is what made per-message billing untenable.
Copilot’s core use case is inline code completion and chat in the IDE. Token consumption per interaction doesn’t swing as wildly as Codex. The premium request system works well enough, and “300 requests/month” or “1,500 requests/month” gives users a clearer picture of their costs.
That said, Copilot keeps adding agent features — Coding Agent, parallel agents via /fleet, and more. As agents that run autonomously for extended periods become more common, Copilot will face the same problem. A move to token-based billing is a real possibility.
Three-Way Billing Comparison
A few days later on April 4, Anthropic took the opposite approach — locking third-party harnesses out of Claude Code subscriptions. They restricted subscription-based access and pushed third parties toward API usage-based billing. Lining up all three reveals distinct approaches to billing granularity and direction.
| | OpenAI Codex | GitHub Copilot | Anthropic Claude Code |
|---|---|---|---|
| Team | Per-token billing | Premium requests (monthly quota + $0.04 overage) | Subscription flat rate + usage-based for third parties only |
| Individual | Message quota (legacy) | Premium request quota | Subscription flat rate |
| Billing Granularity | Per token | Request × multiplier | Subscription or API token |
| Third Parties | Welcomed | Via Copilot Extensions | Outside subscription, API usage-based |
| Seat Price Trend | $25 → $20 (reduced) | Unchanged | Unchanged (Max $200) |
OpenAI has the finest granularity at the token level, Copilot sits in the middle with request × multiplier, and Anthropic has the coarsest with its subscription-plus-API dual structure. Finer granularity means more cost transparency, but makes monthly bills harder to predict.