Codex 'Selected model is at capacity': serving capacity, not context length, and the thread resumes on continue

TL;DR

Symptom: Codex prints “Selected model is at capacity. Please try a different model.” and stalls. Here “capacity” means the model’s serving slots, not your context window.

Workaround: In an interactive session, don’t switch models right away. Tell Codex to continue in the same thread; in my case that was enough to resume.

Limitation: In unattended or looping runs (overnight autonomous jobs), nobody is there to send the manual continue. Codex has no automatic retry yet; a request for retry with backoff and retained state is open as #22390.

While Codex was working, it printed “Selected model is at capacity. Please try a different model.”
This was almost my first time hitting it. Context compaction had not run, and the thread itself did not look broken.

I sent this in the same thread.

Keep going without stopping.

Codex then continued as if nothing had happened.
As of this write-up, I have not hit the same error again during that resumed run.

Given that behavior, this message alone is not a good reason to throw away the thread.
In my case, one model call was rejected, but the task state and conversation context remained intact.

Capacity is not context capacity

The capacity in this message reads as model-side serving capacity, not context length or context compaction.
In this case, context compaction had not run.

In a nearby GitHub issue, an OpenAI maintainer explains that this is not an account-specific rate limit, but capacity pressure on the selected model.
The relevant comment is in openai/codex #17014.

So I read capacity here as:

No available serving slot for the selected model

The word “capacity” can make it sound like context limit or remaining tokens.
For this error, it is better treated as a separate model-serving condition. If Codex still has the thread state, the next successful request can resume the work.

Closed issues and open issues are mixed together

The issue state around this error has to be read in separate buckets.
The same message appears across short incident reports, stale banner reports, and retry-mechanism requests.

These are the states I saw on June 11, 2026 JST.

Issue	State	Notes
#17014	closed	Maintainer explained it as model capacity, not an account rate limit
#22277	closed	Treated as a May 12, 2026 incident and later commented as mitigated
#11635	open	Stale capacity banner while the model keeps responding
#22390	open	Request for backoff retry and task-state retention on transient capacity errors
#27149	open	June 9, 2026 report about gpt-5.5 capacity errors and the context left in a recovered session

“The May 12 incident was mitigated” and “retry/state retention for capacity errors is still open as a product request” can both be true.
My local “continue prompt resumed the task” case sits closer to the stale-banner thread in #11635 and the state-retention request in #22390.

Continue in the same thread first

The first move is to keep the thread open.
If Codex shows this error mid-task and the task state is still present, the next user input can resume it.

In my case, a short forceful continue prompt was enough.

Keep going without stopping.

A more explicit version would be:

Continue the previous task. Check the current state first, then resume from the unfinished step.

If that does not go through, then model switching is the next thing to try.
In Codex CLI, use /model inside a thread, or start a new session with codex -m <model>. Codex model selection is also documented in Codex Models.

Changing models mid-task can subtly shift how the output makes judgment calls.
For work where quality matters, I would first retry once on the same model, then switch only if that retry also fails.

Unattended runs are still a separate question

Both Codex and Claude Code can now handle larger contexts.
They can also take on more complex automated work than before.

But errors like this capacity failure, or the tool-call breakage I wrote about in “Claude Code ‘court’ bug: tool calls leak as text instead of running”, make unattended completion a separate question.

Large context and long unattended reliability are not the same property.
If one user message saying “continue” is enough to recover, that is useful, but the run is no longer fully unattended.

For long tasks, the current practical pattern is still to split work into smaller chunks, leave progress in files or git diff, and give the agent an easy way to resume after a stop.
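
The chunk-and-resume pattern above can be sketched in a few lines. This is a minimal illustration, not something Codex provides: the progress file layout, `load_done`, and `run_chunks` are all hypothetical names, and each chunk is assumed to be a plain callable that raises if it fails.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def load_done(progress_file: Path) -> List[str]:
    """Return the names of chunks already recorded as finished."""
    if not progress_file.exists():
        return []
    return json.loads(progress_file.read_text(encoding="utf-8"))["done"]


def run_chunks(chunks: Dict[str, Callable[[], None]], progress_file: Path) -> List[str]:
    """Run every chunk not yet recorded as done, saving progress after each one."""
    done = load_done(progress_file)
    for name, work in chunks.items():
        if name in done:
            continue  # finished in an earlier run, skip it
        work()  # if this raises, the chunks before it stay recorded
        done.append(name)
        progress_file.write_text(json.dumps({"done": done}, indent=2), encoding="utf-8")
    return done
```

If a chunk fails partway through the list, running `run_chunks` again with the same progress file starts from the first unfinished chunk instead of from the top.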

Current observation

This article only covers one local observation and what is visible in public issues.

In my case, context compaction had not run.
After “Selected model is at capacity. Please try a different model.”, sending a continue prompt in the same thread resumed the task.
I have not yet seen the same error repeat during the resumed run.

My current operating note is:

Do not discard the thread immediately.
Send a continue prompt in the same thread.
Switch models only if that does not work.
If it repeats, record the issue number, timestamp, model, and whether context compaction had run.
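
To make the last item easy to do consistently, one option is to append each occurrence to a small JSON Lines file. The sketch below is only an illustration: the `CapacityIncident` fields and the file name are my own choices, not a Codex format, and the values are filled in by hand.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class CapacityIncident:
    timestamp: str                      # when the error appeared (ISO 8601)
    model: str                          # model selected when it happened
    compaction_had_run: bool            # whether context compaction had run
    related_issue: Optional[str] = None        # e.g. "#22390", if one matches
    resumed_by_continue: Optional[bool] = None  # did a continue prompt work?


def now_utc() -> str:
    """Current UTC time as an ISO 8601 string with second precision."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


def log_incident(path: Path, incident: CapacityIncident) -> None:
    """Append one incident as a single JSON line."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(incident)) + "\n")
```

A call such as `log_incident(Path("capacity_incidents.jsonl"), CapacityIncident(now_utc(), "example-model", False, "#22390", True))` adds one line per occurrence, which is enough to compare against the issue threads later.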

The error message sounds severe.
At least in this run, it did not mean the task was over.

Update (2026-07-01)

The workaround here, telling it to continue in the same thread, assumes someone is watching the screen. In long autonomous loops or overnight unattended runs you can’t do that manual step, so a capacity error just leaves the task stalled.
Codex still has no automatic retry for capacity errors. #22390 requests exactly that: keep a long-running task alive overnight and handle transient capacity with automatic retry plus retained state. It’s open as of July 1. Today you only get the “try a different model” message, and the user ends up being the retry loop.
#22277 notes there’s no pre-flight server-health check, so tasks can crash mid-pipeline. If you run Codex unattended, wrap it with your own capacity-error detection and retry/resume logic.
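
As a rough idea of what such a wrapper could look like, here is a minimal sketch of detection plus retry with exponential backoff. It rests on assumptions I have not verified: that the capacity message shows up in the captured stdout or stderr of whatever non-interactive command you run, and that re-running that command is safe. The function names, delays, and attempt count are all hypothetical, and the command itself is left as a parameter rather than guessed.

```python
import random
import subprocess
import time
from typing import Callable, Sequence, Tuple

# Message text taken from the error described in this article.
CAPACITY_MARKER = "Selected model is at capacity"


def is_capacity_error(output: str) -> bool:
    """Return True if the captured output contains the capacity message."""
    return CAPACITY_MARKER in output


def run_command(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run a command and return (exit code, stdout followed by stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def run_with_capacity_retry(
    cmd: Sequence[str],
    max_attempts: int = 5,
    base_delay: float = 30.0,
    max_delay: float = 900.0,
    runner: Callable[[Sequence[str]], Tuple[int, str]] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str, int]:
    """Re-run cmd with exponential backoff while the capacity message appears.

    Returns (exit code, output, attempts used). Output without the capacity
    message is returned immediately, whether the command succeeded or not.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        code, output = runner(cmd)
        if not is_capacity_error(output) or attempt == max_attempts:
            return code, output, attempt
        delay = min(max_delay, base_delay * 2 ** (attempt - 1))
        # Add up to 10% jitter so parallel jobs do not retry in lockstep.
        sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

Note that this only re-runs the command; it does not resume the thread. Retained task state is exactly what #22390 asks for, so until then the wrapped job needs to keep its own progress in files or git and pick up from there on each attempt.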