# Why Codex Security does not emit SAST reports
OpenAI published an article explaining why Codex Security does not emit SAST-style reports. The piece describes how AI-driven constraint reasoning is used to avoid the false-positive problem that traditional SAST tools run into.
## What SAST is
SAST, or Static Application Security Testing, analyzes source code or bytecode without running it. It is rule-based and pattern-oriented, and it is a long-established category of tooling.
Common SAST tools include:
| Tool | Method | Main target |
|---|---|---|
| Semgrep | pattern matching | multi-language |
| CodeQL | data-flow analysis | multi-language, with GitHub integration |
| Checkmarx | syntax-tree analysis | multi-language, enterprise |
| Bandit | rule set | Python |
| SonarQube | static analysis plus quality metrics | multi-language |
The best-known problem with SAST is the number of false positives. Because it mechanically reports code that matches a rule, it often flags code paths that are not actually exploitable. That leads to alert fatigue.
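To make the false-positive problem concrete, here is a minimal Python sketch (the function name is illustrative): both calls below match the same `shell=True` rule, such as Bandit's B602, but only one of them is a real finding.

```python
import subprocess

# A rule-based scanner that flags every use of `shell=True` reports the
# call below, even though the command is a hard-coded constant that no
# attacker-controlled input can ever reach -- a classic false positive.
subprocess.run("echo constant-command", shell=True, check=True)

def list_dir(path: str) -> str:
    # The same pattern *is* dangerous when `path` comes from a request:
    # here the rule would be firing on a real command-injection issue.
    result = subprocess.run(f"ls {path}", shell=True,
                           capture_output=True, text=True)
    return result.stdout
```

A scanner that only matches the pattern cannot tell these two call sites apart; it reports both, and the developer has to triage them by hand.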
## How Codex Security is different
Codex Security does not rely on SAST-style pattern matching. It reads the whole project in context and uses AI-driven constraint reasoning to decide whether something is actually exploitable.
After fetching the repository, the process runs through three core stages and ends with a proposed fix:

```mermaid
flowchart TD
    A[Fetch the whole repository] --> B[Context analysis<br/>understand security structure<br/>build a threat model]
    B --> C[Vulnerability identification<br/>rank by real-world impact]
    C --> D[Sandbox validation<br/>confirm reproducibility and exploitability]
    D --> E[Propose a fix<br/>aligned with system behavior]
```
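The stages above can be sketched as a pipeline in which only validated findings are reported. Every name and heuristic here is an illustrative stand-in, not OpenAI's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    path: str
    description: str
    severity: str

def identify(repo: dict[str, str]) -> list[Finding]:
    """Identification stage (toy heuristic): flag candidate issues."""
    return [Finding(path, "possible command injection", "high")
            for path, src in repo.items() if "shell=True" in src]

def validate(finding: Finding, repo: dict[str, str]) -> bool:
    """Validation stage (stand-in for sandbox execution): here we merely
    check whether the flagged file takes external input at all."""
    return "input(" in repo[finding.path]

def report(repo: dict[str, str]) -> list[Finding]:
    # Only findings that survive validation are reported -- the core of
    # the false-positive reduction described above.
    return [f for f in identify(repo) if validate(f, repo)]

repo = {
    "safe.py": 'subprocess.run("ls", shell=True)',
    "risky.py": 'subprocess.run(input(), shell=True)',
}
print([f.path for f in report(repo)])  # prints ['risky.py']
```

Both files match the identification heuristic, but only the one where attacker input can reach the dangerous call makes it into the report.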
Context analysis maps the project’s security structure, such as auth flows, validation paths, and datastore access patterns, and builds an evolving threat model.
Vulnerability identification then ranks issues by real attack impact instead of by pattern match alone.
The biggest difference from old-school SAST is the validation phase. The identified issue is actually exercised in a sandbox to see whether it can be exploited. If it cannot be exploited, it is not reported. That is the core of the false-positive reduction.
## Beta results
OpenAI published beta numbers from scans covering 1.2 million commits.
| Metric | Value |
|---|---|
| Critical vulnerabilities | 792 |
| High-severity vulnerabilities | 10,561 |
| False-positive reduction | 50%+ across repositories |
SAST can look impressive by producing lots of findings, but developers only have so much attention. Codex Security is designed to report only confirmed, real issues instead of raw scan output.
## This does not make SAST obsolete
SAST is still useful for immediate feedback and for blocking known-pattern issues early in CI/CD.
What Codex Security is replacing is the habit of treating SAST output as the final report without validating exploitability. It is an answer to the workflow where developers end up triaging thousands of alerts instead of fixing real issues.
## What constraint reasoning means
AI-driven constraint reasoning formally represents the data-flow constraints in code and reasons about which attacker-controlled inputs can satisfy those constraints and reach vulnerable paths.
For SQL injection, a SAST tool might say “user input is concatenated into a query string.” A constraint-reasoning system asks whether that input is actually reachable from outside, whether escaping or validation always runs first, and whether the exploit path really exists. If not, it does not report the issue.
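A toy Python illustration of that distinction (the schema and function names are invented for this sketch): both functions contain the concatenation a SAST rule would flag, but in the second one validation always runs first, so no input can satisfy the injection constraint.

```python
import sqlite3

def get_user_unsafe(conn: sqlite3.Connection, name: str):
    # Pattern-level view: user input concatenated into a query string.
    # A SAST rule flags this line unconditionally.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def get_user(conn: sqlite3.Connection, name: str):
    # Constraint-level view: every path to the concatenation first
    # requires `name` to be alphanumeric, so no value can both pass this
    # check and carry the quote characters an injection needs.
    if not name.isalnum():
        raise ValueError("invalid name")
    return get_user_unsafe(conn, name)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")
print(get_user(conn, "alice"))  # prints [('alice',)]
```

If `get_user_unsafe` is only ever called through `get_user`, the exploit path is unsatisfiable and a constraint-reasoning system stays silent; a pattern matcher reports it anyway.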
Frontier-model reasoning is what gives this validation step its accuracy. The model can handle complex dependencies and context-sensitive conditions that rule-based SAST has trouble expressing.
## Availability
Codex Security is available as a research preview for ChatGPT Pro, Enterprise, Business, and Edu users, and it is free for the first month after release.