#OpenAI

26 articles

Tech Mar 12, 2026 15 min

Prompt injection countermeasures: GitHub's agent execution platform and OpenAI's IH-Challenge

GitHub published the layered defense design of its agent execution platform, and OpenAI released its instruction hierarchy training data, IH-Challenge, along with a model. Prompt injection is being addressed from two directions: infrastructure design and model training.

AI Security GitHub OpenAI AI Agent LLM Safety

Tech Mar 10, 2026 5 min

OpenAI's Promptfoo acquisition and Microsoft's shift to a multimodel stack

OpenAI acquired AI security evaluation platform Promptfoo, and Microsoft announced that Anthropic's Claude Cowork would be integrated into Microsoft 365 Copilot. The structure of the enterprise AI market is starting to change.

OpenAI Microsoft Anthropic Claude Security Copilot AI AI Agents

Tech Mar 10, 2026 8 min

Local isolated execution of AI agents: how macOS sandbox-exec and the Windows sandbox differ

Two approaches to running AI coding agents in local isolation. On macOS, Agent Safehouse uses the OS-native sandbox-exec for kernel-level restrictions; on Windows, Codex uses the VM-based Windows sandbox.

AI Agent Security macOS Windows Claude Code OpenAI

Tech Mar 6, 2026 10 min

Back-to-back releases of OpenAI GPT-5.3/5.4 and Saguaro-driven inference speedups

A summary of GPT-5.3 Instant’s hallucination reductions and safety regressions, GPT-5.4’s computer use, Tool Search, and 1M-token context, plus Saguaro’s 5× inference speedups.

LLM OpenAI GPT Inference Optimization Speculative Decoding AI Safety Computer Use

Tech Mar 4, 2026 7 min

Amazon Bedrock Mantle's OpenAI-compatible API is now generally available

AWS has made OpenAI API compatibility for the Bedrock Mantle distributed inference engine generally available, letting existing OpenAI SDK code run against open-weight models such as DeepSeek and Mistral.

AWS Amazon Bedrock OpenAI API LLM

Tech Feb 28, 2026 7 min

GitHub Copilot Coding Agent Major Updates and Figma-Codex MCP Integration

Five new features for the Copilot coding agent (model selection, self-review, security scanning, custom agents, and CLI integration), plus bidirectional Figma-Codex integration via MCP. Also covers the Copilot CLI GA and a comparison with Claude Code's Figma integration.

GitHub Copilot Figma OpenAI Codex MCP AI Coding

Tech Feb 24, 2026 8 min

Large-Scale Unauthorized Distillation of Claude and the Collapse of SWE-bench Hit on the Same Day

Anthropic accused three Chinese AI companies of distilling Claude, and on the same day OpenAI retired SWE-bench Verified. Training fraud and evaluation flaws exposed simultaneously on February 23, 2026.

AI Security Anthropic DeepSeek Benchmark LLM OpenAI SWE-bench

Tech Feb 24, 2026 updated 7 min

Injection Attacks on AI Agent Memory and Automated Smart Contract Exploitation with EVMbench

Techniques and defenses from the MINJA, InjecMEM, and ToxicSkills campaigns that poison AI agents’ memory files, plus GPT-5.3-Codex’s 72% exploit success rate on EVMbench, the benchmark released by OpenAI and Paradigm. The article lays out how AI becomes both a target of attacks and a weapon for attackers.

Security AI Agents Prompt Injection MCP Ethereum Smart Contracts OpenAI Supply Chain

Tech Jan 19, 2026 3 min

Released the Claude Code + Codex Auto-Dev Framework on GitHub

Generalized the scripts from the practice and optimization articles into a reusable framework and published it on GitHub. A walkthrough of how to use it and of its design philosophy.

Claude Code OpenAI Codex tmux AI Automation Experiment

Tech Jan 17, 2026 6 min

Letting Claude Code and Codex Run Overnight in tmux (Optimization)

Design patterns for reducing context usage and API calls in the AI auto-dev loop: blocking waits, read-forbidden files, and session isolation.

Claude Code OpenAI Codex tmux AI Automation Experiment

Tech Jan 15, 2026 9 min

Letting Claude Code and Codex Run Overnight in tmux (Practice)

Running the Claude Code + Codex auto-loop for real. Generated 1,134 lines of game code.

Claude Code OpenAI Codex tmux AI Automation Experiment

Tech Jan 14, 2026 5 min

Letting Claude Code and Codex Run Overnight in tmux to Build a Game (Setup)

Technical prep for automating an implement → review → fix loop with Claude Code and OpenAI Codex via tmux. Can it build something overnight unattended?

Claude Code OpenAI Codex tmux AI Automation Experiment