Stripe Minions, Amazon Kiro, Claude Code compaction, and a Replit DB deletion. We synthesize multiple cases to extract the design principles required to operate AI coding agents in production, and organize them alongside CodeRabbit's 470‑repo statistics plus efforts from Google and GitHub.
Andrej Karpathy coined "Claws" as an upper layer for AI agents, and June Kim answered the same question from a different angle with the Cord framework implemented with MCP and SQLite. This piece organizes the shift from single-shot agents to autonomous coordination systems from both conceptual and implementation perspectives.
Kiro autonomously deleted production, causing 13 hours of AWS downtime; Claude Code’s auto-compaction irreversibly erases context; sub-agents silently burn through usage. Three incident reports from the same week.
Stripe’s Minions agent generates 1,300+ PRs per week with zero human effort. Implementation details of the four components: Devbox, Blueprints, Toolshed, and a fork of goose.
Using IBM and UC Berkeley's IT-Bench benchmark and the MAST failure taxonomy, this article examines why enterprise AI agents fail. It covers the reality of 11% SRE success and 0% FinOps success, plus the Replit production database deletion incident.