IBM articles | lilting channel

Tech | Feb 19, 2026 (updated) | 5 min

How IT-Bench and MAST expose enterprise AI agent failure modes

This article uses IT-Bench, a benchmark from IBM and UC Berkeley, and the MAST failure taxonomy to examine why enterprise AI agents fail. It covers the reported success rates of 11% on SRE tasks and 0% on FinOps tasks, as well as the Replit production database deletion incident.

Tags: AI, AI Agents, IBM, Benchmark, Enterprise

