An agentic pipeline combining a ReAct loop with a searcher took 1st place on ViDoRe v3 and 2nd place on BRIGHT, demonstrating cross-domain versatility with a single pipeline.
Anthropic has made its 1M-token context window generally available: no surcharge for long context, a per-request image/PDF limit raised from 100 to 600, and a frontier-model-best score on MRCR v2.
Anthropic's new multi-agent code review feature for Claude Code, plus the design split between subagents and orchestration. Also covers the major frameworks and where Codex fits in.
OpenAI acquired the AI security evaluation platform Promptfoo, and Microsoft announced that Anthropic's Claude Cowork will be integrated into Microsoft 365 Copilot. The structure of the enterprise AI market is starting to shift.
Sarvam AI released 30B and 105B models trained entirely in India—from pretraining through RL—featuring support for 22 constitutionally recognized Indian languages and inference optimizations.
Andrej Karpathy released Autoresearch, a system where an AI agent autonomously runs machine-learning experiments on a GPU and tries 100 variants overnight. The article breaks down the mechanism and design so even readers with zero ML background can follow.
Using Claude, Anthropic found 22 CVEs in Firefox's JS engine, while GitHub Security Lab reported more than 80 vulnerabilities in apps built on the OSS framework Taskflow Agent.
Attempting WAN 2.2 I2V video generation on Windows with an RTX 4060 (8GB VRAM). The 5B fp8 model produced rough quality; the 14B Rapid distilled model with lowvram offloading proved the practical solution.
JPEG-XL revival in Chrome 145 and how to use cjxl, RSA → Elliptic Curve → PQC cryptography transition and Merkle Tree Certificates, WebMCP implementation examples, Chrome zero-day trends, and customizable select elements.
Running LTX-2 and Wan 2.2 on an M1 Max 64GB. FP8 doesn't work on Metal, bypassed with GGUF. Wan 2.2 takes 82 minutes for a 2-second video. LTX-2's official pipeline produces NaN on MPS, and the KSampler fallback doesn't reach usable quality.
All of huihui-ai's abliterated Qwen 3.5 variants produced garbage tokens, and the abliterated GLM-4.7-Flash had a broken chat template. The official model with thinking disabled turned out to be the right answer.