After supply-chain attacks against tj-actions and Trivy, GitHub published a plan to reduce the attack surface of CI/CD pipelines through dependency locking, scoped secrets, and Layer 7 egress firewalls.
Diagnosed a 7x speed regression for Qwen Image Edit on M1 Max 64GB ComfyUI after an update. Root cause: MPS BF16 matmul runs ~2x slower than FP16, compounded by an FP16 attention bug. Benchmark numbers and the working fix.
A six-phase attack chain showing how the China-linked GTG-1002 group used Claude Code through MCP for autonomous espionage, plus GitHub Copilot's policy change to start using user code for AI training on April 24.
François Chollet et al. publish a new benchmark, ARC-AGI-3. As of March 2026, all frontier LLMs have achieved less than 1% on its interactive task of autonomously exploring an unknown environment with an unknown goal.
Hypura breaks away from llama.cpp’s mmap design and streams even dense models with a three-tier NVMe placement, while TurboQuant eliminates quantization-constant overhead via a polar-coordinate transform. Includes a design comparison with Flash-MoE and a review of scenarios where KV-cache compression actually helps.
LiteLLM 1.82.7 and 1.82.8 were poisoned on PyPI for about 46 minutes. TeamPCP stole a PyPI token through Trivy's CI/CD and injected malware that collects more than 50 credential types, including SSH keys, AWS, Kubernetes, and Docker secrets.
AWS releases "Agent Plugins for AWS" for Claude Code/Cursor, automating everything from infrastructure design to deployment. On the same day, GitHub added AI vulnerability detection to Code Security to cover Shell, Dockerfile, Terraform, and PHP, which are not supported by CodeQL.
GPT-5.4 Pro became the first model to solve a researcher-level open problem in FrontierMath, a benchmark managed by Epoch AI. Claude Opus 4.6 and Gemini 3.1 Pro later solved it as well.
Changes from v1 to v2 of Kana Chat, an AI agent built around official CLI wrappers. Covers the dual-model router, Heartbeat memory, planner mode, image input, speech transcription, PWA push notifications, and the lessons learned from a month of daily use.
The Cosmos 2.5 series world model announced by NVIDIA at GTC 2026 is mainly for industrial use, but it has reached the stage where the 2B-parameter model can run on the Jetson Orin Nano, which costs less than $500. We summarize the edge deployment of physical AI, from industrial robots to pet robots.
Composio publishes a security analysis of OpenClaw. Approximately 7.1% of SkillHub-distributed skills were found to have critical vulnerabilities, and over 30,000 instances exposed to the internet in the early stages were at risk of prompt injection and credential theft.
Flash-MoE is a C/Metal inference engine that runs Qwen3.5-397B-A17B on a MacBook Pro M3 Max at 4.36 tokens/s. With expert streaming from SSD and hand-written Metal shaders, it fits the 209GB model into a 48GB memory budget.