xAI x-algorithm second commit: Phoenix retrieval+ranking runs locally on 537k sports posts with 3GB artifacts. Ad blending and candidate isolation code added since January.
Verdict on GTIG's May 11, 2026, report: the first confirmed AI-generated zero-day, a Python 2FA bypass for an OSS admin tool, was given away by a hallucinated CVSS score and textbook Pythonic code structure.
oMLX 0.3.9.dev2 release notes, read from the perspective of running Codex/Copilot against local LLMs on a Mac: Gemma 4 VLM MTP, DFlash, omlx launch copilot, and SSD KV cache, and what each changes for agent workflows.
Out-of-bounds read in Ollama's GGUF loader before 0.17.1. If your Ollama API is network-accessible, a crafted model file can exfiltrate env vars, API keys, system prompts, and conversation fragments from process memory.
Checked Fortress Token Optimizer's DEV article and npm/PyPI packages. Stripping polite filler words shrinks prompts by 11-22%, but running it blindly on system prompts or RAG context can strip constraints that control model output.
Tested the Gemma 4 MTP drafter on an M1 Max 64GB with mlx-vlm 0.5.0. Only the 26B A4B MoE got faster (+13%); 31B Dense and E4B got slower. Switching between code generation and short haiku prompts flips the result.
Oxford Internet Institute's Nature 2026 paper found warmth fine-tuning raised error rates 10-30 points when users held wrong beliefs. Shah et al. showed Pearson r = 0.87 between persona agreeableness and sycophancy across 13 open-weight models. Standard benchmarks caught neither effect.
Notes from reading Google's MTP drafter docs, vLLM recipes, and the AI for Developers guide. The 3x claim holds for 31B Dense, but 26B A4B MoE stalls at batch size 1 because speculative decoding verification loads extra expert weights per candidate token.
Tested connecting MCP servers to Ollama local LLMs on M1 Max 64GB. MCPHost is deprecated, tool calling breaks with quantized models, and context fills fast. Includes working TypeScript and Python custom MCP server setups.
Starting from Claude Code's 1.67B token runaway (anthropics/claude-code#4095), this traces why tool responses need is_complete, retryable: false, duplicate detection, and orchestrator-level budget caps. Directly applicable to MCP server design.
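The four safeguards named in the last item can be sketched roughly as follows. This is a minimal illustration, not the linked article's code and not an MCP SDK API: only the field names is_complete and retryable come from the summary above, while ToolResponse, Orchestrator, BudgetExceeded, call_tool, and the cost_tokens estimate are hypothetical names invented for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ToolResponse:
    """Hypothetical tool result that carries explicit stop signals."""
    content: str
    is_complete: bool = True   # nothing more to fetch; do not call again
    retryable: bool = False    # retrying the same call will not help


class BudgetExceeded(RuntimeError):
    """Raised when the orchestrator-level token budget would be exceeded."""


@dataclass
class Orchestrator:
    """Hypothetical wrapper that sits between the model and its tools."""
    max_tokens: int
    used_tokens: int = 0
    seen_calls: set = field(default_factory=set)

    def call_tool(self, name, args, handler, cost_tokens):
        # Duplicate detection: hash the tool name plus its arguments.
        payload = json.dumps([name, args], sort_keys=True).encode()
        key = hashlib.sha256(payload).hexdigest()
        if key in self.seen_calls:
            # Answer repeats without running the tool or spending budget.
            return ToolResponse(
                content=f"duplicate call to {name} suppressed",
                is_complete=True,
                retryable=False,
            )

        # Budget cap: refuse before spending, not after.
        if self.used_tokens + cost_tokens > self.max_tokens:
            raise BudgetExceeded(
                f"{self.used_tokens} + {cost_tokens} exceeds {self.max_tokens}"
            )

        self.seen_calls.add(key)
        self.used_tokens += cost_tokens
        return handler(**args)


def search(query):
    # An empty result is still a final answer, so it is marked as such.
    return ToolResponse(content=f"no results for {query!r}")


orch = Orchestrator(max_tokens=100)
first = orch.call_tool("search", {"query": "x"}, search, cost_tokens=40)
repeat = orch.call_tool("search", {"query": "x"}, search, cost_tokens=40)
```

In this sketch the repeated call returns a suppressed-duplicate response and leaves the budget untouched, while a call that would push usage past max_tokens raises BudgetExceeded instead of running. Checking the budget before executing the handler is a design choice of the example: it makes the cap a hard ceiling rather than a warning that arrives after the tokens are already spent.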