H Company's Holotron-12B uses a new memory-efficient design to lift computer-use AI throughput to 8,900 tokens per second. Unsloth has released the beta of 'Studio,' a browser tool for no-code model fine-tuning.
AI Security for Apps reached GA, letting Cloudflare block prompt injection and PII leaks at the WAF layer. On the same day, it also launched RFC 9457-compatible error responses that replace HTML with JSON or Markdown when AI agents hit Cloudflare errors.
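RFC 9457 ("Problem Details for HTTP APIs") standardizes a machine-readable JSON error body, which is what makes errors consumable by agents instead of HTML pages. A minimal sketch of such a body; the type URI, status, and wording here are illustrative, not Cloudflare's actual error output:

```python
import json

# Illustrative RFC 9457 problem-details body; field values are
# hypothetical, not Cloudflare's real error payload.
problem = {
    "type": "https://example.com/errors/origin-unreachable",
    "title": "Origin Unreachable",
    "status": 523,
    "detail": "The origin web server could not be reached.",
    "instance": "/api/v1/orders/12345",
}

# Served with Content-Type: application/problem+json per the RFC.
body = json.dumps(problem)
parsed = json.loads(body)

# An agent can branch on structured fields instead of scraping HTML.
print(parsed["status"], parsed["title"])
```

The point of the format is that `type`, `title`, and `status` are stable fields an agent can key on, while `detail` stays human-readable.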
Anthropic has GA’d a 1M‑token context window: no surcharge applies for long context, and the per‑request image/PDF limit is raised from 100 to 600. It achieved the best frontier‑model score on MRCR v2.
Hugging Face published a comparative analysis of 16 open-source RL training libraries across 7 design axes. In synchronous designs, GPU utilization stalls around 60% because training waits on the generation bottleneck, but an asynchronous separated design can push it above 95%.
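The synchronous-versus-asynchronous split above can be sketched as a toy producer/consumer design: rollout generation and training run in separate threads connected by a queue, so the trainer consumes trajectories as they arrive instead of idling between generations. Everything here (step counts, latencies, batch shape) is illustrative, not taken from the Hugging Face analysis:

```python
import queue
import threading
import time

rollouts = queue.Queue(maxsize=8)  # buffer decoupling generation from training
NUM_ROLLOUTS = 20

def generator():
    # Stands in for the inference engine producing trajectories.
    for step in range(NUM_ROLLOUTS):
        time.sleep(0.001)  # simulated generation latency
        rollouts.put({"step": step, "tokens": [1, 2, 3]})
    rollouts.put(None)  # sentinel: generation is finished

trained = []

def trainer():
    # Stands in for the learner; it never blocks on a single generation,
    # only on the shared queue, which is what keeps GPUs busy.
    while True:
        batch = rollouts.get()
        if batch is None:
            break
        trained.append(batch["step"])

t_gen = threading.Thread(target=generator)
t_train = threading.Thread(target=trainer)
t_gen.start(); t_train.start()
t_gen.join(); t_train.join()

print(len(trained))
```

In a real system the two sides run on separate GPU pools and the queue holds off-policy trajectories, which is where the utilization gain (and the off-policy correction problem) comes from.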
Sarvam AI released 30B and 105B models trained entirely in India—from pretraining through RL—featuring support for 22 constitutionally recognized Indian languages and inference optimizations.
Andrej Karpathy released Autoresearch, a system where an AI agent autonomously runs machine-learning experiments on a GPU and tries 100 variants overnight. The article breaks down the mechanism and design so even readers with zero ML background can follow.
A summary of GPT-5.3 Instant’s hallucination reductions and safety regressions, GPT-5.4’s computer use, Tool Search, and 1M-token context, plus Saguaro’s 5× inference speedups.
AWS has made OpenAI API compatibility for the Bedrock Mantle distributed inference engine generally available, letting existing OpenAI SDK code run against open-weight models such as DeepSeek and Mistral.
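OpenAI API compatibility means the endpoint accepts the same request shape as the OpenAI Chat Completions API, so existing client code only needs a different base URL and credentials. A minimal standard-library sketch of the request such an endpoint expects; the URL, model ID, and key are placeholders, not documented Bedrock values:

```python
import json
import urllib.request

# Placeholder endpoint and model: real Bedrock URLs and IDs will differ.
BASE_URL = "https://bedrock.example.amazonaws.com/v1"

payload = {
    "model": "deepseek-r1",  # illustrative open-weight model ID
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",  # placeholder credential
        "Content-Type": "application/json",
    },
    method="POST",
)

# The request is built but not sent; an OpenAI SDK client pointed at
# the same base_url would issue an identical call.
print(req.full_url)
```

This is why existing OpenAI SDK code can be redirected with a one-line `base_url` change rather than a rewrite.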
Every abliterated variant of huihui-ai's Qwen 3.5 produced garbage tokens, and the abliterated GLM-4.7-Flash shipped with a broken chat template. The official model with thinking disabled turned out to be the right choice.