All Articles - Page 31 | lilting channel

Tech | Mar 27, 2026 | 4 min

GitHub Actions 2026 security roadmap

After supply-chain attacks against tj-actions and Trivy, GitHub published a plan to reduce the attack surface of CI/CD pipelines through dependency locking, scoped secrets, and Layer 7 egress firewalls.

GitHub Actions CI/CD Supply Chain Security

Tech | Mar 26, 2026 (updated) | 11 min

Qwen Image Edit on M1 Max went 80s→10min after a ComfyUI update: MPS BF16 is the cause

Diagnosed a 7x speed regression for Qwen Image Edit on M1 Max 64GB ComfyUI after an update. Root cause: MPS BF16 matmul runs ~2x slower than FP16, compounded by an FP16 attention bug. Benchmark numbers and the working fix.

ComfyUI Qwen Apple Silicon MPS PyTorch Experiment

Tech | Mar 26, 2026 | 3 min

GTG-1002's Claude Code abuse and GitHub Copilot's AI training policy

A six-phase attack chain showing how the China-linked GTG-1002 group used Claude Code through MCP for autonomous espionage, plus GitHub Copilot's policy change to start using user code for AI training on April 24.

Security Claude GitHub Copilot Threat intelligence Privacy

Tech | Mar 26, 2026 | 6 min

ARC-AGI-3 announced: frontier AI scores below 1% on interactive tasks

François Chollet et al. have published a new benchmark, ARC-AGI-3. As of March 2026, every frontier LLM scores below 1% on its interactive tasks, which require autonomously exploring an unknown environment with an unknown goal.

AI Benchmark AGI Claude

Tech | Mar 25, 2026 | 17 min

Hypura’s NVMe Streaming and TurboQuant’s KV Cache Quantization

Hypura breaks away from llama.cpp’s mmap design and streams even dense models with a three-tier NVMe placement, while TurboQuant eliminates quantization-constant overhead via a polar-coordinate transform. Includes a design comparison with Flash-MoE and a review of scenarios where KV-cache compression actually helps.

LLM Local LLM Quantization Apple Silicon Inference Optimization KV Cache Rust

Tech | Mar 25, 2026 (updated) | 4 min

TeamPCP poisoned the LiteLLM PyPI package and embedded malware that steals more than 50 kinds of credentials

LiteLLM 1.82.7 and 1.82.8 were poisoned on PyPI for about 46 minutes. TeamPCP stole a PyPI token through Trivy's CI/CD and injected malware that collects more than 50 credential types, including SSH keys, AWS, Kubernetes, and Docker secrets.

Security Supply Chain PyPI Malware LLM

Tech | Mar 24, 2026 | 7 min

AWS deployment feature added to Claude Code, AI detection added to Code Security on GitHub

AWS has released "Agent Plugins for AWS" for Claude Code and Cursor, automating everything from infrastructure design to deployment. On the same day, GitHub added AI vulnerability detection to Code Security to cover Shell, Dockerfile, Terraform, and PHP, which CodeQL does not support.

Claude Code AWS MCP GitHub Copilot Vulnerability Detection CodeQL IaC AI Agent

Tech | Mar 24, 2026 | 4 min

GPT-5.4 Pro solved a Ramsey hypergraph problem in FrontierMath for the first time, and also pushed Brian-Larson asymptotics

GPT-5.4 Pro became the first model to solve a researcher-level open problem in FrontierMath, a benchmark managed by Epoch AI. Claude Opus 4.6 and Gemini 3.1 Pro later solved it as well.

GPT Claude Gemini FrontierMath AI MachineLearning

Tech | Mar 23, 2026 | 11 min

Kana Chat v2 Architecture Changes

Changes from v1 to v2 of Kana Chat, an AI agent built around official CLI wrappers. Covers dual-model router, Heartbeat memory, planner mode, image input, speech transcription, PWA push notifications, and the lessons learned from a month of daily use.

AI Agent Claude Code Codex OpenClaw Gemini tmux FastAPI Tailscale Custom Tool Experiment

Tech | Mar 23, 2026 | 11 min

Will NVIDIA's Cosmos 2.5 world models make it into pet robots?

The Cosmos 2.5 series of world models, announced by NVIDIA at GTC 2026, is aimed mainly at industrial use, but the 2B-parameter model can now run on the Jetson Orin Nano, which costs less than $500. This article surveys edge deployment of physical AI, from industrial robots to pet robots.

NVIDIA LLM Robotics Synthetic Data Physical AI

Tech | Mar 23, 2026 | 11 min

Critical vulnerabilities in 7% of OpenClaw skills, over 30,000 instances exposed

Composio has published a security analysis of OpenClaw. About 7.1% of the skills distributed through SkillHub were found to have critical vulnerabilities, and more than 30,000 instances exposed to the internet in the early stages were at risk of prompt injection and credential theft.

Security AI Agent OpenClaw Prompt Injection LLM

Tech | Mar 23, 2026 | 7 min

Flash-MoE: Running a 397B-parameter model on a 48GB MacBook

Flash-MoE is a C/Metal inference engine that runs Qwen3.5-397B-A17B on a MacBook Pro M3 Max at 4.36 tokens/s. With expert streaming from SSD and hand-written Metal shaders, it fits the 209GB model into a 48GB memory budget.

Inference MPS LLM Qwen MoE Local LLM