#AI

216 articles

Tech May 17, 2026 9 min

BERT for search and OCR: MLM mechanics, WordPiece, and encoder successors

Why Google added BERT to search in 2019, how MLM training really works (15% mask, 80/10/10, WordPiece), and where encoder-only models still beat LLMs — rerank, classification, and OCR correction.

AI BERT NLP Search Machine Learning Python

Tech May 17, 2026 17 min

Khala open-source song generator: 24GB VRAM, 64-layer RVQ, quality flag live

Khala open-source song generator needs 24GB+ NVIDIA VRAM, ~52GB weights, and still carries a 2026-05-07 quality warning. Notes on the 64-layer RVQ pipeline and generate API.

AI Music Generation Local AI NVIDIA Docker

Tech May 15, 2026 6 min

x-algorithm May 2026: Phoenix pipeline runnable locally with 3GB artifacts

xAI x-algorithm second commit: Phoenix retrieval+ranking runs locally on 537k sports posts with 3GB artifacts. Ad blending and candidate isolation code added since January.

AI GitHub OSS Machine Learning LLM

Tech May 15, 2026 18 min

Anima-Base v1.0 on M1 Max: kana LoRA at 22% light prompt, 67% heavy prompt

Tested on M1 Max 64GB ComfyUI: Anima-Base v1.0 matches preview3-base in speed; WAI-Anima kana LoRA hits 22% on light prompts but 67% with hood+robe+embroidery added.

AI Image Generation ComfyUI Apple Silicon LoRA Anima Qwen Experiment

Tech May 14, 2026 12 min

GTIG observed the first AI-generated zero-day: a Python 2FA bypass exposed by a hallucinated CVSS

Verdict on GTIG's May 11, 2026 report: the first confirmed AI-generated zero-day, a Python 2FA bypass for an OSS admin tool, was given away by a hallucinated CVSS score and textbook Pythonic code structure.

Security AI LLM Zero-Day Google

Tech May 14, 2026 29 min

oMLX 0.3.9.dev2 tested on M1 Max 64GB: SSD cache wins, VLM MTP slower

Tested oMLX 0.3.9.dev2 on M1 Max 64GB across 11 scenarios: SSD KV cache cuts Copilot prefill 88s→33s, VLM MTP slows decode 12-30%, omlx launch reaches Copilot/Codex/Claude Code.

AI LLM Local LLM Apple Silicon MLX Inference Optimization Codex Experiment

Tech May 13, 2026 updated 8 min

oMLX 0.3.9.dev2 for Mac coding agents: Gemma 4 VLM MTP, DFlash, launch copilot

oMLX 0.3.9.dev2 release notes from the angle of Codex/Copilot on Mac local LLMs: Gemma 4 VLM MTP, DFlash, omlx launch copilot, SSD KV cache — what each changes for agent workflows.

AI LLM Local LLM Apple Silicon MLX Inference Optimization Codex

Tech May 13, 2026 11 min

VoxCPM2 and OSS TTS in 2026: Irodori-TTS, F5-TTS, and Japanese fine-tune notes

VoxCPM2 sits in the tokenizer-free corner. Compared against F5-TTS, CosyVoice2, Irodori-TTS, and Style-Bert-VITS2, plus why Japanese TTS still leans on OpenJTalk.

AI TTS Speech Synthesis Voice Cloning Local AI Open Source Fine-tuning

Tech May 12, 2026 13 min

WordPress 7.0: wp_ai_client_prompt(), PHP-only blocks, and why RTC was removed

WordPress 7.0 keeps the AI Client, the Connectors API, and PHP-only blocks but drops real-time editing despite a 52% storage gain. Covers wp_ai_client_prompt() code and functions.php patterns.

WordPress AI CMS PHP API

Tech May 11, 2026 10 min

Gemini API multimodal File Search as game NPC memory: metadata filters, store tiers, and a cost estimate

Gemini API File Search now indexes images alongside text in the same store. Metadata filters can isolate NPC memories by chapter and character, and a single-character prototype costs under $1/month on Flash-Lite. Notes on tier limits, pricing breakdown, and what to test first.

AI Gemini RAG API Game

Tech May 11, 2026 7 min

Wildfire Evacuation AI Puts Policy Constraints in the Distillation Loss, Not a Post-Processing Filter

A DEV Community article proposes cross-modal distillation for wildfire evacuation routing that encodes road closures and AQI thresholds directly into the loss function. I look at the teacher-student gap when the student drops satellite imagery, why 23ms edge inference is irrelevant if sensor data is 5 minutes old, and what's missing for production.

AI Machine Learning Multimodal Realtime

Tech May 9, 2026 6 min

Fortress Token Optimizer trims 11% off LLM prompts but risks stripping system prompt constraints

Checked Fortress Token Optimizer's DEV article and npm/PyPI packages. Stripping polite filler words shrinks prompts by 11-22%, but running it blindly on system prompts or RAG context can strip constraints that control model output.

AI LLM API API Cost Token Management