The Web Speech API + Gemini + VOICEVOX setup is complete — an AI character you can actually have a voice conversation with. Key implementation notes and impressions.
Implemented all 12 text processing tools planned in the previous article. Also reorganized the category system and switched the listing UI to a table layout.
Setup notes for Qwen-Image-Edit-2511 on RunPod's RTX 4090 ($0.34/hr) using the ComfyUI template. Includes the fal Multiple-Angles LoRA (4 elevations × 8 azimuths × 3 distances) and a per-image cost breakdown that ends up cheaper than buying a 4090.
Technical prep for automating an implement → review → fix loop with Claude Code and OpenAI Codex via tmux. Can it build something overnight unattended?
An experiment in exchanging WebRTC signaling data via QR codes to achieve P2P voice calls with zero servers. Covers SDP chunking/reassembly and ICE candidate gathering.