When building a WebRTC voice call, you can't pass a remote MediaStream to the SpeechRecognition API. Here are three workable approaches (remote-side recognition, server-side processing, and AudioContext), plus iOS-specific implementation strategies.
Generalized the scripts from the practice and optimization articles into a reusable framework and published it on GitHub. Includes a walkthrough of how to use it and the design philosophy behind it.
The Web Speech API + Gemini + VOICEVOX setup is complete: an AI character you can actually hold a voice conversation with. Key implementation notes and impressions.
Implemented all 12 text processing tools planned in the previous article. Also reorganized the category system and switched the listing UI to a table layout.
Setup notes for Qwen-Image-Edit-2511 on RunPod's RTX 4090 ($0.34/hr) using the ComfyUI template. Includes the fal Multiple-Angles LoRA (4 elevations × 8 azimuths × 3 distances) and a per-image cost breakdown showing that renting comes out cheaper than buying a 4090.
Technical prep for automating an implement → review → fix loop with Claude Code and OpenAI Codex via tmux. Can it build something overnight unattended?
An experiment in exchanging WebRTC signaling data via QR codes to achieve P2P voice calls with zero servers. Covers SDP chunking/reassembly and ICE candidate gathering.
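The SDP chunking/reassembly step in the QR-code experiment can be sketched roughly as follows. This is a minimal illustration, not the article's actual code: the function names `chunk_sdp` and `reassemble_sdp`, the JSON frame layout (`i`, `n`, `d`), and the 300-character payload limit are all assumptions made for the example. Each frame carries its index and the total count so the receiver can scan QR codes in any order.

```python
import json
from typing import Dict, List, Optional


def chunk_sdp(sdp: str, max_payload: int = 300) -> List[str]:
    """Split an SDP string into frames small enough for one QR code each.

    Frame layout (hypothetical): {"i": index, "n": total, "d": payload}.
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    # Slice the SDP into fixed-size pieces; keep one empty piece for empty input.
    parts = [sdp[i:i + max_payload] for i in range(0, len(sdp), max_payload)] or [""]
    total = len(parts)
    return [json.dumps({"i": i, "n": total, "d": part}) for i, part in enumerate(parts)]


def reassemble_sdp(frames: List[str]) -> Optional[str]:
    """Rebuild the SDP from scanned frames; return None until all have arrived."""
    seen: Dict[int, str] = {}
    total: Optional[int] = None
    for frame in frames:
        msg = json.loads(frame)
        total = msg["n"]
        seen[msg["i"]] = msg["d"]  # duplicate scans simply overwrite
    if total is None or len(seen) < total:
        return None
    return "".join(seen[i] for i in range(total))
```

Calling `chunk_sdp(offer_sdp)` on one device and feeding each scanned frame into `reassemble_sdp` on the other would reconstruct the original string once every index has been seen. Rendering the frames as QR codes and scanning them is left out of the sketch.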