A technical walkthrough of Alibaba's Qwen3-Omni-30B-A3B, an omni-modal model that activates only 3B of its 30B parameters and responds with speech to text, image, audio, and video inputs. The article covers the Thinker–Talker architecture, benchmarks, and the Qwen3 MoE family as a whole.
Technical overview of Alibaba's Qwen3-Coder-Next, an ultra-efficient MoE with 80B parameters but only 3B activated, which runs even on a single RTX 4090 and brings 70%+ SWE-Bench performance to local use.
An overview of Kimi K2.5’s technical highlights from Moonshot AI: a 1T-parameter MoE architecture, the MoonViT vision encoder, Agent Swarm (PARL), benchmark results, and more.
A plan to build an internal help desk RAG system using a Mac mini M4 Pro and Dify. Highlights what's new in Dify as of 2025 and shares tips for running local LLMs.
Introducing a Mac mini M4 Pro to build an in-house RAG system, plus a plan to set up a LoRA training environment during downtime while the system specs are finalized.