ByteDance’s Seedance 2.0 has been released on Dreamina. As someone who has been running Wan 2.x locally in ComfyUI, I examine how the "ease of use" of a cloud-based video generation service differs from a local workflow.
A technical overview of Qwen3‑TTS from Alibaba’s Qwen team: one‑line pip install, 3‑second voice cloning, natural‑language voice design, and support for 10 languages including Japanese. Apache 2.0 licensed.
A technical walkthrough of Alibaba's Qwen3-Omni-30B-A3B, an omni-modal model that activates only 3B of its 30B parameters and responds with speech to text/image/audio/video inputs. The article organizes the Thinker–Talker architecture, benchmarks, and the overall Qwen3 MoE family.
Technical overview of Alibaba’s Qwen3-Coder-Next, an ultra-efficient MoE that activates only 3B of its 80B parameters and runs even on a single RTX 4090, bringing 70%+ SWE-Bench performance to local use.
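The "only 3B of 80B activated" claim refers to sparse expert routing: a router picks the top-k experts per token, so only a small fraction of the weights run in each forward pass. A minimal, illustrative top-k MoE layer (toy code sketching the general technique, not the model's actual implementation) might look like:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Sparse MoE layer: route a token to its top-k experts only.

    x       : (d,) token hidden state
    gate_w  : (n_experts, d) router weights
    experts : list of callables, each (d,) -> (d,)
    Only k experts execute per token, which is why active
    parameters stay a small fraction of the total.
    """
    logits = gate_w @ x                    # router scores per expert
    topk = np.argsort(logits)[-k:]         # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()               # softmax over the selected experts
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
gate_w = rng.normal(size=(n_experts, d))
# Toy experts: fixed random linear maps (real experts are MLPs).
mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, m=m: m @ x for m in mats]

y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
print(y.shape)  # (8,)
```

With k=2 of 4 experts, half the expert weights are touched per token; scale the same idea up and you get the 3B-active-of-80B ratio described above.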
A comparison between the 'UI UX Pro Max Skill' for AI coding assistants such as Claude Code and the UI/UX improvement articles I wrote earlier. Which works better: automatic inference or explicit human intent?
Overview of PersonaPlex‑7B‑v1 released by NVIDIA in January 2026. A Moshi‑based voice dialog model that enables full‑duplex conversation and persona control.
Overview of Black Forest Labs' FLUX.2 Klein 9B model and how it performs on M1/M2/M3/M4 Macs. Covers the key factors behind the CUDA vs MPS performance gap, including memory bandwidth and FP8 quantization.
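The memory-bandwidth factor mentioned above can be made concrete with a back-of-envelope bound: if a forward pass is bandwidth-bound, it must stream all weights from memory once, so passes-per-second is at most bandwidth divided by model size in bytes. The numbers below are illustrative assumptions, not measurements from the article:

```python
def passes_per_s(bandwidth_gbs, params_b, bytes_per_param):
    """Rough upper bound on forward passes per second for a
    memory-bandwidth-bound model: each pass streams all weights once."""
    return bandwidth_gbs * 1e9 / (params_b * 1e9 * bytes_per_param)

# Assumed figures for illustration: ~400 GB/s unified memory,
# a 9B-parameter model, FP8 weights (1 byte per parameter).
print(round(passes_per_s(400, 9, 1), 1))  # ~44.4 passes/s upper bound
```

Halving bytes-per-parameter (e.g. FP16 to FP8) doubles this ceiling, which is why quantization support matters as much as raw bandwidth in the CUDA-vs-MPS comparison.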
This article organizes the major video-generation AI updates announced in January 2026 and examines whether i2v (image→video) is practically usable, including models that run locally.
An overview of Kimi K2.5’s technical highlights from Moonshot AI: a 1T-parameter MoE architecture, the MoonViT vision encoder, Agent Swarm (PARL), benchmark results, and more.