#AI

184 articles

Tech Feb 8, 2026 5 min

LFM2.5 - a hybrid architecture that's neither Transformer nor Mamba

Liquid AI's LFM2.5 combines short-range convolutions with attention, targeting edge devices without relying on SSMs. This article covers the architecture, benchmarks, and community use cases.

AI LLM Edge AI Architecture

Tech Feb 8, 2026 updated 6 min

Seedance 2.0 is out—comparing the "ease" of local vs. cloud video generation

ByteDance’s Seedance 2.0 has been released on Dreamina. As someone who has been running Wan 2.x and ComfyUI locally, I look at how "ease" differs between local and cloud-based video generation.

AI Video Generation Seedance

Tech Feb 7, 2026 6 min

Qwen3-TTS — Open-source speech synthesis with a single pip install

A technical overview of Qwen3‑TTS from Alibaba’s Qwen team: one‑line pip install, 3‑second voice cloning, natural‑language voice design, and support for 10 languages including Japanese. Apache 2.0 licensed.

AI TTS Speech Synthesis Open Source LLM

Tech Feb 6, 2026 6 min

UltraFlux-v1 - a native 4K image generation model based on FLUX.1-dev

Technical details of UltraFlux-v1, a model that pushes FLUX.1-dev into native 4K generation. It covers the differences from Z-Image and FLUX.2 Klein, its RoPE extensions and VAE improvements, and practical caveats.

AI Image Generation FLUX 4K

Tech Feb 6, 2026 6 min

Qwen3-Omni: An omni-modal MoE that unifies text, image, speech, and video with 3B active parameters

A technical walkthrough of Alibaba's Qwen3-Omni-30B-A3B, an omni-modal model that activates only 3B of its 30B parameters and responds with speech to text, image, audio, and video inputs. The article covers the Thinker–Talker architecture, benchmarks, and the wider Qwen3 MoE family.

AI LLM Open Source Multimodal Voice AI

Tech Feb 5, 2026 4 min

UI-TARS-1.5-7B: a vision AI agent that reached SOTA in GUI grounding

A technical look at ByteDance's UI-TARS-1.5-7B, which beats OpenAI CUA and Claude 3.7 by a wide margin at identifying GUI elements from screenshots and can run locally through a desktop app.

AI LLM Agent Open Source

Tech Feb 4, 2026 5 min

Qwen3-Coder-Next: A Local Coding Agent with 3B Active Parameters

A technical overview of Alibaba’s Qwen3-Coder-Next, an ultra-efficient MoE with 80B total parameters and only 3B active that runs even on a single RTX 4090. It brings 70%+ SWE-Bench performance to local use.

AI LLM Open Source Agent

Tech Feb 4, 2026 3 min

ACE-Step 1.5: AI Music Generation Gets a Full Architecture Overhaul

ACE-Step 1.5 has been released with a hybrid LM+DiT architecture, support for 50+ languages, and a 4GB VRAM minimum, a major evolution from V1.0.

AI Audio Generation Local LLM

Tech Feb 4, 2026 updated 3 min

InfiniteTalk: Audio-Driven Lip Sync Built on Wan 2.1

Published as an official ComfyUI workflow, InfiniteTalk is a lip-sync model specialized in generating mouth animation from audio files. This article covers how it differs from MOVA and Vidu Q3 and what models it requires.

AI Video Generation ComfyUI Lip Sync

Tech Feb 4, 2026 5 min

UI UX Pro Max Skill: Comparing an AI UI-generation skill with my earlier articles

A comparison between the 'UI UX Pro Max Skill' for AI coding assistants such as Claude Code and the UI/UX improvement articles I wrote earlier. Which works better: automatic inference or explicit human intent?

Claude Code AI UI UX

Tech Feb 4, 2026 3 min

AnimeGamer: AI That Understands Game State to Generate Anime Videos

AnimeGamer, developed by Tencent ARC Lab, generates anime-style videos while tracking game-state transitions. It takes a fundamentally different approach from general-purpose video generation models.

AI Video Generation Game Anime

Tech Feb 3, 2026 4 min

MOVA: the first open-source model that generates video and audio together

MOVA-720p from the OpenMOSS team is an open-source model that generates video and audio in a single pass. This article covers how it differs from closed models like Vidu Q3 and what its architecture looks like.

AI Video Generation Audio Generation Open Source