A technical overview of Qwen3‑TTS from Alibaba’s Qwen team: one‑line pip install, 3‑second voice cloning, natural‑language voice design, and support for 10 languages including Japanese. Apache 2.0 licensed.
Technical details of UltraFlux-v1, a model that pushes FLUX.1-dev into native 4K generation. It covers the differences from Z-Image and FLUX.2 Klein, its RoPE extensions and VAE improvements, and practical caveats.
A technical walkthrough of Alibaba's Qwen3-Omni-30B-A3B, an omni-modal model that activates only 3B of its 30B parameters and responds with speech to text, image, audio, and video inputs. The article covers the Thinker–Talker architecture, benchmarks, and the broader Qwen3 MoE family.
Anima is an anime-focused image generation model co-developed by CircleStone Labs and Comfy Org. Built on a new architecture, the preview release has drawn attention, but how does it actually perform? A look at its strengths, weaknesses, and how it compares with existing SDXL-based models.
A technical look at ByteDance's UI-TARS-1.5-7B, which beats OpenAI CUA and Claude 3.7 by a wide margin at identifying GUI elements from screenshots, and can run locally with a desktop app.
A technical overview of Alibaba's Qwen3-Coder-Next, an ultra-efficient MoE with 80B parameters but only 3B activated, which runs even on a single RTX 4090 and brings 70%+ SWE-Bench performance to local use.
InfiniteTalk, published as an official ComfyUI workflow, is a lip-sync model specialized in generating mouth animation from audio files. This article covers how it differs from MOVA and Vidu Q3 and which models it requires.
A comparison between the 'UI UX Pro Max Skill' for AI coding assistants such as Claude Code and the UI/UX improvement articles I wrote earlier. Which works better: automatic inference or explicit human intent?
AnimeGamer, developed by Tencent ARC Lab, generates anime-style videos while tracking game-state transitions. It takes a fundamentally different approach from general-purpose video generation models.
A paper explains that two seemingly mysterious Transformer behaviors, heavy attention on specific tokens and unusually large activations in specific dimensions, are actually manifestations of the same mechanism.
A comparison of the Nunchaku quantized build, VNCCS Pose Studio, and the official 2511 model improvements to find better ways to control pose and camera angle.