
Z-Image-Distilled - a Z-Image derivative that keeps diversity while speeding up inference

Ikesan

what Z-Image-Distilled is

It is a derivative model based on Z-Image that speeds up inference through distillation.

This is a “pure” distilled model. Unlike Z-Image-Turbo, it does not include Turbo weights or style.

basic specs

| item | Z-Image (original) | Z-Image-Distilled |
|---|---|---|
| recommended steps | 28-50 | 10-20 |
| CFG | 3.0-7.0 | 1.0-2.5 |
| diversity | high | medium, but better than Turbo |
| LoRA compatibility | high | high |
| license | Apache-2.0 | Apache-2.0 |

Because good results show up in 10 to 20 steps, it can generate in less than half the time of the original.

recommended settings

  • CFG: 1.0-1.8 (higher values improve prompt adherence)
  • Steps: 10 (preview), 15-20 (stable quality)
  • Sampler: Euler, simple, res_m
  • LoRA weight: 0.6-1.0
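Since sampling time scales roughly linearly with step count, the step budgets above give a back-of-the-envelope speedup estimate. A minimal sketch (per-step cost assumed equal between the two models, which is an assumption, not a measurement):

```python
# Rough speedup estimate: diffusion sampling time is dominated by
# the denoising loop, so the ratio of step counts approximates the
# ratio of sampling times (assuming equal per-step cost).

def speedup(original_steps: int, distilled_steps: int) -> float:
    """Ratio of original to distilled sampling time."""
    return original_steps / distilled_steps

# 28-50 steps (original) vs 10-20 steps (Distilled)
print(f"{speedup(28, 14):.1f}x")  # 2.0x at the conservative end
print(f"{speedup(50, 10):.1f}x")  # 5.0x at the aggressive end
```

Even the conservative pairing lands at roughly half the sampling time, which matches the claim above.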

distilled model comparison: Schnell vs Distilled

I wrote in the FLUX.2 Klein article that distillation often reduces diversity. FLUX.1 Schnell is the classic example.

| model | approach | diversity | speed |
|---|---|---|---|
| FLUX.2 Klein | parameter reduction without distillation | high | somewhat slow |
| FLUX.1 Schnell | distilled for speed | low | fast |
| Z-Image-Distilled | distilled for speed | medium | fast |

The claim is that Z-Image-Distilled keeps diversity even after distillation. In practice, its good LoRA training compatibility suggests it still has plenty of flexibility as a base model.

It is a little slower than Turbo, but if you care about diversity and LoRA compatibility, it is a solid option.

can it run on an M1 Max 64GB machine?

Short answer: yes. Easily.

requirements

  • Z-Image Turbo (bf16): 12-16GB VRAM
  • Z-Image-Distilled is assumed to be similar

on an M1 Max 64GB

| item | status |
|---|---|
| unified memory | 64GB |
| GPU available | about 48GB (default ~75% allocation limit) |
| model requirement | 12-16GB |
| headroom | plenty |

It is much lighter than FLUX.2 Klein, which needs 29GB.
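The headroom arithmetic is simple enough to sketch. The ~75% figure is macOS's default cap on how much unified memory the GPU may claim; the function below just applies that cap and subtracts the model size:

```python
# Back-of-the-envelope headroom check for Apple Silicon unified memory.
# The gpu_fraction default reflects the ~75% GPU allocation limit
# mentioned above.

def gpu_headroom_gb(unified_gb: float, model_gb: float,
                    gpu_fraction: float = 0.75) -> float:
    """GPU-accessible unified memory left after loading the model."""
    return unified_gb * gpu_fraction - model_gb

# M1 Max 64GB against the high end of the 12-16GB estimate
print(gpu_headroom_gb(64, 16))  # 32.0 GB to spare
```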

if you still need to reduce memory

If memory is still tight:

  • GGUF quantization can run on about 6GB VRAM
  • stable-diffusion.cpp, a pure C++ implementation, can run on about 4GB VRAM

On an M1 Max 64GB machine, the full model should be fine without quantization.

known limitations

weaker text rendering

Distillation hurts text rendering quality, especially for small text. It is not a good fit if you want to generate logos or signs.

color cast

Some samplers can make the output skew bluish. Changing the sampler or adjusting the prompt helps.

using it in ComfyUI

It is ComfyUI-compatible. The layer prefix is `model.diffusion_model.`.
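If a checkpoint loads incorrectly, the key prefix is the first thing to check, since ComfyUI uses it to route the tensors. A quick sanity check over state-dict keys (the example keys are hypothetical, for illustration only):

```python
# Sanity check: confirm a checkpoint's tensors carry the
# model.diffusion_model. prefix that ComfyUI expects.

PREFIX = "model.diffusion_model."

def has_comfy_prefix(state_dict_keys) -> bool:
    """True if every key uses the ComfyUI diffusion-model prefix."""
    return all(k.startswith(PREFIX) for k in state_dict_keys)

# hypothetical keys, e.g. read from a safetensors file header
keys = [
    "model.diffusion_model.input_blocks.0.weight",
    "model.diffusion_model.output_blocks.0.bias",
]
print(has_comfy_prefix(keys))  # True
```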

It supports both Chinese and English prompts.