Alibaba ATH's video generation model HappyHorse-1.0: API test status on Model Studio, open-weights availability, the reality of local inference on a Mac, and which GPU to pick on RunPod.
Wan 2.2 image-to-video in ComfyUI on Windows with an RTX 4060 (8GB VRAM). The 5B fp8 model produced rough output in all three attempts; the 14B Rapid distilled model with --lowvram offloading took 111 seconds per 2-second clip. Covers the working setup and what to avoid.
Local video generation test on an M1 Max with 64GB: FP8 fails on Metal, GGUF gets Wan 2.2 running at 82 minutes per 2-second clip, and LTX-2 produces NaNs or unusable KSampler output on MPS.