Tested local Wan video gen on a Radeon 8060S (Strix Halo, 48GB UMA, Windows). ZLUDA can't run stock PyTorch; AMD's TheRock gfx1151 wheel gives native ROCm. FastWan 1.3B ran in 4 min and Wan 14B I2V in 13.6 min; VAE decode and 16GB-RAM segfaults are the real limits.
Tested FramePack F1 on an RTX 4060 Laptop (8GB VRAM, 32GB RAM): VRAM peaked at 5.75GB, but the 26GB model overflowed RAM into the pagefile and a 5s clip took 56 min. The real bottleneck for local video gen on a laptop is RAM, not VRAM.
Alibaba ATH's video generation model HappyHorse-1.0: API test status on Model Studio, open-weights availability, the reality of local inference on a Mac, and which GPU to pick on RunPod.
Wan 2.2 image-to-video on Windows + RTX 4060 8GB VRAM in ComfyUI. The 5B fp8 model failed three times; the 14B Rapid distilled model with --lowvram offloading produced a 2-second clip in 111 seconds, versus 82 minutes on an M1 Max 64GB. Working setup and what to avoid.
Local video generation test on M1 Max 64GB MacBook Pro. FP8 models don't work on Metal — switching to GGUF got Wan 2.2 running at 82 minutes for a 2-second clip. LTX-2 produced NaN or unusable KSampler output under MPS. Specs, failed configs, and the working setup.