
Hardware requirements to run Qwen-Image-Edit-2511 locally

I was curious about the Multi-Angle LoRA for Qwen-Image-Edit-2511, so I researched the specs needed to run it locally.

What is Qwen-Image-Edit-2511?

An image editing model released by Qwen (Tongyi Qianwen). It supports text-guided image editing, inpainting, outpainting, and more.

What I’m particularly interested in here is fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA. It’s a LoRA that can re-generate images by specifying 96 camera angles (4 elevations × 8 azimuths × 3 distances). It’s trained on 3,000+ Gaussian Splatting rendered images.
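The 4 × 8 × 3 grid multiplies out to exactly 96 combinations. A quick sketch enumerating them — note the angle labels below are illustrative placeholders, not the LoRA's actual prompt vocabulary:

```python
from itertools import product

# Illustrative labels only -- the LoRA's real angle tokens may differ.
elevations = ["low", "eye-level", "high", "overhead"]        # 4 elevations
azimuths = [f"{deg}°" for deg in range(0, 360, 45)]          # 8 azimuths (45° apart)
distances = ["close-up", "medium", "wide"]                   # 3 distances

# Cartesian product of the three axes gives every camera angle.
angles = [f"{e} / {a} / {d}" for e, a, d in product(elevations, azimuths, distances)]
print(len(angles))  # 4 * 8 * 3 = 96 camera angles
```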

Model overview

| Item | Value |
| --- | --- |
| Parameter count | 20B |
| Base model size | 57.7GB |
| LoRA size | Several hundred MB |

Requirements by quantization level

Quantization saves VRAM at the cost of some quality.

| Quantization | Model size | Required VRAM | Quality |
| --- | --- | --- | --- |
| FP32 | 57.7GB | 40GB+ | Highest |
| BF16 | ~30GB | 24GB+ | High |
| FP8 | ~15GB | 6GB+ | FP16-equivalent |
| NF4 | ~10GB | 16–20GB | Good |
| GGUF Q4_K_M | 13.1GB | CPU runnable | Practical |

FP8 quantization maintains roughly FP16-level quality while halving VRAM, making it a good balance.
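The halving pattern falls out of a simple rule of thumb: weight storage ≈ parameters × bits per weight. Published checkpoint sizes deviate somewhat (they bundle text encoders, VAEs, and mixed-precision layers), but the ratio between levels holds, which is why FP8 is roughly half of BF16:

```python
def weight_size_gb(params: float, bits_per_weight: float) -> float:
    """Naive weight-storage estimate: parameters x bits, converted to GB."""
    return params * bits_per_weight / 8 / 1e9

PARAMS = 20e9  # Qwen-Image-Edit-2511 is a 20B-parameter model

# Each halving of bits per weight halves the naive estimate.
for name, bits in [("FP32", 32), ("BF16", 16), ("FP8", 8), ("NF4", 4)]:
    print(f"{name}: ~{weight_size_gb(PARAMS, bits):.0f} GB")
```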

Windows

Minimum build (FP8 quantization)

| Part | Spec |
| --- | --- |
| GPU | RTX 3060 12GB or higher |
| RAM | 32GB |
| Storage | SSD with 100GB free |
| OS | Windows 10/11 64-bit |

Works with the FP8 build + Lightning LoRA (4-step inference). Slow, but it runs.

Recommended build

| Part | Spec |
| --- | --- |
| GPU | RTX 4070 Ti 16GB / RTX 4070 Ti Super 16GB |
| RAM | 64GB |
| Storage | NVMe SSD with 200GB+ free |

Practical speed with NF4 or FP8.

Comfort build

| Part | Spec |
| --- | --- |
| GPU | RTX 4090 24GB |
| RAM | 64GB |
| Storage | NVMe SSD with 200GB+ free |

Comfortable on FP8. BF16 is also in reach.

High-end build

| Part | Spec |
| --- | --- |
| GPU | RTX 4090 24GB × 2 / A100 40GB |
| RAM | 128GB |
| Storage | NVMe SSD with 500GB+ free |

Runs at full precision (FP32/BF16). Suited for production workloads.

Mac

Apple Silicon uses unified memory, so total memory capacity is what matters rather than a discrete GPU. Intel Macs are not recommended.

Minimum build

| Part | Spec |
| --- | --- |
| Chip | M1 Pro / M2 Pro |
| Memory | 32GB |
| Storage | 256GB+ free |

GGUF-quantized builds only. It runs, but slowly.

Recommended build

| Part | Spec |
| --- | --- |
| Chip | M3 Max / M4 Max |
| Memory | 64GB |
| Storage | 512GB+ free |

GGUF builds reach practical speed.

Comfort build

| Part | Spec |
| --- | --- |
| Chip | M3 Ultra / M4 Ultra |
| Memory | 128GB+ |
| Storage | 1TB+ |

Plenty of headroom, even for less aggressive (higher-precision) quantization levels.

Rough costs (as of January 2026)

DRAM prices have surged since late 2024: DDR5 64GB (32GB×2) exceeds ¥100,000, and 128GB (64GB×2) runs ¥170,000–¥250,000. GPUs remain scarce due to AI demand, with the RTX 4090 around ¥450,000. Some forecasts say prices may not normalize until 2027–2028.

Windows

| Build | GPU | Memory | Approx. total PC cost |
| --- | --- | --- | --- |
| Minimum | RTX 3060 12GB | 32GB | ¥200,000–¥250,000 |
| Recommended | RTX 4070 Ti Super 16GB | 64GB | ¥500,000–¥600,000 |
| Comfort | RTX 4090 24GB | 64GB | ¥700,000–¥800,000 |

Reference GPU prices:

  • RTX 3060 12GB: ¥40,000–¥50,000
  • RTX 4070 Ti Super 16GB: from ¥240,000
  • RTX 4090 24GB: from ¥450,000

Mac

| Build | Model | Approx. price |
| --- | --- | --- |
| Minimum | MacBook Pro M3 Pro 36GB | ¥400,000+ |
| Recommended | MacBook Pro M3 Max 64GB | ¥550,000+ |
| Comfort | Mac Studio M3 Ultra 128GB | ¥800,000+ |

Runtime environment

Required software

  • Python 3.10+
  • ComfyUI (recommended) or Diffusers
  • CUDA Toolkit (when using an NVIDIA GPU on Windows)

Speed-up options

  • Lightning LoRA: reduce inference steps 40→4
  • SageAttention: faster attention computation with lower memory use
  • FP8 mixed precision: save VRAM while keeping quality
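The Lightning LoRA's impact can be ballparked from step count alone, since sampling time scales roughly linearly with steps. The per-step time below is an assumed placeholder (chosen to match the ~5-minute 40-step run used later in this article); real timings vary with GPU and resolution:

```python
def inference_time_s(steps: int, sec_per_step: float) -> float:
    """Rough end-to-end time: sampling steps x time per step (ignores encode/decode overhead)."""
    return steps * sec_per_step

SEC_PER_STEP = 7.5  # assumption: ~7.5 s/step, i.e. ~5 min for a 40-step run

baseline = inference_time_s(40, SEC_PER_STEP)   # standard 40-step sampling
lightning = inference_time_s(4, SEC_PER_STEP)   # Lightning LoRA, 4-step sampling
print(baseline / 60, lightning / 60)  # 5.0 min vs 0.5 min -> 10x fewer steps
```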

Cloud GPU as an option

If you’re about to spend ¥700,000 building a local box, cloud GPUs are worth considering.

RunPod pricing (January 2026)

| GPU | Community Cloud | Secure Cloud | Discount (Spot) |
| --- | --- | --- | --- |
| RTX 4090 | $0.34/hour (¥51) | $0.59/hour (¥89) | ~50% off |
| RTX 4070 | $0.08/hour (¥12) | - | ~50% off |
| RTX 3090 | $0.22/hour (¥33) | $0.46/hour (¥69) | ~50% off |

Community Cloud is the cheapest tier. Spot instances can be interrupted with 5 seconds' notice. Ready-made ComfyUI templates are available.

Vast.ai pricing (January 2026, marketplace)

| GPU | Lowest | Average | JPY equivalent |
| --- | --- | --- | --- |
| RTX 4090 | $0.24/hour | ~$0.30/hour | ¥36–45 |
| RTX 4070 | $0.08/hour | ~$0.12/hour | ¥12–18 |
| RTX 3090 | $0.13/hour | ~$0.15/hour | ¥19.5–22.5 |

It’s a peer-to-peer marketplace, so prices fluctuate. Linux + Docker only. Compared with AWS/GCP it’s roughly 5–6× cheaper.

Cost comparison (per hour)

| GPU | Cloud (lowest) | Local purchase price | Break-even |
| --- | --- | --- | --- |
| RTX 4090 | ¥36/hour | ¥450,000 (GPU) | 12,500 hours (3 hours/day → ~11 years) |
| RTX 4070 | ¥12/hour | ¥150,000 (GPU) | 12,500 hours (3 hours/day → ~11 years) |
| RTX 3090 | ¥19.5/hour | ¥100,000 (used) | 5,100 hours (3 hours/day → ~4.5 years) |
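The break-even column is straightforward division; a sketch of the arithmetic:

```python
def break_even_hours(purchase_jpy: int, cloud_jpy_per_hour: float) -> float:
    """Hours of cloud rental that would cost as much as buying the GPU outright."""
    return purchase_jpy / cloud_jpy_per_hour

def break_even_years(hours: float, hours_per_day: float = 3) -> float:
    """Calendar years to reach break-even at a given daily usage."""
    return hours / (hours_per_day * 365)

# RTX 4090: ¥450,000 purchase vs ¥36/hour in the cloud
h = break_even_hours(450_000, 36)
print(round(h), round(break_even_years(h), 1))  # 12500 hours, 11.4 years
```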

Worked example: one Qwen-Image-Edit inference

If one Qwen-Image-Edit-2511 inference (40 steps) takes about 5 minutes:

  • RTX 4090 in the cloud: $0.30/hour × (5/60 hour) = $0.025 ≈ ¥3.75 per run (at ¥150/$)
  • Local RTX 4090: ¥450,000 ÷ 12,500 break-even hours ≈ ¥36/hour × (5/60 hour) = ¥3 per run (hardware cost only)

Per-run hardware costs end up comparable, but the cloud requires no upfront investment, so for light users it wins by a wide margin.
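The per-run arithmetic above can be sketched in a few lines, using the ~¥150/$ exchange rate implied by the cloud pricing tables:

```python
FX = 150            # assumed JPY per USD, consistent with the price tables above
RUN_HOURS = 5 / 60  # one 40-step inference, ~5 minutes

# Cloud: pay the hourly rate only while running.
cloud_per_run = 0.30 * FX * RUN_HOURS           # $0.30/h RTX 4090 -> ¥3.75 per run

# Local: amortize the GPU purchase over the 12,500 break-even hours.
local_per_run = 450_000 / 12_500 * RUN_HOURS    # ¥36/h amortized -> ¥3.00 per run

print(cloud_per_run, local_per_run)
```

The amortized figures are close; the real difference is the ¥450,000 paid up front in the local case.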

Summary

As of January 2026, soaring memory and GPU prices have driven up the cost of building a local AI box.

  • Minimum build (RTX 3060 + 32GB): ¥200,000–¥250,000
  • Recommended build (RTX 4070 Ti Super + 64GB): ¥500,000–¥600,000
  • Comfort build (RTX 4090 + 64GB): ¥700,000–¥800,000