Hardware requirements to run Qwen-Image-Edit-2511 locally
I was curious about the Multi-Angle LoRA for Qwen-Image-Edit-2511, so I researched the specs needed to run it locally.
What is Qwen-Image-Edit-2511?
An image editing model released by Qwen (Tongyi Qianwen). It supports text-guided image editing, inpainting, outpainting, and more.
What I’m particularly interested in here is fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA: a LoRA that regenerates an input image from any of 96 specified camera angles (4 elevations × 8 azimuths × 3 distances). It was trained on 3,000+ Gaussian Splatting renders.
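The 96-angle count is just the Cartesian product of the three parameters. A quick sanity check (the label strings below are illustrative placeholders, not the LoRA's actual trigger phrases or trained values):

```python
from itertools import product

# Illustrative grids only; the exact elevation/azimuth/distance values
# the LoRA was trained on are not listed in this article.
elevations = ["low", "eye-level", "high", "overhead"]       # 4
azimuths = [f"{deg} deg" for deg in range(0, 360, 45)]      # 8
distances = ["close-up", "medium", "wide"]                  # 3

angles = list(product(elevations, azimuths, distances))
print(len(angles))  # → 96
```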
Model overview
| Item | Value |
|---|---|
| Parameter count | 20B |
| Base model size | 57.7GB |
| LoRA size | Several hundred MB |
Requirements by quantization level
Quantization saves VRAM at the cost of some quality.
| Quantization | Model size | Required VRAM | Quality |
|---|---|---|---|
| FP32 | 57.7GB | 40GB+ | Highest |
| BF16 | ~30GB | 24GB+ | High |
| FP8 | ~15GB | 6GB+ | FP16-equivalent |
| NF4 | ~10GB | 16–20GB | Good |
| GGUF Q4_K_M | 13.1GB | CPU runnable | Practical |
FP8 quantization maintains roughly FP16-level quality while halving VRAM, making it a good balance.
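The weight-only footprint follows directly from parameter count × bytes per parameter. A back-of-envelope sketch for a 20B-parameter model (this ignores the text encoder, VAE, activations, and container overhead, and real quantized builds keep some layers in higher precision, so the numbers won't exactly match the file sizes in the table):

```python
# Approximate bytes stored per parameter for each format.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "fp8": 1.0,
    "nf4": 0.5,  # 4-bit NormalFloat
}

def weight_footprint_gb(n_params: float, dtype: str) -> float:
    """Weights-only size in GB (1 GB = 1e9 bytes); excludes runtime overhead."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_footprint_gb(20e9, dtype):.0f} GB")
```

Each halving of precision halves the weight footprint, which is why FP8 roughly halves VRAM relative to BF16/FP16.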
Recommended specs for Windows
Minimum build (FP8 quantization)
| Part | Spec |
|---|---|
| GPU | RTX 3060 12GB or higher |
| RAM | 32GB |
| Storage | SSD with 100GB free |
| OS | Windows 10/11 64-bit |
Works with the FP8 build + Lightning LoRA (4-step inference). Slow, but it runs.
Recommended build
| Part | Spec |
|---|---|
| GPU | RTX 4070 Ti 16GB / RTX 4070 Ti Super 16GB |
| RAM | 64GB |
| Storage | NVMe SSD with 200GB+ free |
Practical speed with NF4 or FP8.
Comfort build
| Part | Spec |
|---|---|
| GPU | RTX 4090 24GB |
| RAM | 64GB |
| Storage | NVMe SSD with 200GB+ free |
Comfortable on FP8. BF16 is also in reach.
High-end build
| Part | Spec |
|---|---|
| GPU | RTX 4090 24GB × 2 / A100 40GB |
| RAM | 128GB |
| Storage | NVMe SSD with 500GB+ free |
Runs at full precision (FP32/BF16). Suited for production workloads.
Recommended specs for Mac
Apple Silicon shares unified memory between CPU and GPU, so total memory capacity is the spec that matters. Intel Macs are not recommended.
Minimum build
| Part | Spec |
|---|---|
| Chip | M1 Pro / M2 Pro |
| Memory | 32GB |
| Storage | 256GB+ free |
GGUF-quantized builds only. It runs, but slowly.
Recommended build
| Part | Spec |
|---|---|
| Chip | M3 Max / M4 Max |
| Memory | 64GB |
| Storage | 512GB+ free |
GGUF builds reach practical speed.
Comfort build
| Part | Spec |
|---|---|
| Chip | M3 Ultra / M4 Ultra |
| Memory | 128GB+ |
| Storage | 1TB+ |
Plenty of headroom even with larger quantization.
Rough costs (as of January 2026)
DRAM prices have surged since late 2024: DDR5 64GB (32GB×2) now exceeds ¥100,000, and 128GB (64GB×2) runs ¥170,000–¥250,000. GPUs remain scarce due to AI demand, with the RTX 4090 around ¥450,000. Some forecasts say prices may not normalize until 2027–2028.
Windows
| Build | GPU | Memory | Approx. total PC cost |
|---|---|---|---|
| Minimum | RTX 3060 12GB | 32GB | ¥200,000–¥250,000 |
| Recommended | RTX 4070 Ti Super 16GB | 64GB | ¥500,000–¥600,000 |
| Comfort | RTX 4090 24GB | 64GB | ¥700,000–¥800,000 |
Reference GPU prices:
- RTX 3060 12GB: ¥40,000–¥50,000
- RTX 4070 Ti Super 16GB: from ¥240,000
- RTX 4090 24GB: from ¥450,000
Mac
| Build | Model | Approx. |
|---|---|---|
| Minimum | MacBook Pro M3 Pro 36GB | ¥400,000+ |
| Recommended | MacBook Pro M3 Max 64GB | ¥550,000+ |
| Comfort | Mac Studio M3 Ultra 128GB | ¥800,000+ |
Runtime environment
Required software
- Python 3.10+
- ComfyUI (recommended) or Diffusers
- CUDA Toolkit (when using NVIDIA GPU on Windows)
Speed-up options
- Lightning LoRA: reduce inference steps 40→4
- SageAttention: lower memory usage
- FP8 mixed precision: save VRAM while keeping quality
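In ComfyUI these options map to workflow nodes; with Diffusers, loading might look like the sketch below. The repo IDs, pipeline class choice, and dtype are assumptions based on names mentioned in this article, not verified Hub identifiers — check the actual model cards before use:

```python
def build_pipeline():
    """Sketch: load Qwen-Image-Edit-2511 with the Multi-Angle LoRA via Diffusers.

    Requires the `diffusers` and `torch` packages and an NVIDIA GPU.
    Repo IDs below are taken from this article and may differ from the
    actual Hugging Face Hub names.
    """
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "Qwen/Qwen-Image-Edit-2511",   # assumed base-model repo ID
        torch_dtype=torch.bfloat16,    # swap in an FP8/NF4 build on smaller GPUs
    )
    # Multi-Angle LoRA named in this article.
    pipe.load_lora_weights("fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA")
    # Offload idle submodules to system RAM to reduce peak VRAM (slower).
    pipe.enable_model_cpu_offload()
    # With a Lightning LoRA loaded, num_inference_steps=4 instead of ~40.
    return pipe
```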
Cloud GPU as an option
If you’re about to spend ¥700,000 building a local box, cloud GPUs are worth considering.
RunPod pricing (January 2026)
| GPU | Community Cloud | Secure Cloud | Discount (Spot) |
|---|---|---|---|
| RTX 4090 | $0.34/hour (¥51) | $0.59/hour (¥89) | ~50% off |
| RTX 4070 | $0.08/hour (¥12) | - | ~50% off |
| RTX 3090 | $0.22/hour (¥33) | $0.46/hour (¥69) | ~50% off |
Community Cloud is the cheapest tier. Spot instances can be interrupted with 5 seconds’ notice. Ready-made ComfyUI templates are available.
Vast.ai pricing (January 2026, marketplace)
| GPU | Lowest | Average | JPY equivalent |
|---|---|---|---|
| RTX 4090 | $0.24/hour | ~$0.30/hour | ¥36–45 |
| RTX 4070 | $0.08/hour | ~$0.12/hour | ¥12–18 |
| RTX 3090 | $0.13/hour | ~$0.15/hour | ¥19.5–22.5 |
Because it’s a peer-to-peer marketplace, prices fluctuate. Instances are Linux Docker containers only. It’s roughly 5–6× cheaper than AWS/GCP.
Cost comparison (per hour)
| GPU | Cloud (lowest) | Local purchase price | Break-even |
|---|---|---|---|
| RTX 4090 | ¥36/hour | ¥450,000 (GPU) | 12,500 hours (3 hours/day → 11 years) |
| RTX 4070 | ¥12/hour | ¥150,000 (GPU) | 12,500 hours (3 hours/day → 11 years) |
| RTX 3090 | ¥19.5/hour | ¥100,000 (used) | 5,100 hours (3 hours/day → 4.5 years) |
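The break-even column above is just GPU price divided by the cloud hourly rate; a quick sketch (electricity, the rest of the PC, and resale value are all ignored):

```python
def break_even_hours(gpu_price_jpy: float, cloud_rate_jpy_per_hour: float) -> float:
    """Hours of cloud use that would cost as much as buying the GPU outright."""
    return gpu_price_jpy / cloud_rate_jpy_per_hour

def break_even_years(hours: float, hours_per_day: float = 3.0) -> float:
    """Convert break-even hours to years at a given daily usage."""
    return hours / hours_per_day / 365

h = break_even_hours(450_000, 36)        # RTX 4090: ¥450,000 vs ¥36/hour
print(h, round(break_even_years(h), 1))  # → 12500.0 11.4
```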
Worked example: one Qwen-Image-Edit inference
If one Qwen-Image-Edit-2511 inference (40 steps) takes about 5 minutes:
- RTX 4090 in the cloud: ~$0.30/hour (≈¥45) × (5/60 hour) ≈ ¥3.75 per run
- Local RTX 4090: ¥450,000 GPU ÷ 12,500 break-even hours = ¥36/hour × (5/60 hour) = ¥3 per run (GPU cost only; electricity and the rest of the PC excluded)
Per-run costs are comparable, but the cloud requires no ¥450,000 upfront purchase, so for light users it wins by a wide margin.
Summary
As of January 2026, soaring memory and GPU prices have driven up the cost of building a local AI box.
- Minimum build (RTX 3060 + 32GB): ¥200,000–¥250,000
- Recommended build (RTX 4070 Ti Super + 64GB): ¥500,000–¥600,000
- Comfort build (RTX 4090 + 64GB): ¥700,000–¥800,000