40GB+ VRAM for a 3B model. VBench 85.11 beats dedicated 14B video generators. RunPod GPU costs from $2.2/session. The 'unified' model still ships as two checkpoint files.
Sentence Transformers v5.4 adds multimodal support. Eight embedding models and four rerankers, including Qwen3-VL and NVIDIA Nemotron, are now usable through a unified API.
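
For readers new to the embedding-plus-reranker split, here is a minimal, library-free sketch of the two-stage pattern those model types usually serve. It does not call Sentence Transformers: the vectors, file names, and reranker scores are made-up stand-ins for model output, and `retrieve` and `rerank` are hypothetical helper names. In real use the vectors would come from an embedding model and the second-stage scores from a reranker.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, corpus, top_k):
    """Stage 1: rank every item by embedding similarity, keep the top_k."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def rerank(candidates, score_fn):
    """Stage 2: reorder the short list with a slower, more precise scorer."""
    return sorted(candidates, key=score_fn, reverse=True)


# Toy embeddings (assumption): text, image, and video items share one vector space.
corpus = {
    "caption.txt": [0.9, 0.1, 0.0],
    "photo.png": [0.8, 0.3, 0.1],
    "clip.mp4": [0.1, 0.9, 0.2],
}
query = [1.0, 0.2, 0.0]

hits = retrieve(query, corpus, top_k=2)
shortlist = [doc_id for doc_id, _ in hits]

# Toy reranker scores (assumption): a real reranker would score each
# (query, candidate) pair jointly instead of reading from a lookup table.
toy_reranker_scores = {"caption.txt": 0.4, "photo.png": 0.7}
reranked = rerank(shortlist, toy_reranker_scores.get)

print(shortlist)
print(reranked)
```

The point of the split is cost: the embedding stage is cheap enough to run over the whole corpus, while the reranker only sees the short list.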