Tech | Apr 10, 2026 | 10 min

Sentence Transformers v5.4 Adds Unified Embeddings for Text, Image, Audio, and Video

Sentence Transformers v5.4 adds multimodal support. Eight embedding models and four rerankers, including Qwen3-VL and NVIDIA Nemotron, can now be used through a unified API.

Tags: AI, Embedding, Multimodal, RAG, HuggingFace, Python