#Chroma

4 articles

Tech · May 2, 2026 · 23 min

Wiring Up a Multimodal Japanese Local RAG with FastAPI, Chroma, Open WebUI, and Ollama on M1 Max

Hands-on log of building the DEV article's PDF RAG on M1 Max 64GB, extending it with images via CLIP, and pushing through Japanese with bge-m3 + Qwen3.6 35B. Documents the modality gap, the dual inference server crash, and LLM-jp 4-8B's empty chat template silently dropping the system role.

AI LLM RAG Local LLM FastAPI llama.cpp Chroma Python Apple Silicon Ollama Japanese LLM Experiment

Tech · May 2, 2026 (updated) · 12 min

Reading an Article on Building a Local PDF RAG with FastAPI, llama.cpp, Chroma, and Open WebUI

Notes on a DEV Community article that wires up FastAPI as an OpenAI-compatible RAG API layer with llama.cpp, Chroma, and Open WebUI, plus where the architecture fits and what to watch for.

AI LLM RAG Local LLM FastAPI llama.cpp Chroma Python Docker

Tech · Apr 4, 2026 · 14 min

Mintlify ditched RAG and switched to a virtual file system

From the basics of RAG and vector databases to Mintlify's design and implementation of ChromaFs, a virtual file system that converts UNIX commands into ChromaDB queries.

RAG Chroma AI TypeScript Documentation

Tech · Mar 27, 2026 · 8 min

Chroma Context-1 matches frontier LLM search performance with 20B parameters

A 20B-parameter self-editing search agent published by Chroma. It performs multi-hop search while dynamically pruning its context, and matches or exceeds the accuracy of frontier models at 1/10 the cost and up to 10 times faster. The weights are released under Apache 2.0.

Chroma search agent Reinforcement Learning RAG LLM