Gemini API File Search now indexes images alongside text in the same store. Metadata filters can isolate NPC memories by chapter and character, and a single-character prototype costs under $1/month on Flash-Lite. Notes on tier limits, pricing breakdown, and what to test first.
158K lines of AI-generated C# for a Cities: Skylines II total conversion mod. CivicRAG for codebase indexing, 300+ custom Roslyn analyzers as compile-time design rules, and manual visual debugging for render bugs AI couldn't see.
Vektor Memory v1.5.4's supersession chains, compared with YourMemory's decay, Cloudflare's key-overwrite, and CTX. Also covers a BM25 vs cosine threshold trap and a 5-field minimum schema for agent memory.
The paper argues that RAG, vector stores, and scratchpads are retrieval, not learning. Reading it alongside CTX and OCR-Memory makes the gap between 'better search' and 'weight-level learning' concrete.
A read of CTX, which auto-injects context into Claude Code via the UserPromptSubmit hook. Compares it with auto-memory, YourMemory, WUPHF, and Cloudflare Agent Memory on persistence and storage. Also looks at why 1M context still isn't enough and how each agent architecture uses its window differently.
Hands-on log of building the DEV article's PDF RAG on M1 Max 64GB, extending it with images via CLIP, and pushing through Japanese with bge-m3 + Qwen3.6 35B. Documents the modality gap, the dual inference server crash, and LLM-jp 4-8B's empty chat template silently dropping the system role.
Notes on a DEV Community article that wires up FastAPI as an OpenAI-compatible RAG API layer with llama.cpp, Chroma, and Open WebUI, plus where the architecture fits and what to watch for.
A read of OCR-Memory (arXiv:2604.26622). It renders agent execution history into images, uses Set-of-Mark to let a VLM pick relevant segments, then retrieves verbatim text from the original logs.
VecLite is a Rust/WASM+SIMD library that accelerates vector search inside the browser. How far can you get with Transformers.js for embeddings, IndexedDB for storage, and no server at all?
A look at sachitrafa/YourMemory, a local MCP memory server combining Ebbinghaus forgetting curves, BM25, vector search, and graph expansion. LoCoMo-10 Recall@5 currently sits at 59%.
Japan's Digital Agency released parts of Gennai, the generative AI platform it runs for central-government staff, on GitHub under MIT / CC BY 4.0. The web app and cloud-specific AI templates for AWS, Azure, and Google Cloud are bundled together so local governments and private companies can redeploy the same stack.
The NotebookLM clone open-notebook assumes Docker and cloud APIs by default. I installed SurrealDB natively, ran four processes in tmux, and wired everything through Ollama's qwen3.6:35b and bge-m3. I fed it the Qwen3.6 benchmark article I wrote this morning, and it answered with the correct numbers.