Hands-on test of huihui-ai Qwen 3.5 abliterated models in Ollama: garbage-token failures, GLM-4.7-Flash chat-template breakage, and why the official model with thinking disabled worked better.
Experiment log: from LUKE/BERT fill-mask fine-tuning, to perplexity-based error detection, to Qwen2.5 7B correction judgment with human escalation on mismatch. A complete pipeline running on a single RTX 4060 Laptop with 8GB VRAM.
Set up the CLI version of NDLOCR-Lite on Apple Silicon Mac, then tested OCR result correction with Qwen 3.5 and Swallow. Includes experiments with direct image reading and the anchoring effect.
A plan to build an internal help desk RAG system using a Mac mini M4 Pro and Dify. Highlights what's new in Dify circa 2025 and tips for running local LLMs.