The three-stage pipeline of BERT perplexity scan → LLM judgment → escalation packaged as a cross-platform Python tool. The installer automatically downloads llama-server and GGUF models.
A summary of the dispute in which chardet's original author argues that the LGPL-to-MIT relicensing was invalid, and the formal launch of the React Foundation under the Linux Foundation. Two very different cross-sections of OSS governance.
Experiment log: from LUKE/BERT fill-mask fine-tuning, to perplexity-based error detection, to Qwen2.5 7B correction judgment with human escalation on mismatch. A complete pipeline running on a single RTX 4060 Laptop with 8GB VRAM.
From Docker hell to Lite + LLM correction. A retrospective on my own experimentation, plus an introduction to someone else's browser-based NDLOCR-Lite implementation.
Set up the CLI version of NDLOCR-Lite on Apple Silicon Mac, then tested OCR result correction with Qwen 3.5 and Swallow. Includes experiments with direct image reading and the anchoring effect.
Tried the lightweight OCR tool NDLOCR-Lite released by the National Diet Library — installed it on Windows 11 and tested both the CLI and GUI versions.
Introducing a Mac mini M4 Pro to build an in-house RAG system. A plan for setting up a LoRA training environment during downtime while waiting for specs to be finalized.
When Layout Parser wouldn't install and NDLOCR alone couldn't handle a 4-column vertical text book, I used PyMuPDF and histogram analysis to brute-force split the columns.