A three-stage pipeline (BERT perplexity scan → LLM judgment → human escalation) packaged as a cross-platform Python tool. The installer automatically downloads llama-server and the GGUF models.
Experiment log: from LUKE/BERT fill-mask fine-tuning, to perplexity-based error detection, to Qwen2.5-7B correction judgment with human escalation on mismatch. The complete pipeline runs on a single RTX 4060 Laptop GPU with 8 GB of VRAM.
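The stage logic described above can be sketched in a few lines. This is a minimal illustration, not the tool's actual API: the function and class names, the perplexity threshold, and the exact mismatch rule (escalate when the perplexity filter flags a token but the LLM judge defends it unchanged) are all assumptions for the sake of the example.

```python
# Illustrative sketch of the three-stage triage rule. All names here
# (triage, Verdict, the 40.0 threshold) are hypothetical, not the tool's API.
from dataclasses import dataclass

@dataclass
class Verdict:
    token: str
    suspicious: bool       # stage 1: pseudo-perplexity above threshold?
    llm_correction: str    # stage 2: the LLM judge's proposed replacement
    escalate: bool         # stage 3: send to a human reviewer?

def triage(token: str, perplexity: float, llm_correction: str,
           threshold: float = 40.0) -> Verdict:
    """Combine the stage-1 filter with the stage-3 mismatch rule."""
    suspicious = perplexity > threshold
    # Assumed mismatch rule: escalate only when stage 1 flags the token
    # but the LLM judge disagrees, i.e. it keeps the token unchanged.
    escalate = suspicious and llm_correction == token
    return Verdict(token, suspicious, llm_correction, escalate)

# A flagged token the LLM also wants to change is auto-corrected;
# a flagged token the LLM defends goes to a human.
print(triage("teh", 120.0, "the").escalate)   # → False (auto-correct)
print(triage("bank", 95.0, "bank").escalate)  # → True (human review)
```

The design point this captures is that the cheap BERT pass only nominates candidates; the LLM judgment either confirms a fix or triggers a human look, so neither model alone can silently rewrite text.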
A step-by-step guide to building a LoRA training environment on Windows 11 with an RTX 3060 Laptop GPU (6 GB VRAM) using kohya_ss, from caption writing to VRAM-saving settings.
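For context, the kind of VRAM-saving invocation such a guide typically lands on looks like the sketch below. The paths and numeric values are placeholders, and while these flag names come from kohya-ss/sd-scripts' `train_network.py`, they should be verified against the installed version before use.

```shell
# Illustrative low-VRAM LoRA run with kohya-ss/sd-scripts (flags may differ
# by version; all paths and hyperparameters here are placeholders).
accelerate launch train_network.py ^
  --pretrained_model_name_or_path "C:\models\base_model.safetensors" ^
  --train_data_dir "C:\lora\train_data" ^
  --output_dir "C:\lora\output" ^
  --network_module networks.lora ^
  --network_dim 8 --network_alpha 4 ^
  --resolution 512 ^
  --train_batch_size 1 ^
  --gradient_checkpointing ^
  --mixed_precision fp16 ^
  --optimizer_type AdamW8bit ^
  --cache_latents ^
  --xformers
```

The VRAM-relevant choices are batch size 1, gradient checkpointing, fp16 mixed precision, an 8-bit optimizer, cached latents, and memory-efficient attention; together these are what make 6 GB workable.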