Lemonade is AMD's open-source local AI server that manages multiple backends like llama.cpp and FastFlowLM across GPU/NPU/CPU, serving text, image, and audio generation through an OpenAI-compatible API.
All variants of huihui-ai's Qwen 3.5 abliterated produced garbage tokens. GLM-4.7-Flash abliterated had a broken chat template. The official version with thinking disabled turned out to be the right answer.