#Vulkan

5 articles

Tech · Apr 3, 2026 · 8 min

Running Lemonade on Strix Halo (EVO-X2): Vulkan Shared Memory Leaks and ROCm Stability

Real-world testing of AMD Lemonade v10.0.1 on Ryzen AI Max+ 395: running LLM, image generation, speech recognition, and TTS simultaneously, NPU Hybrid execution, Vulkan vs ROCm benchmarks, and the discovery of shared memory leaks.

AMD, Local LLM, Vulkan, ROCm, NPU, llama.cpp, GPU, Inference Optimization, Benchmark, Experiment

Tech · Apr 3, 2026 · 8 min

AMD's Lemonade Local AI Server Bundles GPU, NPU, and Multi-Modal Inference Under One Roof

Lemonade is AMD's open-source local AI server that manages multiple backends like llama.cpp and FastFlowLM across GPU/NPU/CPU, serving text, image, and audio generation through an OpenAI-compatible API.

AMD, Local LLM, NPU, GPU, llama.cpp, Inference Optimization, ROCm, Vulkan

Tech · Mar 31, 2026 (updated) · 8 min

Qwen3.5-35B-A3B on llama-server (Vulkan + Strix Halo): 4K → 65K context for only 800MB more VRAM

Qwen3.5-35B-A3B is an SSM+Attention hybrid where only 10 of 40 layers consume KV cache. Going from ctx-size 4096 to 65536 on llama-server + Vulkan added just 800MB VRAM with zero throughput loss. Tested on Strix Halo (Ryzen AI Max+ 395), with q8_0 KV quant benchmarks.

LLM, Local LLM, llama.cpp, AMD, Vulkan, KV Cache, Qwen, Benchmark

Tech · Mar 28, 2026 (updated) · 14 min

Radeon 8060S (gfx1151) Vulkan Broke Again After AMD Driver Update

After updating to AMD Software 26.3.1 on a GMKtec EVO-X2 (Ryzen AI Max+ 395), the Vulkan backend fails to allocate device memory properly and falls back to CPU. Covers the investigation and a workaround: changing the BIOS VRAM allocation from 48GB/16GB to 32GB/32GB.

AMD, Vulkan, GPU, llama.cpp, LLM, Experiment

Tech · Feb 28, 2026 (updated) · 12 min

Qwen 3.5 abliterated in Ollama: broken outputs, chat-template failures, and the official-model workaround

Hands-on test of huihui-ai Qwen 3.5 abliterated models in Ollama: garbage-token failures, GLM-4.7-Flash chat-template breakage, and why the official model with thinking disabled worked better.

AI, LLM, Ollama, Local LLM, AMD, LM Studio, Vulkan, ROCm, Experiment