Hugging Face articles | lilting channel

Tech | May 26, 2026 | 15 min

Hy-MT2 1.8B Q4_K_M on M1 Max 64GB: 1.25bit 440MB build does not load on stock llama.cpp yet

Hands-on with Tencent Hy-MT2 1.8B Q4_K_M (1.08GB) on an M1 Max 64GB via llama-server, covering JSON, SRT, HTML, glossary, and minority-language prompts with full input-output pairs. The 1.25bit 440MB build does not load on stock llama.cpp 8990, and the 30B-A3B model (hy_v3) is not yet available on the Mac route.

AI, LLM, Translation, Local LLM, Hugging Face, Quantization, MoE, Open Source, Mac, Apple Silicon, Experiment

