Feeding inclusionAI's Ling-flash-2.0 (bailing_moe, 100B total / 6.1B active parameters, MXFP4 quantization) into SwiftLM on an M1 Max 64GB. Covers checking mlx-swift-lm for bailing_moe and MXFP4 support, the startup surprise, and what --stream-experts actually does.
A hands-on build and run of SwiftLM, the Swift-based LLM inference server, on an M1 Max 64GB. Covers Qwen3.6-35B-A3B and Qwen3.5-122B-A10B, using the same BST, BBS, and persona tests as the existing Ollama and MLX-lm write-ups.