Together AI announces Mamba-3: ~7× faster long-context inference than Transformers, with a complex-valued SSM

Tech · Mar 22, 2026 · 13 min read

Redesigned with inference latency as the first priority, Mamba‑3 combines exponential trapezoid discretization, complex‑valued states, and a MIMO structure to reach about 6.9× the speed of a Transformer at 16,384 tokens.

Tags: SSM · LLM Inference · Architecture