TechMar 22, 202613 minTogether AI announces Mamba-3: ~7x faster long-context inference than Transformers, with complex-valued SSMRedesigned with inference latency as the first priority, Mamba‑3 combines exponential trapezoid discretization, complex‑valued states, and a MIMO structure to reach about 6.9× the speed of a Transformer at 16,384 tokens.SSMLLMInferenceArchitecture