Foundation Model articles | lilting channel

Tech | Feb 26, 2026 | 7 min

FDM-1: trained on 11 million hours of video, with a 50x more efficient video encoder

Standard Intelligence trained a general-purpose foundation model for computer actions on 11 million hours of screen recordings. Rather than building on an LLM, FDM-1 operates directly on video and action tokens, and its custom video encoder compresses video 50-100x more efficiently than existing VLMs.

AI | Computer Use | Foundation Model

#Foundation Model
