DeepSeek-V4-Pro-DSpark isn't a new base model. It's the same 1.6T V4-Pro checkpoint plus a DSpark speculative-decoding head (~893GB). What config.json and the DeepSpec repo reveal, and why there's no speed benchmark yet.
After a US order pulled Claude Fable 5, which Chinese models drop into Claude Code? Kimi K2.7 Code, Qwen3.7 Max, DeepSeek V4 and GLM-5.1 — constraints, VRAM, benchmark caveats.
An arXiv paper reports that fine-tuning GPT-4o, Gemini 2.5 Pro, and DeepSeek-V3.1 on summary-to-text expansion tasks increases verbatim reproduction of copyrighted books.
DeepSeek V4 Preview ships V4-Pro (1.6T/49B active) and V4-Flash (284B/13B active) as open weights under MIT, both with 1M context. CSA+HCA hybrid attention, mHC, and the Muon optimizer cut per-token FLOPs at 1M tokens to 27% of V3.2. Day-one API and chat.deepseek.com mode switch covered.
Anthropic accused three Chinese AI companies of distilling Claude, and on the same day OpenAI retired SWE-bench Verified. Training fraud and evaluation flaws exposed simultaneously on February 23, 2026.
An explanation of the difference between conventional OCR and VLM (vision-language model) based OCR. Introduces DeepSeek-OCR and explores the possibility of combining both approaches.