#DeepSeek

6 articles

Tech · Jun 28, 2026 · 8 min

DeepSeek-V4-Pro-DSpark is not a new model but V4-Pro with a speculative-decoding head

DeepSeek-V4-Pro-DSpark isn't a new base model. It's the same 1.6T V4-Pro checkpoint plus a DSpark speculative-decoding head (~893GB). What config.json and the DeepSpec repo reveal, and why there's no speed benchmark yet.

LLM DeepSeek Chinese AI MoE Inference Optimization Open Model Speculative Decoding

Tech · Jun 16, 2026 · 12 min

Claude Fable 5 suspended: Kimi K2.7 Code & Qwen3.7 Max as Claude Code backends

After a US order pulled Claude Fable 5, which Chinese models drop into Claude Code? Kimi K2.7 Code, Qwen3.7 Max, DeepSeek V4 and GLM-5.1 — constraints, VRAM, benchmark caveats.

AI LLM Chinese AI Kimi Qwen DeepSeek MoE AI Agents

Tech · May 4, 2026 · 11 min

Fine-Tuning Reignites Verbatim Memorization of Copyrighted Books in LLMs

An arXiv paper reports that fine-tuning GPT-4o, Gemini 2.5 Pro, and DeepSeek-V3.1 on summary-to-text expansion tasks increases verbatim reproduction of copyrighted books.

AI LLM Copyright OpenAI Gemini DeepSeek Fine-tuning Paper

Tech · Apr 24, 2026 (updated) · 11 min

DeepSeek V4 Preview specs: V4-Pro 1.6T and V4-Flash 284B open under MIT, 1M context, 27% of V3.2's inference FLOPs

DeepSeek V4 Preview ships V4-Pro (1.6T/49B active) and V4-Flash (284B/13B active) as open weights under MIT, both with 1M context. CSA+HCA hybrid attention, mHC, and the Muon optimizer cut per-token FLOPs at 1M tokens to 27% of V3.2. Day-one API and chat.deepseek.com mode switch covered.

LLM DeepSeek Chinese AI MoE Open Model AI Agent

Tech · Feb 24, 2026 · 8 min

Large-Scale Unauthorized Distillation of Claude and the Collapse of SWE-bench Hit on the Same Day

Anthropic accused three Chinese AI companies of distilling Claude, and on the same day OpenAI retired SWE-bench Verified. Training fraud and evaluation flaws exposed simultaneously on February 23, 2026.

AI Security Anthropic DeepSeek Benchmark LLM OpenAI SWE-bench

Tech · Jan 20, 2026 · 4 min

The rise of VLM-based OCR: DeepSeek-OCR and the potential of hybrid use

Explains how conventional OCR differs from OCR based on a VLM (vision-language model), introduces DeepSeek-OCR, and explores combining the two approaches.

AI OCR DeepSeek VLM