
DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

DeepSeek-V4 cuts inference costs to 27% of its predecessor's while scaling to a million-token context, a major efficiency gain for practical long-context LLMs.

Friday, April 24, 2026, 12:00 PM UTC · 2 min read · Source: Hacker News · By sys://pipeline

DeepSeek has released the V4 model series, two Mixture-of-Experts variants: DeepSeek-V4-Pro (1.6T total parameters, 49B activated) and DeepSeek-V4-Flash (284B total, 13B activated), both supporting million-token context. The headline innovation is a Hybrid Attention Architecture that combines Compressed Sparse Attention with Heavily Compressed Attention; as a result, V4-Pro requires only 27% of the inference FLOPs and 10% of the KV cache of V3.2. The models are trained with the Muon optimizer and go through a two-stage post-training pipeline: domain-specific expert models are cultivated first, then consolidated into a single model via on-policy distillation.
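The "10% of the KV cache" figure suggests keys and values are compressed into a small per-token latent rather than cached at full width. The article does not publish V4's actual layer shapes, so the sketch below is a minimal, illustrative NumPy version of that general low-rank-cache idea: every name and dimension (d_latent, W_down, W_up_k, and so on) is an assumption for illustration, and the sparse half of the hybrid scheme is omitted entirely.

```python
import numpy as np

# Illustrative dimensions only -- NOT DeepSeek-V4's published architecture.
d_model, n_heads, head_dim, d_latent = 1024, 8, 128, 128
rng = np.random.default_rng(0)

W_down = rng.standard_normal((d_model, d_latent)) * d_model ** -0.5   # compressor
W_up_k = rng.standard_normal((d_latent, n_heads * head_dim)) * d_latent ** -0.5
W_up_v = rng.standard_normal((d_latent, n_heads * head_dim)) * d_latent ** -0.5
W_q    = rng.standard_normal((d_model, n_heads * head_dim)) * d_model ** -0.5

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def decode_step(x_new, latent_cache):
    """One decoding step: cache only a d_latent vector per token,
    then re-expand the whole cache to full K/V at attention time."""
    latent_cache.append(x_new @ W_down)                # d_latent floats/token
    c = np.stack(latent_cache)                         # (T, d_latent)
    k = (c @ W_up_k).reshape(len(latent_cache), n_heads, head_dim)
    v = (c @ W_up_v).reshape(len(latent_cache), n_heads, head_dim)
    q = (x_new @ W_q).reshape(n_heads, head_dim)
    scores = np.einsum("hd,thd->ht", q, k) / np.sqrt(head_dim)
    return np.einsum("ht,thd->hd", softmax(scores), v).reshape(-1)

cache = []
for _ in range(4):                                     # decode a few tokens
    out = decode_step(rng.standard_normal(d_model), cache)

# Cache footprint per token: one latent vs. full per-head K and V.
print(d_latent / (2 * n_heads * head_dim))             # 0.0625, ~6% of dense
```

With these toy numbers the latent cache is about 6% of a dense K/V cache per token; at a million tokens, that kind of ratio is what makes the context length affordable at all.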
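The article also does not detail the on-policy distillation recipe used in the consolidation stage. In the commonly described version, the student samples its own continuations and a teacher scores them, which amounts to a reverse-KL objective on student-visited states. This toy sketch shows that per-token signal under that assumption, with random arrays standing in for real model logits.

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB = 8   # toy vocabulary; real models return transformer logits

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Stand-ins for one forward pass of each model at the current prefix.
p_student = softmax(rng.standard_normal(VOCAB))
p_teacher = softmax(rng.standard_normal(VOCAB))

# "On-policy": the next token is sampled from the STUDENT's distribution,
# not taken from a fixed dataset or from teacher generations.
token = rng.choice(VOCAB, p=p_student)

# Per-token signal on the sampled state: positive if the teacher assigns
# the student's choice more probability than the student did.
advantage = np.log(p_teacher[token]) - np.log(p_student[token])

# Full-distribution view of the same objective: reverse KL(student || teacher).
# Its expectation over student samples equals minus the expected advantage.
reverse_kl = float(np.sum(p_student * (np.log(p_student) - np.log(p_teacher))))
print(token, advantage, reverse_kl)
```

Training on student-sampled states is what distinguishes this from ordinary distillation on teacher outputs: the student gets corrected exactly where its own policy actually goes, which is why it works well for consolidating several domain experts into one model.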

Tags
models