DeepSeek previews new AI model that ‘closes the gap’ with frontier models
The new models follow DeepSeek's V3.2 model and the accompanying R1 reasoning model that took the AI world by storm.
The mixture-of-experts approach activates only a subset of the model's parameters on each forward pass, which lowers inference costs.
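The routing idea behind mixture-of-experts can be sketched in a few lines. This is an illustrative toy, not DeepSeek's actual architecture: the expert count, top-k value, and dimensions below are made-up values, and each "expert" is just a linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # hypothetical; production models use far more experts
TOP_K = 2         # experts activated per token
DIM = 16          # toy hidden size

# Each expert is modeled here as a simple weight matrix.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(x):
    """Route one token vector through only TOP_K of NUM_EXPERTS experts."""
    scores = x @ router                      # router logit per expert
    top = np.argsort(scores)[-TOP_K:]        # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Only TOP_K expert matrices are ever multiplied; the rest stay inactive,
    # which is why active parameters are far fewer than total parameters.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(DIM)
out = moe_forward(token)
```

This is why a model can report 1.6 trillion total parameters but only 49 billion active: per token, the router touches a small fraction of the expert weights.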
The Pro model has a total of 1.6 trillion parameters (49 billion active), which makes it the biggest open-weight model available, outstripping Moonshot AI's Kimi K2 (1 trillion), MiniMax's M1 (456 billion), and more than double the size of DeepSeek V3.
The smaller V4 Flash has 284 billion parameters (13 billion active).
DeepSeek says both models are more efficient and performant than DeepSeek V3.2 due to architectural improvements, and have almost "closed the gap" with current leading models, both open and closed, on reasoning benchmarks.
In coding competition benchmarks, DeepSeek said both V4 models' performance is "comparable to GPT-5." However, the models seem to fall slightly behind frontier models in knowledge tests, specifically OpenAI's GPT-5.4 and Google's latest Gemini 3.
This lag suggests a “developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months,” the lab wrote.
Notably, DeepSeek V4 is much more affordable than any frontier model available today.
The smaller V4 Flash model costs $0.14 per million input tokens and $0.28 per million output tokens, undercutting GPT-5.4 Mini and Claude Haiku 4. The larger V4 Pro model, meanwhile, costs $0.145 per million input tokens and $3.48 per million output tokens, also undercutting Gemini 3.
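A quick back-of-the-envelope calculation shows how these per-million-token prices translate into a bill. The prices come from the figures quoted above; the workload sizes (2 million input tokens, 500,000 output tokens) are hypothetical examples, not from the article.

```python
# Prices in dollars per million tokens, as quoted for the V4 models.
PRICES = {
    "V4 Flash": {"input": 0.14, "output": 0.28},
    "V4 Pro":   {"input": 0.145, "output": 3.48},
}

def job_cost(model, input_tokens, output_tokens):
    """Dollar cost of one job: tokens times price, scaled from per-million."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical job: 2M input tokens, 500k output tokens.
flash_cost = job_cost("V4 Flash", 2_000_000, 500_000)  # $0.42
pro_cost = job_cost("V4 Pro", 2_000_000, 500_000)      # $2.03
```

The gap between the two models is driven almost entirely by output pricing: Pro's input rate is close to Flash's, but its output tokens cost over twelve times as much.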