# DeepSeek-Prover-V2
An open-source large language model designed for formal theorem proving in Lean 4, with initialization data collected through a recursive theorem proving pipeline.
## Introduction
DeepSeek-Prover-V2 is an open-source large language model specifically designed for formal theorem proving in Lean 4. The model's development began with a recursive theorem proving pipeline powered by DeepSeek-V3, which collected initialization data through a sophisticated process of problem decomposition and proof synthesis.
## Model Summary

### Synthesize Cold-Start Reasoning Data
The development process involved the following steps (a schematic sketch follows the list):
- Prompting DeepSeek-V3 to decompose complex theorems into high-level proof sketches
- Formalizing the intermediate proof steps in Lean 4, producing a sequence of subgoals
- Running proof search on each subgoal with a smaller 7B model, which keeps the computational cost manageable
- Pairing each complete formal proof with DeepSeek-V3's corresponding chain-of-thought reasoning to form a cold-start training example
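The sketch below shows how these steps could be wired together in code. It is a minimal illustration under stated assumptions, not the released pipeline: `sketch_proof`, `search_proof`, and `verify_in_lean` are hypothetical callables standing in for DeepSeek-V3, the 7B prover, and a Lean 4 checker.

```python
from typing import Callable, Optional


def synthesize_cold_start_example(
    theorem_statement: str,
    sketch_proof: Callable[[str], tuple[str, str, list[str]]],  # hypothetical DeepSeek-V3 wrapper:
                                                                # returns (chain_of_thought, lean_skeleton, subgoals)
    search_proof: Callable[[str], Optional[str]],               # hypothetical 7B prover wrapper:
                                                                # returns a Lean proof for one subgoal, or None
    verify_in_lean: Callable[[str], bool],                      # hypothetical Lean 4 compiler check
) -> Optional[dict]:
    """One pass of the cold-start synthesis loop described in the list above (illustrative only)."""
    # 1. Decompose the theorem into an informal plan plus a Lean 4 skeleton
    #    whose intermediate steps are left as `sorry`-ed subgoals.
    chain_of_thought, lean_skeleton, subgoals = sketch_proof(theorem_statement)

    # 2. Close each subgoal independently with the smaller prover model.
    subgoal_proofs = []
    for subgoal in subgoals:
        proof = search_proof(subgoal)
        if proof is None:
            return None  # discard examples whose subgoals cannot all be solved
        subgoal_proofs.append(proof)

    # 3. Splice the subgoal proofs back into the skeleton (one `sorry` per subgoal)
    #    and keep the example only if the Lean compiler accepts the result.
    full_proof = lean_skeleton
    for proof in subgoal_proofs:
        full_proof = full_proof.replace("sorry", proof, 1)
    if not verify_in_lean(full_proof):
        return None

    # 4. Pair the verified formal proof with the informal chain-of-thought.
    return {
        "statement": theorem_statement,
        "chain_of_thought": chain_of_thought,
        "formal_proof": full_proof,
    }
```

Splitting the work this way is what makes the pipeline recursive: difficult theorems are only ever attacked through subgoals small enough for the lighter prover to close.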
### Reinforcement Learning
The model then underwent a reinforcement learning stage to strengthen its ability to bridge informal reasoning with formal proof construction (a sketch of a verification-based reward for such a stage follows the list below). This stage resulted in:
- An 88.9% pass ratio on the MiniF2F-test
- 49 of 658 PutnamBench problems solved
- Improved ability to combine informal and formal mathematical reasoning
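As a rough illustration of what a verification-based reward could look like in such a stage, the snippet below scores a candidate proof by whether Lean 4 accepts it. The `lake env lean` invocation, the project layout, and the binary reward are assumptions made for this sketch; they are not the released training setup.

```python
import subprocess


def proof_reward(lean_file: str, timeout_s: int = 300) -> float:
    """Hypothetical binary reward: 1.0 if the candidate proof verifies in Lean 4, else 0.0.

    Assumes a Lean 4 / Mathlib project in which `lake env lean <file>` type-checks a file;
    both the command and the reward shape are illustrative assumptions.
    """
    try:
        result = subprocess.run(
            ["lake", "env", "lean", lean_file],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return 0.0  # verification that times out counts as a failure

    output = result.stdout + result.stderr
    # A proof only earns reward if Lean reports no errors and no `sorry` placeholders remain.
    verified = result.returncode == 0 and "sorry" not in output
    return 1.0 if verified else 0.0
```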
### ProverBench: Formalization of AIME and Textbook Problems
The release also includes ProverBench, a benchmark dataset of 325 problems drawn from recent AIME competitions and textbook material:
| Area | Count |
| --- | --- |
| AIME 24&25 | 15 |
| Number Theory | 40 |
| Elementary Algebra | 30 |
| Linear Algebra | 50 |
| Abstract Algebra | 40 |
| Calculus | 90 |
| Real Analysis | 30 |
| Complex Analysis | 10 |
| Functional Analysis | 10 |
| Probability | 10 |
| **Total** | **325** |
## Quick Start
You can use Hugging Face's Transformers library for model inference. Here is a basic example:
````python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import time

torch.manual_seed(30)

model_id = "DeepSeek-Prover-V2-7B"  # or DeepSeek-Prover-V2-671B
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The Lean 4 theorem to prove; the proof body is left as `sorry` for the model to complete.
formal_statement = """
import Mathlib
import Aesop

set_option maxHeartbeats 0

open BigOperators Real Nat Topology Rat

/-- What is the positive difference between $120\%$ of 30 and $130\%$ of 20? Show that it is 10.-/
theorem mathd_algebra_10 : abs ((120 : ℝ) / 100 * 30 - 130 / 100 * 20) = 10 := by
  sorry
""".strip()

# Prompt template: the model first writes a detailed proof plan, then the Lean 4 proof.
prompt = """
Complete the following Lean 4 code:

```lean4
{}
```

Before producing the Lean 4 code to formally prove the given theorem, provide a detailed proof plan outlining the main proof steps and strategies.
The plan should highlight key ideas, intermediate lemmas, and proof structures that will guide the construction of the final formal proof.
""".strip()

chat = [
    {"role": "user", "content": prompt.format(formal_statement)},
]

# Load the model across available devices in bfloat16.
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Apply the chat template and move the inputs to the model's device.
inputs = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate the proof plan and Lean 4 proof, and report the wall-clock time taken.
start = time.time()
outputs = model.generate(inputs, max_new_tokens=8192)
print(tokenizer.batch_decode(outputs))
print(time.time() - start)
````
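The generation interleaves the natural-language proof plan with the completed Lean 4 code. A small helper such as the one below (an illustration, not part of the official example) can pull the final Lean 4 code block out of the decoded output for separate verification:

```python
import re


def extract_lean_code(generation: str) -> str | None:
    """Return the last Lean 4 code block in the model output, if any.

    Illustrative helper; it is not part of the official Quick Start.
    """
    blocks = re.findall(r"```lean4\s*(.*?)```", generation, flags=re.DOTALL)
    return blocks[-1].strip() if blocks else None


# Example usage with the decoded generation from the snippet above:
# lean_code = extract_lean_code(tokenizer.batch_decode(outputs)[0])
```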
## Model & Dataset Downloads
DeepSeek-Prover-V2 is available in two model sizes: 7B and 671B parameters.

The ProverBench dataset is also available for download.
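If the benchmark is published as a Hugging Face dataset, it can be loaded with the `datasets` library. The repository id below is an assumption made for illustration, not a confirmed path:

```python
from datasets import load_dataset

# The repository id here is an assumption for illustration; substitute the
# actual ProverBench dataset path from the official release.
proverbench = load_dataset("deepseek-ai/DeepSeek-ProverBench")

# Inspect the splits and fields available in the benchmark.
print(proverbench)
```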