DeepSeek-Prover-V2

Technical Guide
Published: May 1, 2025

An open-source large language model designed for formal theorem proving in Lean 4, with initialization data collected through a recursive theorem proving pipeline.

Introduction

DeepSeek-Prover-V2 is an open-source large language model specifically designed for formal theorem proving in Lean 4. The model's development began with a recursive theorem proving pipeline powered by DeepSeek-V3, which collected initialization data through a sophisticated process of problem decomposition and proof synthesis.

Model Summary

Synthesize Cold-Start Reasoning Data

The development process involved:

  • Prompting DeepSeek-V3 to decompose each theorem into a high-level proof sketch
  • Formalizing the sketch's proof steps in Lean 4 as a sequence of subgoals (see the illustration after this list)
  • Using a smaller 7B prover model to perform proof search on each subgoal
  • Pairing the resulting complete formal proofs with DeepSeek-V3's chain-of-thought reasoning to form cold-start data
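
Concretely, a decomposed theorem can be pictured as a Lean 4 proof skeleton in which each subgoal appears as a have step whose proof is initially left as sorry. The theorem and hypothesis names below are purely illustrative and are not taken from the released training data:

import Mathlib

/- Illustrative decomposition: each `have` states an intermediate subgoal with a
   `sorry` placeholder, so a smaller prover model can search for each piece
   independently before the pieces are recombined into a complete proof. -/
theorem sum_sq_nonneg (a b : ℝ) : 0 ≤ a ^ 2 + b ^ 2 := by
  -- Subgoal 1: the square of a real number is non-negative.
  have h1 : 0 ≤ a ^ 2 := by sorry
  -- Subgoal 2: likewise for the second term.
  have h2 : 0 ≤ b ^ 2 := by sorry
  -- Combine the two subgoals to close the original goal.
  exact add_nonneg h1 h2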

Reinforcement Learning

The model then underwent a reinforcement learning stage to enhance its ability to bridge informal reasoning with formal proof construction. The resulting DeepSeek-Prover-V2-671B achieves:

  • 88.9% pass ratio on the MiniF2F-test
  • Solving 49 out of 658 problems from PutnamBench
  • Improved ability to combine informal and formal mathematical reasoning

ProverBench: Formalization of AIME and Textbook Problems

Released alongside the models, ProverBench is a benchmark dataset comprising 325 formalized problems drawn from recent AIME competitions and curated textbook examples:

  Area                   Count
  AIME 24&25                15
  Number Theory             40
  Elementary Algebra        30
  Linear Algebra            50
  Abstract Algebra          40
  Calculus                  90
  Real Analysis             30
  Complex Analysis          10
  Functional Analysis       10
  Probability               10
  Total                    325
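
If the dataset is published on the Hugging Face Hub, it can be inspected with the datasets library. The repository id and split name below are assumptions and should be checked against the actual release:

from datasets import load_dataset

# Repository id and split name are assumptions; verify them against the published ProverBench release.
bench = load_dataset("deepseek-ai/DeepSeek-ProverBench", split="train")

print(len(bench))   # expected to report 325 problems
print(bench[0])     # inspect a single formalized problem record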

Quick Start

You can use Hugging Face's Transformers library for model inference. Here's a basic example:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import time

torch.manual_seed(30)

# Use the full Hugging Face repo id; swap in "deepseek-ai/DeepSeek-Prover-V2-671B" for the larger model.
model_id = "deepseek-ai/DeepSeek-Prover-V2-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

formal_statement = """
import Mathlib
import Aesop

set_option maxHeartbeats 0

open BigOperators Real Nat Topology Rat

/-- What is the positive difference between $120\%$ of 30 and $130\%$ of 20? Show that it is 10.-/
theorem mathd_algebra_10 : abs ((120 : ℝ) / 100 * 30 - 130 / 100 * 20) = 10 := by
  sorry
""".strip()

prompt = """
Complete the following Lean 4 code:

```lean4
{}
```

Before producing the Lean 4 code to formally prove the given theorem, provide a detailed proof plan outlining the main proof steps and strategies.
The plan should highlight key ideas, intermediate lemmas, and proof structures that will guide the construction of the final formal proof.
""".strip()

chat = [
  {"role": "user", "content": prompt.format(formal_statement)},
]

# Load the model in bfloat16 and shard it across the available GPUs.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True)
# Tokenize the chat-formatted prompt and move it to the model's device.
inputs = tokenizer.apply_chat_template(chat, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)

# Generate a proof plan followed by the Lean 4 proof, and report wall-clock time.
start = time.time()
outputs = model.generate(inputs, max_new_tokens=8192)
print(tokenizer.batch_decode(outputs)[0])
print(f"Generation took {time.time() - start:.1f} seconds")
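
The completion interleaves the natural-language proof plan with a fenced lean4 block containing the formal proof. Assuming the final proof appears in the last such block, a small post-processing step can pull the Lean code out of the decoded text:

import re

# Decode the single sequence in the batch.
completion = tokenizer.batch_decode(outputs)[0]

# The prompt asks the model to wrap its proof in a ```lean4 fenced block;
# take the last such block, assumed here to hold the final formal proof.
matches = re.findall(r"```lean4\n(.*?)```", completion, re.DOTALL)
if matches:
    print(matches[-1])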

Model & Dataset Downloads

DeepSeek-Prover-V2 is available in two model sizes: DeepSeek-Prover-V2-7B and DeepSeek-Prover-V2-671B.

The ProverBench dataset is distributed alongside the models.
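
For local use, the model weights can be fetched with the huggingface_hub client. The repository id below mirrors the one used in the Quick Start example and should be confirmed on the Hub before downloading:

from huggingface_hub import snapshot_download

# Download the 7B checkpoint into the local Hugging Face cache and print its path.
# The repository id mirrors the Quick Start example; verify it on the Hub first.
local_path = snapshot_download("deepseek-ai/DeepSeek-Prover-V2-7B")
print(local_path)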