# DeepSeek-Prover-V2
An open-source large language model designed for formal theorem proving in Lean 4, with initialization data collected through a recursive theorem proving pipeline.
## Introduction
DeepSeek-Prover-V2 is an open-source large language model specifically designed for formal theorem proving in Lean 4. The model's development began with a recursive theorem proving pipeline powered by DeepSeek-V3, which collected initialization data through a sophisticated process of problem decomposition and proof synthesis.
## Model Summary

### Synthesize Cold-Start Reasoning Data
The development process involved the following steps (a schematic sketch follows the list):
- Prompting DeepSeek-V3 to decompose complex theorems into high-level proof sketches
- Formalizing the intermediate proof steps in Lean 4, producing a sequence of subgoals
- Running proof search on each subgoal with a smaller 7B model, which keeps the computational cost manageable
- Pairing each complete formal proof with DeepSeek-V3's corresponding chain-of-thought reasoning to form a cold-start training example
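The sketch below shows how these steps could be wired together in code. It is a minimal illustration under stated assumptions, not the released pipeline: `sketch_proof`, `search_proof`, and `verify_in_lean` are hypothetical callables standing in for DeepSeek-V3, the 7B prover, and a Lean 4 checker.

```python
from typing import Callable, Optional


def synthesize_cold_start_example(
    theorem_statement: str,
    sketch_proof: Callable[[str], tuple[str, str, list[str]]],  # hypothetical DeepSeek-V3 wrapper:
                                                                # returns (chain_of_thought, lean_skeleton, subgoals)
    search_proof: Callable[[str], Optional[str]],               # hypothetical 7B prover wrapper:
                                                                # returns a Lean proof for one subgoal, or None
    verify_in_lean: Callable[[str], bool],                      # hypothetical Lean 4 compiler check
) -> Optional[dict]:
    """One pass of the cold-start synthesis loop described in the list above (illustrative only)."""
    # 1. Decompose the theorem into an informal plan plus a Lean 4 skeleton
    #    whose intermediate steps are left as `sorry`-ed subgoals.
    chain_of_thought, lean_skeleton, subgoals = sketch_proof(theorem_statement)

    # 2. Close each subgoal independently with the smaller prover model.
    subgoal_proofs = []
    for subgoal in subgoals:
        proof = search_proof(subgoal)
        if proof is None:
            return None  # discard examples whose subgoals cannot all be solved
        subgoal_proofs.append(proof)

    # 3. Splice the subgoal proofs back into the skeleton (one `sorry` per subgoal)
    #    and keep the example only if the Lean compiler accepts the result.
    full_proof = lean_skeleton
    for proof in subgoal_proofs:
        full_proof = full_proof.replace("sorry", proof, 1)
    if not verify_in_lean(full_proof):
        return None

    # 4. Pair the verified formal proof with the informal chain-of-thought.
    return {
        "statement": theorem_statement,
        "chain_of_thought": chain_of_thought,
        "formal_proof": full_proof,
    }
```

Splitting the work this way is what makes the pipeline recursive: difficult theorems are only ever attacked through subgoals small enough for the lighter prover to close.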
### Reinforcement Learning
The model then underwent a reinforcement learning stage to strengthen its ability to bridge informal reasoning with formal proof construction (a sketch of a verification-based reward for such a stage follows the list below). This stage resulted in:
- An 88.9% pass ratio on the MiniF2F-test
- 49 of 658 PutnamBench problems solved
- Improved ability to combine informal and formal mathematical reasoning
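As a rough illustration of what a verification-based reward could look like in such a stage, the snippet below scores a candidate proof by whether Lean 4 accepts it. The `lake env lean` invocation, the project layout, and the binary reward are assumptions made for this sketch; they are not the released training setup.

```python
import subprocess


def proof_reward(lean_file: str, timeout_s: int = 300) -> float:
    """Hypothetical binary reward: 1.0 if the candidate proof verifies in Lean 4, else 0.0.

    Assumes a Lean 4 / Mathlib project in which `lake env lean <file>` type-checks a file;
    both the command and the reward shape are illustrative assumptions.
    """
    try:
        result = subprocess.run(
            ["lake", "env", "lean", lean_file],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return 0.0  # verification that times out counts as a failure

    output = result.stdout + result.stderr
    # A proof only earns reward if Lean reports no errors and no `sorry` placeholders remain.
    verified = result.returncode == 0 and "sorry" not in output
    return 1.0 if verified else 0.0
```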
### ProverBench: Formalization of AIME and Textbook Problems
The release also includes ProverBench, a benchmark dataset of 325 problems drawn from recent AIME competitions and textbook material:
| Area | Count |
| --- | --- |
| AIME 24&25 | 15 |
| Number Theory | 40 |
| Elementary Algebra | 30 |
| Linear Algebra | 50 |
| Abstract Algebra | 40 |
| Calculus | 90 |
| Real Analysis | 30 |
| Complex Analysis | 10 |
| Functional Analysis | 10 |
| Probability | 10 |
| **Total** | **325** |
## Quick Start
You can use Hugging Face's Transformers library for model inference. Here is a basic example:
````python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import time

torch.manual_seed(30)

model_id = "DeepSeek-Prover-V2-7B"  # or DeepSeek-Prover-V2-671B
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The Lean 4 theorem to prove; the proof body is left as `sorry` for the model to complete.
formal_statement = """
import Mathlib
import Aesop

set_option maxHeartbeats 0

open BigOperators Real Nat Topology Rat

/-- What is the positive difference between $120\%$ of 30 and $130\%$ of 20? Show that it is 10.-/
theorem mathd_algebra_10 : abs ((120 : ℝ) / 100 * 30 - 130 / 100 * 20) = 10 := by
  sorry
""".strip()

# Prompt template: the model first writes a detailed proof plan, then the Lean 4 proof.
prompt = """
Complete the following Lean 4 code:

```lean4
{}
```

Before producing the Lean 4 code to formally prove the given theorem, provide a detailed proof plan outlining the main proof steps and strategies.
The plan should highlight key ideas, intermediate lemmas, and proof structures that will guide the construction of the final formal proof.
""".strip()

chat = [
    {"role": "user", "content": prompt.format(formal_statement)},
]

# Load the model across available devices in bfloat16.
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Apply the chat template and move the inputs to the model's device.
inputs = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate the proof plan and Lean 4 proof, and report the wall-clock time taken.
start = time.time()
outputs = model.generate(inputs, max_new_tokens=8192)
print(tokenizer.batch_decode(outputs))
print(time.time() - start)
````
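The generation interleaves the natural-language proof plan with the completed Lean 4 code. A small helper such as the one below (an illustration, not part of the official example) can pull the final Lean 4 code block out of the decoded output for separate verification:

```python
import re


def extract_lean_code(generation: str) -> str | None:
    """Return the last Lean 4 code block in the model output, if any.

    Illustrative helper; it is not part of the official Quick Start.
    """
    blocks = re.findall(r"```lean4\s*(.*?)```", generation, flags=re.DOTALL)
    return blocks[-1].strip() if blocks else None


# Example usage with the decoded generation from the snippet above:
# lean_code = extract_lean_code(tokenizer.batch_decode(outputs)[0])
```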
## Model & Dataset Downloads
DeepSeek-Prover-V2 is available in two model sizes: 7B and 671B parameters.

The ProverBench dataset is also available for download.
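If the benchmark is published as a Hugging Face dataset, it can be loaded with the `datasets` library. The repository id below is an assumption made for illustration, not a confirmed path:

```python
from datasets import load_dataset

# The repository id here is an assumption for illustration; substitute the
# actual ProverBench dataset path from the official release.
proverbench = load_dataset("deepseek-ai/DeepSeek-ProverBench")

# Inspect the splits and fields available in the benchmark.
print(proverbench)
```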