Minju Seo, Jinheon Baek†, Seongyun Lee, and Sung Ju Hwang† († denotes equal advising)
International Conference on Learning Representations (ICLR), 2026
📄 Read the paper

PaperCoder is the multi-agent LLM system introduced in Paper2Code, designed to transform a paper into a code repository. It follows a three-stage pipeline: planning, analysis, and code generation, each handled by specialized agents. Our method outperforms strong baselines on both Paper2Code and PaperBench and produces faithful, high-quality implementations.
pip install openai
export OPENAI_API_KEY="<OPENAI_API_KEY>"
cd scripts
bash run.sh
Open-source model (served with vLLM): deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct.
pip install vllm
cd scripts
bash run_llm.sh
outputs
├── Transformer
│ ├── analyzing_artifacts
│ ├── coding_artifacts
│ └── planning_artifacts
└── Transformer_repo # Final output repository
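A quick way to confirm that a run produced the layout above is to check for the four directories. The sketch below assumes the `outputs` root and the `Transformer` paper name shown in the tree; the helper `missing_output_dirs` is hypothetical and not part of PaperCoder.

```python
from pathlib import Path

# Directory names taken from the tree above; the helper itself is illustrative.
ARTIFACT_DIRS = ("planning_artifacts", "analyzing_artifacts", "coding_artifacts")


def missing_output_dirs(outputs_root, paper_name):
    """Return the expected output directories that do not exist yet."""
    root = Path(outputs_root)
    expected = [root / paper_name / name for name in ARTIFACT_DIRS]
    expected.append(root / f"{paper_name}_repo")  # final output repository
    return [str(p) for p in expected if not p.is_dir()]


if __name__ == "__main__":
    missing = missing_output_dirs("outputs", "Transformer")
    print("all present" if not missing else f"missing: {missing}")
```

An empty result means all three artifact folders and the final repository are in place.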
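To use the o3-mini version, make sure you have the latest openai package installed.
Install only what you need: openai for the OpenAI API, vllm for open-source models.
pip install openai
pip install vllm
Alternatively, install all dependencies with pip:
pip install -r requirements.txt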
The following process describes how to convert a paper PDF into JSON format.
If you have access to the LaTeX source and plan to use it with PaperCoder, you may skip this step and proceed to 🚀 Running PaperCoder.
Note: In our experiments, we converted all paper PDFs to JSON format.
Use the s2orc-doc2json repository to convert your PDF file into a structured JSON format.
git clone https://github.com/allenai/s2orc-doc2json.git
cd ./s2orc-doc2json/grobid-0.7.3
./gradlew run
mkdir -p ./s2orc-doc2json/output_dir/paper_coder
python ./s2orc-doc2json/doc2json/grobid2json/process_pdf.py \
-i ${PDF_PATH} \
-t ./s2orc-doc2json/temp_dir/ \
-o ./s2orc-doc2json/output_dir/paper_coder
# Using the PDF-based JSON format of the paper
export OPENAI_API_KEY="<OPENAI_API_KEY>"
cd scripts
bash run.sh
# Using the LaTeX source of the paper
export OPENAI_API_KEY="<OPENAI_API_KEY>"
cd scripts
bash run_latex.sh
Open-source model (served with vLLM): deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct.
# Using the PDF-based JSON format of the paper
cd scripts
bash run_llm.sh
# Using the LaTeX source of the paper
cd scripts
bash run_latex_llm.sh
Hugging Face dataset: paper2code
You can find the description of the Paper2Code benchmark dataset in data/paper2code.
We evaluate repository quality using a model-based approach, supporting both reference-based and reference-free settings.
The model critiques key implementation components, assigns severity levels, and generates a 1–5 correctness score averaged over 8 samples using o3-mini-high.
For more details, please refer to Section 4.3.1 (Paper2Code Benchmark) of the paper.
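The averaging step can be sketched as follows. This is a minimal illustration, assuming each sampled critique yields a numeric score from 1 to 5, or `None` when the response could not be parsed. The function name, the example scores, and the handling of invalid samples are assumptions, not the behavior of `eval.py`.

```python
def average_score(samples):
    """Average the valid 1-5 scores; return (mean, n_valid, n_total)."""
    valid = [s for s in samples if s is not None and 1 <= s <= 5]
    if not valid:
        raise ValueError("no valid scores to average")
    return sum(valid) / len(valid), len(valid), len(samples)


# Eight hypothetical sampled scores (made up for illustration).
mean, n_valid, n_total = average_score([5, 4, 5, 4, 5, 4, 5, 4])
print(f"Score: {mean:.4f}  Valid: {n_valid}/{n_total}")
```

Reporting the valid count next to the mean makes it visible when some samples were dropped from the average.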
pip install tiktoken
export OPENAI_API_KEY="<OPENAI_API_KEY>"
target_repo_dir is the generated repository.
cd codes/
python eval.py \
--paper_name Transformer \
--pdf_json_path ../examples/Transformer_cleaned.json \
--data_dir ../data \
--output_dir ../outputs/Transformer \
--target_repo_dir ../outputs/Transformer_repo \
--eval_result_dir ../results \
--eval_type ref_free \
--generated_n 8 \
--papercoder
target_repo_dir is the generated repository.
gold_repo_dir should point to the official repository (e.g., author-released code).
cd codes/
python eval.py \
--paper_name Transformer \
--pdf_json_path ../examples/Transformer_cleaned.json \
--data_dir ../data \
--output_dir ../outputs/Transformer \
--target_repo_dir ../outputs/Transformer_repo \
--gold_repo_dir ../examples/Transformer_gold_repo \
--eval_result_dir ../results \
--eval_type ref_based \
--generated_n 8 \
--papercoder
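The two invocations differ only in `--eval_type` and the extra `--gold_repo_dir` flag. The sketch below assembles either command from the flags shown above. The helper `build_eval_command` is hypothetical, and the path patterns derived from the paper name are assumptions generalized from the Transformer example.

```python
import shlex


def build_eval_command(paper_name, eval_type, gold_repo_dir=None, generated_n=8):
    """Assemble the eval.py invocation using only the flags shown above."""
    if eval_type not in ("ref_free", "ref_based"):
        raise ValueError(f"unknown eval_type: {eval_type}")
    if eval_type == "ref_based" and gold_repo_dir is None:
        raise ValueError("ref_based evaluation needs gold_repo_dir")
    cmd = [
        "python", "eval.py",
        "--paper_name", paper_name,
        # Path patterns below are assumed from the Transformer example.
        "--pdf_json_path", f"../examples/{paper_name}_cleaned.json",
        "--data_dir", "../data",
        "--output_dir", f"../outputs/{paper_name}",
        "--target_repo_dir", f"../outputs/{paper_name}_repo",
    ]
    if eval_type == "ref_based":
        cmd += ["--gold_repo_dir", gold_repo_dir]
    cmd += [
        "--eval_result_dir", "../results",
        "--eval_type", eval_type,
        "--generated_n", str(generated_n),
        "--papercoder",
    ]
    return cmd


print(shlex.join(build_eval_command("Transformer", "ref_free")))
```

The returned list can be passed to `subprocess.run(cmd, cwd="codes")` to match the `cd codes/` step.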
========================================
🌟 Evaluation Summary 🌟
📄 Paper name: Transformer
🧪 Evaluation type: ref_based
📁 Target repo directory: ../outputs/Transformer_repo
📊 Evaluation result:
📈 Score: 4.5000
✅ Valid: 8/8
========================================
🌟 Usage Summary 🌟
[Evaluation] Transformer - ref_based
🛠️ Model: o3-mini
📥 Input tokens: 44318 (Cost: $0.04874980)
📦 Cached input tokens: 0 (Cost: $0.00000000)
📤 Output tokens: 26310 (Cost: $0.11576400)
💵 Current total cost: $0.16451380
🪙 Accumulated total cost so far: $0.16451380
============================================
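The cost lines in the usage summary are consistent with a simple per-token calculation. Dividing each logged cost by its token count gives $1.10 per million input tokens and $4.40 per million output tokens. These rates are inferred from the log above rather than taken from a pricing table, and the function name is hypothetical. Cached input tokens are zero in this run and are left out.

```python
from decimal import Decimal

# Per-million-token rates implied by the log above (inferred; may change).
INPUT_RATE = Decimal("1.10")
OUTPUT_RATE = Decimal("4.40")
MILLION = Decimal(1_000_000)


def usage_cost(input_tokens, output_tokens):
    """Return (input_cost, output_cost, total_cost) in dollars."""
    input_cost = Decimal(input_tokens) * INPUT_RATE / MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_RATE / MILLION
    return input_cost, output_cost, input_cost + output_cost


in_cost, out_cost, total = usage_cost(44318, 26310)
print(f"{in_cost:.8f} {out_cost:.8f} {total:.8f}")
# -> 0.04874980 0.11576400 0.16451380
```

`Decimal` is used so the eight-decimal figures match the log exactly instead of picking up floating-point rounding.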