github.com/deepseek-ai/DeepSpec @main sqlite

502 symbols · 1,623 edges · 67 files · 14 documented (3%)

README

DeepSpec

DeepSpec is a full-stack codebase for training and evaluating draft models for speculative decoding. It contains data preparation utilities, draft model implementations, training code, and evaluation scripts.

Environment

Install the Python dependencies:

python -m pip install -r requirements.txt

Data preparation additionally requires an inference engine to serve the target model when regenerating answers; see scripts/data/README.md for details.

Workflow

Run the stages in order — each stage's output feeds the next:

1. Data Preparation: download prompts, regenerate target answers, and build the target cache.
2. Training: train a draft model against the cached target outputs.
3. Evaluation: measure speculative-decoding acceptance on benchmark tasks.

Data Preparation

See scripts/data/README.md for the step-by-step data pipeline:

download and split training data,
regenerate answers,
prepare the target cache (storage warning: this can be very large — roughly 38 TB for the default Qwen/Qwen3-4B setting).
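
Given the storage warning above, it may be worth checking free space before building the target cache. The following is a minimal sketch using only the standard library. The function name `has_room_for_cache`, its arguments, and the decimal-terabyte convention are illustrative assumptions and are not part of the repository's scripts; the 38 TB default is simply the figure quoted above.

```python
import shutil
from pathlib import Path

TB = 10**12  # decimal terabyte; an assumption, the README does not say TB vs TiB


def has_room_for_cache(cache_dir: str, required_tb: float = 38.0) -> bool:
    """Return True if the filesystem holding cache_dir has required_tb free."""
    path = Path(cache_dir).expanduser()
    # Walk up to the nearest existing ancestor so the check also works
    # before the cache directory has been created.
    while not path.exists():
        path = path.parent
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_tb * TB
```

Calling `has_room_for_cache("/data/target_cache")` (a hypothetical path) before the cache step gives an early warning instead of a failure partway through the run.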

Training

bash scripts/train/train.sh

train.sh launches train.py, which spawns one worker per visible GPU. Select the algorithm and target model by pointing config_path at one of the configs under config/ (e.g. config/dspark/dspark_qwen3_4b.py); see the script header for the full list of configs, how to override config_path / target_cache_dir, and how to use --opts to override individual config fields. Checkpoints are written to ~/checkpoints/<project_name>/<exp_name>/step_*.
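
To pick up the most recent checkpoint from the directory layout described above, a small helper along these lines may be useful. This is a sketch under one assumption that the README does not confirm: that the `step_*` suffix is an integer step count. Entries with a non-numeric suffix are skipped, and `latest_step_dir` is a hypothetical name, not a function from the repo.

```python
import re
from pathlib import Path
from typing import Optional


def latest_step_dir(exp_dir: str) -> Optional[Path]:
    """Return the step_<N> entry with the largest integer N, or None."""
    root = Path(exp_dir).expanduser()
    if not root.is_dir():
        return None
    best, best_step = None, -1
    for entry in root.glob("step_*"):
        # Assumption: numeric suffix. Anything else is ignored.
        match = re.fullmatch(r"step_(\d+)", entry.name)
        if match and int(match.group(1)) > best_step:
            best, best_step = entry, int(match.group(1))
    return best
```

Comparing the parsed integers rather than the names avoids the lexicographic trap where `step_900` would sort after `step_2000`.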

Hardware: the default configs and scripts assume a single node with 8 GPUs. To train on fewer GPUs, restrict CUDA_VISIBLE_DEVICES to the devices you want to use; train.py spawns one worker per visible GPU.
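
One way to limit the visible GPUs when launching from Python is to build the environment explicitly. This is a sketch only: `env_with_gpus` is a hypothetical helper, and it assumes train.sh does not set CUDA_VISIBLE_DEVICES itself. If the script does set it, edit the value there instead.

```python
import os
from typing import Dict, Mapping, Optional, Sequence


def env_with_gpus(
    gpu_ids: Sequence[int],
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy an environment and restrict CUDA_VISIBLE_DEVICES to gpu_ids."""
    if not gpu_ids:
        raise ValueError("gpu_ids must name at least one device")
    env = dict(os.environ if base_env is None else base_env)
    # CUDA reads this variable as a comma-separated list of device indices.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


# Example launch on four GPUs (not executed here):
#   import subprocess
#   subprocess.run(["bash", "scripts/train/train.sh"],
#                  env=env_with_gpus([0, 1, 2, 3]), check=True)
```

The default configs target 8 GPUs, so batch-size or learning-rate fields may also need adjusting when fewer devices are used; the README does not say which fields, so check the config before relying on the defaults.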

Evaluation

bash scripts/eval/eval.sh

eval.sh runs eval.py against a trained draft checkpoint over the speculative-decoding benchmarks in eval_datasets/ (gsm8k, math500, aime25, humaneval, mbpp, livecodebench, mt-bench, alpaca, arena-hard-v2). Set:

target_name_or_path — the target model the draft was trained against (e.g. Qwen/Qwen3-4B),
draft_name_or_path — the draft checkpoint, e.g. ~/checkpoints/deepspec/dspark_block7_qwen3_4b/step_latest, or one of the Hugging Face repo IDs listed in Released Checkpoints.
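
For readers new to the metric, the toy sketch below illustrates what "acceptance" means under greedy verification: in each step the target model keeps the longest prefix of the drafted block that matches its own greedy tokens, then contributes one token of its own. This is a generic illustration and not the definition used by eval.py; the function names and the sample token IDs are made up for the example.

```python
from typing import Iterable, Sequence, Tuple


def accepted_prefix_len(draft: Sequence[int], target: Sequence[int]) -> int:
    """Length of the longest prefix where draft and target tokens agree."""
    n = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        n += 1
    return n


def mean_tokens_per_step(
    steps: Iterable[Tuple[Sequence[int], Sequence[int]]],
) -> float:
    """Average tokens produced per verification step (accepted prefix + 1)."""
    # The +1 is the token the target model emits itself after the prefix.
    totals = [accepted_prefix_len(d, t) + 1 for d, t in steps]
    if not totals:
        raise ValueError("need at least one step")
    return sum(totals) / len(totals)


# Three toy steps: 2, 3 and 0 drafted tokens accepted.
steps = [([5, 7, 9], [5, 7, 2]), ([1, 2, 3], [1, 2, 3]), ([4, 4], [8, 4])]
print(mean_tokens_per_step(steps))  # (3 + 4 + 1) / 3
```

Note that a mismatch at the first position discards the whole drafted block, even if later tokens happen to agree, which is why the third step accepts nothing.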

Released Checkpoints

The checkpoints below are the ones used for Table 1 in the paper. Each checkpoint was trained on open-perfectblend data generated by its corresponding target model in non-thinking mode, and is the direct output of the corresponding training configuration under config/.

Algorithm	`Qwen/Qwen3-4B`	`Qwen/Qwen3-8B`	`Qwen/Qwen3-14B`	`google/gemma-4-12B-it`
Eagle3	deepseek-ai/eagle3_qwen3_4b_ttt7	deepseek-ai/eagle3_qwen3_8b_ttt7	deepseek-ai/eagle3_qwen3_14b_ttt7	deepseek-ai/eagle3_gemma4_12b_ttt7
DFlash	deepseek-ai/dflash_qwen3_4b_block7	deepseek-ai/dflash_qwen3_8b_block7	deepseek-ai/dflash_qwen3_14b_block7	deepseek-ai/dflash_gemma4_12b_block7
DSpark	deepseek-ai/dspark_qwen3_4b_block7	deepseek-ai/dspark_qwen3_8b_block7	deepseek-ai/dspark_qwen3_14b_block7	deepseek-ai/dspark_gemma4_12b_block7
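
The repo IDs in the table follow a regular naming pattern, so a lookup can be sketched as below. The helper `released_checkpoint` and the two mapping tables are illustrative and derived only from the twelve entries above; they are not an API provided by DeepSpec.

```python
# Suffix per algorithm and tag per target model, read off the table above.
SUFFIX = {"eagle3": "ttt7", "dflash": "block7", "dspark": "block7"}
TARGET_TAG = {
    "Qwen/Qwen3-4B": "qwen3_4b",
    "Qwen/Qwen3-8B": "qwen3_8b",
    "Qwen/Qwen3-14B": "qwen3_14b",
    "google/gemma-4-12B-it": "gemma4_12b",
}


def released_checkpoint(algorithm: str, target: str) -> str:
    """Return the Hugging Face repo ID listed for (algorithm, target)."""
    algo = algorithm.lower()
    if algo not in SUFFIX:
        raise KeyError(f"unknown algorithm: {algorithm}")
    if target not in TARGET_TAG:
        raise KeyError(f"no released checkpoint for target: {target}")
    return f"deepseek-ai/{algo}_{TARGET_TAG[target]}_{SUFFIX[algo]}"


print(released_checkpoint("DSpark", "Qwen/Qwen3-4B"))
# deepseek-ai/dspark_qwen3_4b_block7
```

The returned string is the kind of value draft_name_or_path accepts according to the Evaluation section; pairing it with a different target model than the one in its column is outside what the table covers.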

Important: If you cite these results in a new paper, align your setup with the training settings in this repository; otherwise, the comparison is not meaningful. For domain-specific use, fine-tune the draft model again for better results, especially if the target model is expected to run in thinking mode.

Supported Algorithms

Currently, DeepSpec includes three draft models: DSpark, DFlash and Eagle3.

License

DeepSpec is released under the MIT License. It includes code adapted from third-party projects under their own licenses; see NOTICE for the full attribution.

Acknowledgements

DeepSpec builds on the ideas and code of several excellent open-source projects:

SpecForge (Apache-2.0) — the overall training framework and Eagle3 implementation; portions of the Eagle3 modeling, loss, optimizer, attention, and evaluation code are adapted from it. Adapted files carry an in-file attribution comment, and the full notice is recorded in NOTICE.
DFlash (MIT) — the DFlash draft-model design and training recipe.
Qwen3 and Gemma — the target model families supported in this repo.

We thank the authors and maintainers of these projects. Contributions of new algorithms are welcome.

Core symbols (most depended-on inside this repo)

get

called by 29

deepspec/data/parser.py

add_metric

called by 24

deepspec/utils/metrics.py

is_global_main_process

called by 14

deepspec/utils/distributed.py

print_on_local_main

called by 14

deepspec/utils/distributed.py

get_prev_embeddings

called by 10

deepspec/modeling/dspark/markov_head.py

sample_tokens

called by 9

deepspec/utils/sampling.py

called by 9

deepspec/data/jsonl_dataset.py

print_on_global_main

called by 8

deepspec/utils/distributed.py

Shape

Function 223

Method 221

Class 58

Languages

Python 100%

Modules by API surface

deepspec/data/target_cache_dataset.py · 59 symbols

deepspec/eval/dspark/confidence_head.py · 25 symbols

deepspec/eval/base_evaluator.py · 25 symbols

eval_datasets/convert_eval_datasets_to_jsonl.py · 21 symbols

deepspec/trainer/base_trainer.py · 19 symbols

deepspec/modeling/dspark/markov_head.py · 19 symbols

deepspec/modeling/eagle3/gemma4/modeling.py · 18 symbols

deepspec/modeling/eagle3/qwen3/modeling.py · 17 symbols

deepspec/modeling/dspark/qwen3/modeling.py · 17 symbols

deepspec/modeling/dspark/gemma4/modeling.py · 17 symbols

deepspec/utils/optim.py · 16 symbols

scripts/data/generate_train_data.py · 14 symbols

Dependencies (from manifests, versioned)

PyYAML 6.0.3 · 1×

datasets 4.8.5 · 1×

matplotlib 3.10.9 · 1×

numpy 2.4.4 · 1×

openai 2.6.1 · 1×

prettytable 3.17.0 · 1×

safetensors 0.7.0 · 1×

sentencepiece 0.2.1 · 1×

tensorboard 2.20.0 · 1×

torch 2.9.1 · 1×

tqdm 4.67.3 · 1×

transformers 5.10.2 · 1×
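
To compare an existing environment against the versions listed above, a check along these lines may help. It is a sketch using the standard-library `importlib.metadata`; the helper name `version_mismatches` is hypothetical, and whether requirements.txt pins these versions exactly or only as lower bounds is not stated here, so a mismatch is a hint and not necessarily an error.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Mapping, Optional, Tuple

# Versions copied from the dependency list above.
LISTED = {
    "PyYAML": "6.0.3", "datasets": "4.8.5", "matplotlib": "3.10.9",
    "numpy": "2.4.4", "openai": "2.6.1", "prettytable": "3.17.0",
    "safetensors": "0.7.0", "sentencepiece": "0.2.1",
    "tensorboard": "2.20.0", "torch": "2.9.1", "tqdm": "4.67.3",
    "transformers": "5.10.2",
}


def version_mismatches(
    listed: Mapping[str, str],
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map package name to (listed, installed) where the two differ."""
    mismatches = {}
    for name, wanted in listed.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches(LISTED).items():
    print(f"{name}: listed {wanted}, installed {installed}")
```

Some builds report a local version suffix (for example a CUDA tag on torch), which this exact string comparison would flag as a mismatch.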

For agents

$ claude mcp add DeepSpec \
  -- python -m otcore.mcp_server <graph>
