hub / github.com/NVIDIA/TensorRT-LLM / run_MTP

Function run_MTP

examples/llm-api/llm_speculative_decoding.py:18–32
(model: Optional[str] = None)

Source from the content-addressed store, hash-verified

def run_MTP(model: Optional[str] = None):
    spec_config = MTPDecodingConfig(num_nextn_predict_layers=1,
                                    use_relaxed_acceptance_for_thinking=True,
                                    relaxed_topk=10,
                                    relaxed_delta=0.01)

    llm = LLM(
        # You can change this to a local model path if you have the model downloaded
        model=model or "nvidia/DeepSeek-R1-FP4",
        speculative_config=spec_config,
    )

    for prompt in prompts:
        response = llm.generate(prompt, SamplingParams(max_tokens=10))
        print(response.outputs[0].text)
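
The `model=model or "nvidia/DeepSeek-R1-FP4"` argument relies on Python's `or` fallback, which selects the default for any falsy value, not only `None`. The sketch below isolates that behavior in plain Python without importing TensorRT-LLM. `resolve_model`, `DEFAULT_MODEL`, and the local path are hypothetical names introduced for illustration; they are not part of the example file.

```python
from typing import Optional

# Default checkpoint name copied from run_MTP.
DEFAULT_MODEL = "nvidia/DeepSeek-R1-FP4"


def resolve_model(model: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the `model or <default>` expression."""
    # `or` returns its right operand whenever the left one is falsy,
    # so both None and the empty string select the default.
    return model or DEFAULT_MODEL


if __name__ == "__main__":
    print(resolve_model())                            # no argument: default
    print(resolve_model(""))                          # empty string: default
    print(resolve_model("/models/local-checkpoint"))  # hypothetical local path
```

If an empty string should be rejected rather than silently replaced, an explicit `if model is None` check would be the stricter alternative.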

Callers 1

main · Function · 0.85

Calls 4

MTPDecodingConfig · Class · 0.90
LLM · Class · 0.90
SamplingParams · Class · 0.90
generate · Method · 0.45

Tested by

no test coverage detected