hub / github.com/NVIDIA/TensorRT-LLM / command

Method command

tensorrt_llm/evaluate/mmlu.py:307–333 · view source on GitHub ↗

(ctx, dataset_path: Optional[str], num_samples: int,
                num_fewshot: int, random_seed: int, apply_chat_template: bool,
                chat_template_kwargs: Optional[dict[str, Any]],
                system_prompt: Optional[str], max_input_length: int,
                max_output_length: int, check_accuracy: bool,
                accuracy_threshold: float)

Source from the content-addressed store, hash-verified

305	@click.pass_context
306	@staticmethod
307	def command(ctx, dataset_path: Optional[str], num_samples: int,
308	num_fewshot: int, random_seed: int, apply_chat_template: bool,
309	chat_template_kwargs: Optional[dict[str, Any]],
310	system_prompt: Optional[str], max_input_length: int,
311	max_output_length: int, check_accuracy: bool,
312	accuracy_threshold: float) -> None:
313	llm: Union[LLM, PyTorchLLM] = ctx.obj
314	sampling_params = SamplingParams(
315	max_tokens=max_output_length,
316	truncate_prompt_tokens=max_input_length)
317	evaluator = MMLU(dataset_path,
318	num_samples=num_samples,
319	num_fewshot=num_fewshot,
320	random_seed=random_seed,
321	apply_chat_template=apply_chat_template,
322	system_prompt=system_prompt,
323	chat_template_kwargs=chat_template_kwargs)
324	accuracy = evaluator.evaluate(llm, sampling_params)
325	llm.shutdown()
326
327	if check_accuracy:
328	logger.warning(
329	"The --check_accuracy flag is not expected to be used anymore. "
330	"It is being used by some legacy accuracy tests that call evaluation commands via subprocess. "
331	"New accuracy tests should use LLM API within the pytest process; please see `tests/integration/defs/accuracy/README.md`."
332	)
333	assert accuracy >= accuracy_threshold, f"Expected accuracy >= {accuracy_threshold}, but got {accuracy}."

Callers

nothing calls this directly

Calls 5

SamplingParamsClass · 0.85

MMLUClass · 0.70

evaluateMethod · 0.45

shutdownMethod · 0.45

warningMethod · 0.45

Tested by

no test coverage detected