hub / github.com/NVIDIA/TensorRT-LLM / from_model_config_cpp

Method from_model_config_cpp

tensorrt_llm/runtime/generation.py:659–691

Create a partially initialized ModelConfig instance from a given ModelConfig C++ binding instance. Each of the two classes has fields that do not exist in the other, so the resulting Python ModelConfig will not have all of its fields initialized.
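The partial-initialization pattern described above can be sketched with two small stand-in classes. `CppSideConfig`, `PySideConfig`, their fields, and the `unset_fields` helper are all hypothetical and are not the real TensorRT-LLM types; the sketch only illustrates how a classmethod can copy the shared fields and leave the rest at their defaults.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CppSideConfig:
    """Hypothetical stand-in for the C++ binding (not the real class)."""
    vocab_size: int
    hidden_size: int
    max_seq_len: int  # exists only on this side, so it is never copied


@dataclass
class PySideConfig:
    """Hypothetical stand-in for the Python-side config."""
    vocab_size: int
    hidden_size: int
    has_position_embedding: Optional[bool] = None  # exists only on this side

    @classmethod
    def from_cpp(cls, cpp: CppSideConfig) -> "PySideConfig":
        # Copy only the fields both sides share; the rest keep their defaults.
        return cls(vocab_size=cpp.vocab_size, hidden_size=cpp.hidden_size)


def unset_fields(cfg) -> List[str]:
    # Names of fields still holding the None sentinel after conversion.
    return [f.name for f in fields(cfg) if getattr(cfg, f.name) is None]


py_cfg = PySideConfig.from_cpp(
    CppSideConfig(vocab_size=32000, hidden_size=4096, max_seq_len=2048))
missing = unset_fields(py_cfg)
```

A caller that needs a field listed in `missing` would have to set it explicitly after the conversion.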

(cls, model_config_cpp)

Source from the content-addressed store, hash-verified

@classmethod
def from_model_config_cpp(cls, model_config_cpp) -> 'ModelConfig':
    """Create a partially initialized ModelConfig instance from a given ModelConfig CPP binding instance.

    Note that each of these classes have fields that don't exist in the other, so the created ModelConfigPython
    won't have all of its fields initialized.
    """
    return cls(
        max_batch_size=model_config_cpp.max_batch_size,
        max_beam_width=model_config_cpp.max_beam_width,
        vocab_size=model_config_cpp.vocab_size,
        num_layers=model_config_cpp.num_layers(),
        num_heads=model_config_cpp.num_heads,
        num_kv_heads=model_config_cpp.num_kv_heads(0),
        hidden_size=model_config_cpp.hidden_size,
        remove_input_padding=model_config_cpp.use_packed_input,
        kv_cache_type=model_config_cpp.kv_cache_type,
        cross_attention=model_config_cpp.use_cross_attention,
        head_size=model_config_cpp.head_size,
        max_prompt_embedding_table_size=model_config_cpp.max_prompt_embedding_table_size,
        quant_mode=QuantMode(model_config_cpp.quant_mode.value),
        gather_context_logits=model_config_cpp.compute_context_logits,
        gather_generation_logits=model_config_cpp.compute_generation_logits,
        gpt_attention_plugin=model_config_cpp.use_gpt_attention_plugin,
        dtype=binding_to_str_dtype(model_config_cpp.data_type),
        num_kv_heads_per_layer=model_config_cpp.num_kv_heads_per_layer,
        tokens_per_block=model_config_cpp.tokens_per_block,
        lora_plugin=model_config_cpp.use_lora_plugin,
        layer_types=[
            binding_layer_type_to_str(lt)
            for lt in model_config_cpp.layer_types
        ],
    )
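Several keyword arguments in the source are filled from binding members with different names (for example `remove_input_padding` from `use_packed_input`), and `num_layers` and `num_kv_heads` are invoked as methods rather than read as attributes. The sketch below reproduces that shape with a hypothetical `FakeBinding` class and a plain dict in place of `ModelConfig`. The values are made up, and reading the `0` argument as a layer index is an assumption, not something the source states.

```python
class FakeBinding:
    """Hypothetical stand-in for the C++ binding object (not the real API)."""

    use_packed_input = True
    compute_context_logits = False
    use_gpt_attention_plugin = True
    num_kv_heads_per_layer = [8, 8, 4]

    def num_layers(self) -> int:
        # Called with parentheses in the source, so modeled as a method here.
        return len(self.num_kv_heads_per_layer)

    def num_kv_heads(self, layer_idx: int) -> int:
        # Assumption: the argument selects a layer; the source passes 0.
        return self.num_kv_heads_per_layer[layer_idx]


# Python keyword -> binding attribute, for three of the renamed fields.
RENAMED = {
    "remove_input_padding": "use_packed_input",
    "gather_context_logits": "compute_context_logits",
    "gpt_attention_plugin": "use_gpt_attention_plugin",
}


def to_kwargs(binding) -> dict:
    # Plain attributes are read with getattr; the two methods are called.
    kwargs = {py: getattr(binding, cpp) for py, cpp in RENAMED.items()}
    kwargs["num_layers"] = binding.num_layers()
    kwargs["num_kv_heads"] = binding.num_kv_heads(0)
    return kwargs


kwargs = to_kwargs(FakeBinding())
```

Under this reading, the scalar `num_kv_heads` reflects only the first layer, while `num_kv_heads_per_layer` carries the full list.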

Callers 2

__init__ · Method · 0.80

from_dir · Method · 0.80

Calls 4

QuantMode · Class · 0.85

binding_to_str_dtype · Function · 0.85

binding_layer_type_to_str · Function · 0.85

num_layers · Method · 0.45
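Of the calls listed above, `QuantMode(model_config_cpp.quant_mode.value)` rebuilds the Python-side enum from the binding's underlying integer. A minimal sketch of that round trip follows, assuming both sides behave like flag-style enums with matching bit layouts. `BindingQuantMode`, `PyQuantMode`, and their members are hypothetical stand-ins, not the real definitions.

```python
from enum import IntFlag


class BindingQuantMode(IntFlag):
    """Hypothetical stand-in for the binding-side quantization flags."""
    INT4_WEIGHTS = 1
    INT8_WEIGHTS = 2
    FP8_QDQ = 4


class PyQuantMode(IntFlag):
    """Hypothetical stand-in for the Python-side flags, same bit layout."""
    INT4_WEIGHTS = 1
    INT8_WEIGHTS = 2
    FP8_QDQ = 4


def to_py_quant_mode(binding_mode: BindingQuantMode) -> PyQuantMode:
    # Pass the raw integer across the type boundary, as the source does.
    return PyQuantMode(binding_mode.value)


binding_mode = BindingQuantMode.INT8_WEIGHTS | BindingQuantMode.FP8_QDQ
py_mode = to_py_quant_mode(binding_mode)
```

The conversion is only meaningful if the two enums agree on what each bit means, which is why the sketch gives both classes identical values.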

Tested by

no test coverage detected