hub / github.com/NVIDIA/TensorRT-LLM / from_model_config_cpp

Method from_model_config_cpp

tensorrt_llm/runtime/generation.py:659–691

Creates a partially initialized ModelConfig instance from a given ModelConfig C++ binding instance. Each of the two classes has fields that do not exist in the other, so the resulting Python-side ModelConfig will not have all of its fields initialized.
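To make "partially initialized" concrete, here is a minimal sketch using stand-in classes. `CppConfigStub`, `PyConfig`, `from_cpp`, `python_only_option`, and `untouched_fields` are all hypothetical names invented for illustration and are not part of TensorRT-LLM; the only point is that a field with no counterpart on the binding side keeps its default value.

```python
from dataclasses import dataclass, fields


class CppConfigStub:
    """Hypothetical stand-in for a C++ binding object (not the real binding)."""

    def __init__(self):
        self.vocab_size = 32000
        self.hidden_size = 4096
        self.use_packed_input = True  # name differs from the Python-side field


@dataclass
class PyConfig:
    """Hypothetical stand-in for the Python-side config."""
    vocab_size: int
    hidden_size: int
    remove_input_padding: bool = False
    # Assumed to have no counterpart on the binding side, so it keeps its default.
    python_only_option: int = 0

    @classmethod
    def from_cpp(cls, cpp):
        # Copy only the fields both sides share; everything else stays default.
        return cls(
            vocab_size=cpp.vocab_size,
            hidden_size=cpp.hidden_size,
            remove_input_padding=cpp.use_packed_input,
        )


def untouched_fields(config, copied):
    """Names of dataclass fields that were not populated from the binding."""
    return [f.name for f in fields(config) if f.name not in copied]


config = PyConfig.from_cpp(CppConfigStub())
left_default = untouched_fields(
    config, {"vocab_size", "hidden_size", "remove_input_padding"})
```

In this sketch `left_default` lists the fields a caller would still need to set by hand before relying on them.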

(cls, model_config_cpp)

Source

    @classmethod
    def from_model_config_cpp(cls, model_config_cpp) -> 'ModelConfig':
        """Create a partially initialized ModelConfig instance from a given ModelConfig CPP binding instance.

        Note that each of these classes have fields that don't exist in the other, so the created ModelConfigPython
        won't have all of its fields initialized.
        """
        return cls(
            max_batch_size=model_config_cpp.max_batch_size,
            max_beam_width=model_config_cpp.max_beam_width,
            vocab_size=model_config_cpp.vocab_size,
            num_layers=model_config_cpp.num_layers(),
            num_heads=model_config_cpp.num_heads,
            num_kv_heads=model_config_cpp.num_kv_heads(0),
            hidden_size=model_config_cpp.hidden_size,
            remove_input_padding=model_config_cpp.use_packed_input,
            kv_cache_type=model_config_cpp.kv_cache_type,
            cross_attention=model_config_cpp.use_cross_attention,
            head_size=model_config_cpp.head_size,
            max_prompt_embedding_table_size=model_config_cpp.
            max_prompt_embedding_table_size,
            quant_mode=QuantMode(model_config_cpp.quant_mode.value),
            gather_context_logits=model_config_cpp.compute_context_logits,
            gather_generation_logits=model_config_cpp.compute_generation_logits,
            gpt_attention_plugin=model_config_cpp.use_gpt_attention_plugin,
            dtype=binding_to_str_dtype(model_config_cpp.data_type),
            num_kv_heads_per_layer=model_config_cpp.num_kv_heads_per_layer,
            tokens_per_block=model_config_cpp.tokens_per_block,
            lora_plugin=model_config_cpp.use_lora_plugin,
            layer_types=[
                binding_layer_type_to_str(lt)
                for lt in model_config_cpp.layer_types
            ],
        )
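Several keyword arguments in the source listing take their value from a binding attribute with a different name. The sketch below collects those renames in one table and applies them to a stand-in object. `RENAMED_FIELDS`, `renamed_kwargs`, and `fake_binding` are hypothetical helpers written for this illustration; the attribute names themselves are copied from the listing, and the values are made up.

```python
from types import SimpleNamespace

# Python-side keyword -> attribute read from the binding, as in the source listing.
RENAMED_FIELDS = {
    "remove_input_padding": "use_packed_input",
    "cross_attention": "use_cross_attention",
    "gather_context_logits": "compute_context_logits",
    "gather_generation_logits": "compute_generation_logits",
    "gpt_attention_plugin": "use_gpt_attention_plugin",
    "lora_plugin": "use_lora_plugin",
}


def renamed_kwargs(binding):
    """Build the renamed keyword arguments from a binding-like object."""
    return {
        py_name: getattr(binding, cpp_name)
        for py_name, cpp_name in RENAMED_FIELDS.items()
    }


# Hypothetical stand-in with made-up values; real code would pass a binding instance.
fake_binding = SimpleNamespace(
    use_packed_input=True,
    use_cross_attention=False,
    compute_context_logits=False,
    compute_generation_logits=True,
    use_gpt_attention_plugin=True,
    use_lora_plugin=False,
)

kwargs = renamed_kwargs(fake_binding)
```

Not every value is a plain attribute read: in the listing, `num_layers()` and `num_kv_heads(0)` are method calls, and `quant_mode`, `dtype`, and `layer_types` pass through `QuantMode(...)`, `binding_to_str_dtype(...)`, and `binding_layer_type_to_str(...)` respectively, so a table like this covers only the simple renames.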

Callers 2

__init__ · Method · 0.80
from_dir · Method · 0.80

Calls 4

QuantMode · Class · 0.85
binding_to_str_dtype · Function · 0.85
num_layers · Method · 0.45

Tested by

no test coverage detected