hub / github.com/NVIDIA/TensorRT-LLM / forward

Method forward

tensorrt_llm/quantization/layers.py:933–942
(self, x)

Source from the content-addressed store, hash-verified

    def forward(self, x):
        result = embedding(x,
                           self.weight.value,
                           tp_size=self.tp_size,
                           tp_group=self.tp_group,
                           sharding_dim=self.sharding_dim,
                           tp_rank=self.tp_rank,
                           per_token_scale=self.per_token_scale.value)

        return result
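
The listing only forwards its arguments to embedding, so the arithmetic is not visible here. As a rough illustration, the sketch below shows what a lookup with a per-token scale could compute on a single rank. It assumes that per_token_scale holds one dequantization scale per vocabulary row and that the looked-up int8 row is multiplied by that scale; neither point is confirmed by this page. The function name dequantized_embedding_lookup is hypothetical, uses plain Python lists rather than TensorRT-LLM tensors, and ignores the tensor-parallel arguments (tp_size, tp_group, sharding_dim, tp_rank).

```python
def dequantized_embedding_lookup(token_ids, int8_weight, per_token_scale):
    """Hypothetical reference: gather int8 rows and rescale them to floats.

    Assumption (not taken from the library): per_token_scale[i] is the
    dequantization scale for row i of the embedding table.
    """
    rows = []
    for token_id in token_ids:
        scale = per_token_scale[token_id]  # one scale per vocabulary row (assumed)
        rows.append([q * scale for q in int8_weight[token_id]])
    return rows


# Toy table: vocabulary of 3 tokens, hidden size 2.
int8_weight = [[10, -20], [4, 8], [-128, 127]]
per_token_scale = [0.5, 0.25, 1.0]

embedded = dequantized_embedding_lookup([1, 0], int8_weight, per_token_scale)
print(embedded)  # [[1.0, 2.0], [5.0, -10.0]]
```

Storing one scale per row keeps the table in int8 while letting each token's vector recover its own dynamic range at lookup time.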

Callers

nothing calls this directly

Calls 1

embedding · Function · 0.85

Tested by

no test coverage detected