hub / github.com/NVIDIA/TensorRT-LLM / forward

Method forward

tensorrt_llm/quantization/layers.py:72–74
(self, x)

Source from the content-addressed store, hash-verified

70          self.axis = axis
71
72      def forward(self, x):
73          return quantize(x, self.scaling_factor.value, self.output_dtype,
74                          self.axis)
75
76
77  class QuantizePerToken(Module):
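
The listing above only shows that forward passes the input, a stored scaling factor, an output dtype, and an axis to quantize. As a rough illustration of what scale-based quantization of this kind usually computes, here is a minimal pure-Python sketch. It is not the TensorRT-LLM implementation: the helper name quantize_sketch, the divide-round-clamp formula, the int8 range, and the reading of axis == -1 as "one scale for the whole tensor" are all assumptions made for this example.

```python
def quantize_sketch(x, scale, axis=-1, qmin=-128, qmax=127):
    """Hypothetical stand-in for a scale-based quantizer on a 2-D list of floats.

    Assumed semantics (not taken from the TensorRT-LLM source):
      q = clamp(round(x / scale), qmin, qmax)
      axis == -1 -> `scale` is a single float shared by every element
      axis == 0  -> `scale` is a list with one entry per row
      axis == 1  -> `scale` is a list with one entry per column
    """
    out = []
    for i, row in enumerate(x):
        out_row = []
        for j, value in enumerate(row):
            if axis == -1:
                s = scale
            elif axis == 0:
                s = scale[i]
            else:
                s = scale[j]
            q = round(value / s)  # Python's round() sends ties to the even integer
            out_row.append(max(qmin, min(qmax, q)))  # saturate to the int8 range
        out.append(out_row)
    return out


# Per-tensor: a single scale for all elements; 300.0 / 0.5 saturates at qmax.
per_tensor = quantize_sketch([[1.0, -2.0], [300.0, 0.5]], 0.5)

# Per-channel along axis 1: each column has its own scale.
per_channel = quantize_sketch([[1.0, 1.0], [-100.0, 100.0]], [0.5, 0.25], axis=1)
```

In this sketch the scaling factor plays the role that self.scaling_factor.value plays in forward: a scalar when there is one scale for the whole tensor, and a vector when each channel along axis has its own scale.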

Callers 2

Calls 1

quantize · Function · 0.70

Tested by 2