hub / github.com/NVIDIA/TensorRT-LLM / forward

Method forward

tensorrt_llm/quantization/layers.py:555–568
(self, x)

Source from the content-addressed store, hash-verified

553          self.quant_mode = quant_mode
554
555      def forward(self, x):
556          weight = None if self.weight is None else self.weight.value
557          bias = None if self.bias is None else self.bias.value
558          scale = None
559          clamp_val = None if self.clamp_val is None else self.clamp_val.value
560          return fp8_rowwise_rms_norm(
561              x,
562              self.normalized_shape,
563              weight,
564              bias,
565              scale,
566              clamp_val,
567              self.eps,
568              dynamic_act_scaling=self.quant_mode.has_fp8_rowwise())
569
570
571  class Fp8RowwiseLinear(Linear):
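
The listing above unwraps each optional parameter with `None if p is None else p.value` before passing it on, and always passes `scale` as `None`. A minimal sketch of that pattern in plain Python follows. `FakeParameter`, `unwrap`, `fake_norm`, and `SketchNorm` are hypothetical stand-ins invented for illustration; they are not part of TensorRT-LLM, and `fake_norm` only records its arguments rather than performing any normalization or quantization.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeParameter:
    """Hypothetical stand-in for a parameter wrapper that exposes `.value`."""
    value: Any


def unwrap(param: Optional[FakeParameter]) -> Any:
    """Return the wrapped value, or None when the parameter is absent."""
    return None if param is None else param.value


def fake_norm(x, normalized_shape, weight, bias, scale, clamp_val, eps,
              dynamic_act_scaling=False):
    """Hypothetical stand-in for the real kernel: records its arguments only."""
    return {
        "x": x,
        "normalized_shape": normalized_shape,
        "weight": weight,
        "bias": bias,
        "scale": scale,
        "clamp_val": clamp_val,
        "eps": eps,
        "dynamic_act_scaling": dynamic_act_scaling,
    }


class SketchNorm:
    """Mirrors the argument handling of the listed forward(), nothing more."""

    def __init__(self, normalized_shape, weight=None, bias=None,
                 clamp_val=None, eps=1e-6, rowwise=True):
        self.normalized_shape = normalized_shape
        self.weight = weight
        self.bias = bias
        self.clamp_val = clamp_val
        self.eps = eps
        # Assumption: a plain bool replaces quant_mode.has_fp8_rowwise().
        self.rowwise = rowwise

    def forward(self, x):
        # Absent parameters are forwarded as None; scale is always None here.
        return fake_norm(
            x,
            self.normalized_shape,
            unwrap(self.weight),
            unwrap(self.bias),
            None,
            unwrap(self.clamp_val),
            self.eps,
            dynamic_act_scaling=self.rowwise,
        )


# Usage: only the weight is supplied; bias and clamp_val stay absent.
layer = SketchNorm(4, weight=FakeParameter([1.0, 1.0, 1.0, 1.0]))
call = layer.forward([0.5, -0.5, 2.0, 1.0])
```

In this sketch, missing parameters reach the callee as `None` instead of raising an attribute error, which is the behavior the conditional expressions in the listing appear to guard.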

Callers

nothing calls this directly

Calls 2

fp8_rowwise_rms_norm · Function · 0.85
has_fp8_rowwise · Method · 0.45

Tested by

no test coverage detected