Method forward

tensorrt_llm/quantization/layers.py:2854–2870

(self, x)

Source from the content-addressed store, hash-verified

        self.register_parameter('scale_to_int', None)

    def forward(self, x):
        weight = None if self.weight is None else self.weight.value
        bias = None if self.bias is None else self.bias.value
        scale = None if self.scale_to_int is None else self.scale_to_int.value
        clamp_val = None if self.clamp_val is None else self.clamp_val.value
        return smooth_quant_rms_norm(
            x,
            self.normalized_shape,
            weight,
            bias,
            scale,
            clamp_val,
            self.eps,
            dynamic_act_scaling=True,
            scale_dtype='float16',
            sum_per_token=not self.quant_mode.has_per_group_scaling(),
            sum_dtype='float16')


# TODO: Mostly duplicates SmoothQuantMLP.
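
For orientation, the following is a minimal NumPy sketch of the general technique the call above appears to configure: RMS normalization followed by dynamic per-token int8 quantization, with float16 scales and an optional per-token sum. It is not the TensorRT-LLM implementation. The function name rms_norm_dynamic_int8, the 127-level symmetric scheme, normalization over the last axis only, and the omission of clamp_val and the static scale_to_int input are all assumptions made for illustration.

```python
import numpy as np


def rms_norm_dynamic_int8(x, weight=None, bias=None, eps=1e-6, sum_per_token=True):
    """Hypothetical reference: RMS norm, then per-token symmetric int8 quantization."""
    x = np.asarray(x, dtype=np.float32)

    # RMS normalization over the last axis (assumed to be the normalized shape).
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    y = x / rms
    if weight is not None:
        y = y * weight
    if bias is not None:
        y = y + bias

    # Dynamic activation scaling: one scale per token (row), stored as float16.
    amax = np.max(np.abs(y), axis=-1, keepdims=True)
    scale16 = (amax / 127.0).astype(np.float16)
    # Guard against all-zero rows so the division below stays finite.
    scale16 = np.maximum(scale16, np.finfo(np.float16).tiny)
    scale = scale16.astype(np.float32)

    # Quantize with the same rounded scale that is returned to the caller.
    q = np.clip(np.rint(y / scale), -127, 127).astype(np.int8)

    # Optional per-token sum of the normalized activations, also float16.
    token_sum = y.sum(axis=-1, keepdims=True).astype(np.float16) if sum_per_token else None
    return q, scale16, token_sum


if __name__ == "__main__":
    x = np.linspace(-3.0, 3.0, 24, dtype=np.float32).reshape(2, 3, 4)
    q, scale, token_sum = rms_norm_dynamic_int8(x, weight=np.ones(4, dtype=np.float32))
    print(q.shape, q.dtype, scale.shape, scale.dtype)
```

Multiplying q by its row's scale recovers the normalized activations to within roughly half a quantization step. In this sketch, sum_per_token mirrors the flag of the same name in the call above, which forward enables only when the quant mode does not use per-group scaling.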

nothing calls this directly

smooth_quant_rms_norm · Function · 0.85

has_per_group_scaling · Method · 0.80

no test coverage detected