hub / github.com/NVIDIA/TensorRT-LLM / use_smooth_quant

Method use_smooth_quant

tensorrt_llm/quantization/mode.py:344–345
(per_token=False, per_channel=False)

Source from the content-addressed store, hash-verified

342
343     @staticmethod
344     def use_smooth_quant(per_token=False, per_channel=False):
345         return QuantMode.from_description(True, True, per_token, per_channel)
346
347     @staticmethod
348     def use_qserve(per_group):
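
The source above shows only that use_smooth_quant forwards True, True plus its two keyword arguments to QuantMode.from_description. As a rough illustration of what that delegation could amount to, here is a minimal, self-contained sketch. SketchQuantMode, its flag names, and the reading of the two leading True values as "quantize weights" and "quantize activations" are assumptions made for this example; they are not taken from the listing, and the real QuantMode may define more flags and parameters.

```python
from enum import IntFlag, auto


class SketchQuantMode(IntFlag):
    """Hypothetical stand-in for QuantMode; flag names are assumptions."""

    INT8_WEIGHTS = auto()
    ACTIVATIONS = auto()
    PER_CHANNEL = auto()
    PER_TOKEN = auto()

    @staticmethod
    def from_description(quantize_weights=False, quantize_activations=False,
                         per_token=False, per_channel=False):
        # Start from an empty mode and switch on one flag per requested option.
        mode = SketchQuantMode(0)
        if quantize_weights:
            mode |= SketchQuantMode.INT8_WEIGHTS
        if quantize_activations:
            mode |= SketchQuantMode.ACTIVATIONS
        if per_channel:
            mode |= SketchQuantMode.PER_CHANNEL
        if per_token:
            mode |= SketchQuantMode.PER_TOKEN
        return mode

    @staticmethod
    def use_smooth_quant(per_token=False, per_channel=False):
        # Mirrors the listing: weights and activations are always enabled,
        # and the caller picks the scaling granularity.
        return SketchQuantMode.from_description(True, True, per_token, per_channel)


if __name__ == "__main__":
    mode = SketchQuantMode.use_smooth_quant(per_token=True)
    print(SketchQuantMode.PER_TOKEN in mode)
    print(SketchQuantMode.PER_CHANNEL in mode)
```

In this sketch the two keyword arguments only select the scaling granularity; weight and activation quantization are always on, which matches the hard-coded True, True in the listing.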

Callers 6

test_from_description · Method · 0.80
from_quant_algo · Method · 0.80
smooth_quant_layer_norm · Function · 0.80
smooth_quant_rms_norm · Function · 0.80
quantize_per_token · Function · 0.80

Calls 1

from_description · Method · 0.80

Tested by 1

test_from_description · Method · 0.64