github.com/NVIDIA/TensorRT-LLM
Method
use_smooth_quant(per_token=False, per_channel=False)
tensorrt_llm/quantization/mode.py:344–345
Source
from the content-addressed store, hash-verified
    @staticmethod
    def use_smooth_quant(per_token=False, per_channel=False):
        return QuantMode.from_description(True, True, per_token, per_channel)

    @staticmethod
    def use_qserve(per_group):
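The method above is a thin factory: it fixes the first two positional arguments of from_description to True and forwards per_token and per_channel unchanged. A minimal, self-contained sketch of that pattern follows. It does not import TensorRT-LLM; ToyQuantMode, its flag names, and the parameter names quantize_weights and quantize_activations are assumptions made for illustration, not the library's actual definitions.

```python
from enum import IntFlag, auto


class ToyQuantMode(IntFlag):
    """Hypothetical stand-in for QuantMode; flag names are assumed."""
    WEIGHTS = auto()
    ACTIVATIONS = auto()
    PER_TOKEN = auto()
    PER_CHANNEL = auto()

    @staticmethod
    def from_description(quantize_weights=False, quantize_activations=False,
                         per_token=False, per_channel=False):
        # Build the flag set one boolean at a time (assumed behaviour).
        mode = ToyQuantMode(0)
        if quantize_weights:
            mode |= ToyQuantMode.WEIGHTS
        if quantize_activations:
            mode |= ToyQuantMode.ACTIVATIONS
        if per_token:
            mode |= ToyQuantMode.PER_TOKEN
        if per_channel:
            mode |= ToyQuantMode.PER_CHANNEL
        return mode

    @staticmethod
    def use_smooth_quant(per_token=False, per_channel=False):
        # Same shape as the indexed method: the first two arguments are
        # always True, the scaling options are passed through.
        return ToyQuantMode.from_description(True, True, per_token, per_channel)


mode = ToyQuantMode.use_smooth_quant(per_token=True)
```

In this sketch, mode always carries the first two flags, while PER_TOKEN and PER_CHANNEL appear only when the caller asks for them.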
Callers (6)
test_from_description · Method · 0.80
convert_from_hf_checkpoint · Function · 0.80
from_quant_algo · Method · 0.80
smooth_quant_layer_norm · Function · 0.80
smooth_quant_rms_norm · Function · 0.80
quantize_per_token · Function · 0.80
Calls (1)
from_description · Method · 0.80
Tested by (1)
test_from_description · Method · 0.64