hub / github.com/NVIDIA/TensorRT-LLM / smooth_quantize_ootb

Function smooth_quantize_ootb

tensorrt_llm/quantization/quantize.py:170–184
(
    model,
    quant_config: QuantConfig,
)

Source from the content-addressed store, hash-verified

def smooth_quantize_ootb(
    model,
    quant_config: QuantConfig,
):
    quant_map = {
        ColumnLinear: Int8SmoothQuantLinear,
        RowLinear: Int8SmoothQuantRowLinear,
    }

    model = quantize_layers(
        model,
        quant_config,
        quant_map,
    )
    return model


def smooth_quantize_plugin(model, quant_mode):
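
The function above only builds a class-to-class map and hands it to quantize_layers, whose body is not shown on this page. As a rough illustration of what a map-driven layer swap can look like, here is a minimal sketch in plain Python. Everything in it is a hypothetical stand-in: `Module`, `replace_layers`, `QuantColumnLinear`, `QuantRowLinear`, `Attention`, and `ToyModel` are invented for the example, the `ColumnLinear` and `RowLinear` classes here are toy placeholders rather than the TensorRT-LLM layers, and the traversal is an assumption, not the library's implementation.

```python
# Hypothetical stand-ins only: none of these are the TensorRT-LLM classes.
class Module:
    def children(self):
        # Direct attributes that are themselves modules.
        return {k: v for k, v in vars(self).items() if isinstance(v, Module)}


class _Linear(Module):
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


class ColumnLinear(_Linear): pass        # toy placeholder
class RowLinear(_Linear): pass           # toy placeholder
class QuantColumnLinear(_Linear): pass   # toy "quantized" replacement
class QuantRowLinear(_Linear): pass      # toy "quantized" replacement


def replace_layers(module, class_map):
    """Recursively swap children whose exact type is a key of class_map."""
    for name, child in list(module.children().items()):
        target = class_map.get(type(child))
        if target is not None:
            # Rebuild the layer with the same shape, using the mapped class.
            setattr(module, name, target(child.in_features, child.out_features))
        else:
            replace_layers(child, class_map)
    return module


class Attention(Module):
    def __init__(self):
        self.qkv = ColumnLinear(8, 24)
        self.dense = RowLinear(8, 8)


class ToyModel(Module):
    def __init__(self):
        self.attention = Attention()


model = replace_layers(
    ToyModel(),
    {ColumnLinear: QuantColumnLinear, RowLinear: QuantRowLinear},
)
print(type(model.attention.qkv).__name__)    # QuantColumnLinear
print(type(model.attention.dense).__name__)  # QuantRowLinear
```

Keeping the mapping as data means a wrapper such as smooth_quantize_ootb stays small: supporting another layer type is one more dictionary entry, and the traversal logic is untouched.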

Callers 1

smooth_quantize · Function · 0.85

Calls 1

quantize_layers · Function · 0.85

Tested by

no test coverage detected