hub / github.com/NVIDIA/TensorRT-LLM / extract_layer_idx

Function extract_layer_idx

tensorrt_llm/quantization/quantize.py:267–272
extract_layer_idx(name)

Source from the content-addressed store, hash-verified

set(exclude_modules + ['*ln_f', '*ln_embed', '*lm_head']))

def extract_layer_idx(name):
    # Return the first all-digit component of a dotted module name, or None.
    ss = name.split('.')
    for s in ss:
        if s.isdigit():
            return int(s)
    return None

# Meta's LLaMA 3.1 recipe:
# (1) Skip quantization for the first and last Transformer layers
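
A minimal sketch of how the function behaves on dotted module names. The function body is copied from the listing; the module names are illustrative assumptions, not names taken from the repository.

```python
def extract_layer_idx(name):
    # Same logic as the listing: first all-digit dotted component, else None.
    ss = name.split('.')
    for s in ss:
        if s.isdigit():
            return int(s)
    return None


# Hypothetical module names, for illustration only.
examples = [
    "transformer.layers.12.attention.qkv",    # one numeric component
    "transformer.ln_f",                       # no numeric component
    "transformer.layers.0.mlp.experts.3.fc",  # two numeric components
    "transformer.layers_7.mlp",               # digit glued to a word
]

results = {name: extract_layer_idx(name) for name in examples}
for name, idx in results.items():
    print(f"{name!r} -> {idx}")
```

Only a component made up entirely of digits counts, so `layers_7` yields None, and when several numeric components are present the leftmost one is returned.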

Callers 1

fp8_rowwise_quantize · Function · 0.70
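
The comment that follows the function in the listing mentions skipping quantization for the first and last Transformer layers. The sketch below shows one way a caller could use the extracted index for that purpose. `is_first_or_last_layer`, `num_layers`, and the module names are hypothetical and are not taken from `fp8_rowwise_quantize`.

```python
def extract_layer_idx(name):
    # Same logic as the listing: first all-digit dotted component, else None.
    ss = name.split('.')
    for s in ss:
        if s.isdigit():
            return int(s)
    return None


def is_first_or_last_layer(name, num_layers):
    # Hypothetical helper, not part of TensorRT-LLM.
    idx = extract_layer_idx(name)
    if idx is None:
        # Modules without a layer index (e.g. a final norm) are not matched here.
        return False
    return idx == 0 or idx == num_layers - 1


# Illustrative module names for an assumed 4-layer model.
num_layers = 4
names = [
    "transformer.layers.0.mlp.fc",
    "transformer.layers.1.mlp.fc",
    "transformer.layers.3.mlp.fc",
    "transformer.ln_f",
]

skipped = [n for n in names if is_first_or_last_layer(n, num_layers)]
print(skipped)
```

Treating a None index as "not first or last" is a design choice of this sketch; a real caller may handle unindexed modules differently, for example through an exclude list.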

Calls 1

split · Method · 0.45

Tested by

no test coverage detected
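
Since no coverage is detected, a few assertions along the following lines could pin down the edge cases. This is a sketch rather than a test from the repository: it re-declares the function locally instead of importing it, and the input strings are made up for illustration.

```python
def extract_layer_idx(name):
    # Local copy of the function under test, same logic as the listing.
    ss = name.split('.')
    for s in ss:
        if s.isdigit():
            return int(s)
    return None


def check_extract_layer_idx():
    # Leading zeros are accepted and parsed as a plain integer.
    assert extract_layer_idx("layers.007.attn") == 7
    # An empty name has a single empty component, which is not a digit string.
    assert extract_layer_idx("") is None
    # Empty components from consecutive dots are skipped.
    assert extract_layer_idx("a..3") == 3
    # A sign makes the component non-numeric for str.isdigit().
    assert extract_layer_idx("layers.-1.attn") is None
    # A name that is only a number is its own layer index.
    assert extract_layer_idx("5") == 5
    return True


ok = check_extract_layer_idx()
print("all checks passed" if ok else "failed")
```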