hub / github.com/NVIDIA/TensorRT-LLM / has_kv_cache_quant

Method has_kv_cache_quant

tensorrt_llm/quantization/mode.py:173–175
(self)

Source from the content-addressed store, hash-verified

171         return self._any(self.NVFP4_KV_CACHE)
172
173     def has_kv_cache_quant(self):
174         return (self.has_int8_kv_cache() or self.has_fp8_kv_cache()
175                 or self.has_fp4_kv_cache())
176
177     def has_fp8_qdq(self):
178         return self._any(self.FP8_QDQ)

Callers 6

gpt_attention · Function · 0.80
quantize · Function · 0.80
forward · Method · 0.80
setup · Method · 0.80
decode · Method · 0.80

Calls 3

has_int8_kv_cache · Method · 0.95
has_fp8_kv_cache · Method · 0.95
has_fp4_kv_cache · Method · 0.95
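
The three calls listed above suggest that has_kv_cache_quant is an "any of these KV-cache flags" check. The following is a minimal sketch of that pattern, not the library's code: the class name `SketchQuantMode` is hypothetical, the flag values are assumed, and only the flag and method names that appear in the excerpt are reused. The bodies of `has_int8_kv_cache` and `has_fp8_kv_cache` are guesses modeled on the one `_any(...)` line visible in the source.

```python
from enum import IntFlag, auto


class SketchQuantMode(IntFlag):
    """Hypothetical stand-in for the real class; flag values are assumed."""

    INT8_KV_CACHE = auto()
    FP8_KV_CACHE = auto()
    NVFP4_KV_CACHE = auto()
    FP8_QDQ = auto()

    def _any(self, bits):
        # True when at least one of the requested bits is set on this mode.
        return (self & bits) != 0

    def has_int8_kv_cache(self):
        # Assumed body, by analogy with the NVFP4 check shown in the excerpt.
        return self._any(SketchQuantMode.INT8_KV_CACHE)

    def has_fp8_kv_cache(self):
        # Assumed body, by analogy with the NVFP4 check shown in the excerpt.
        return self._any(SketchQuantMode.FP8_KV_CACHE)

    def has_fp4_kv_cache(self):
        return self._any(SketchQuantMode.NVFP4_KV_CACHE)

    def has_kv_cache_quant(self):
        # Same shape as the indexed method: true if any KV-cache flag is set.
        return (self.has_int8_kv_cache() or self.has_fp8_kv_cache()
                or self.has_fp4_kv_cache())


# Usage: a mode with only FP8_QDQ set has no KV-cache quantization,
# while adding any KV-cache flag makes the check pass.
weights_only = SketchQuantMode.FP8_QDQ
with_kv = SketchQuantMode.FP8_QDQ | SketchQuantMode.INT8_KV_CACHE
print(weights_only.has_kv_cache_quant(), with_kv.has_kv_cache_quant())
```

Under these assumptions, callers only need the single aggregate check and do not have to enumerate the individual KV-cache formats themselves.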

Tested by

no test coverage detected