Method has_kv_cache_quant

tensorrt_llm/quantization/mode.py:173–175

(self)

Source from the content-addressed store, hash-verified

171	        return self._any(self.NVFP4_KV_CACHE)
172
173	    def has_kv_cache_quant(self):
174	        return (self.has_int8_kv_cache() or self.has_fp8_kv_cache()
175	                or self.has_fp4_kv_cache())
176
177	    def has_fp8_qdq(self):
178	        return self._any(self.FP8_QDQ)
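The method is a plain logical OR over three narrower predicates, so it returns true as soon as any KV-cache quantization flag is set. The following is a minimal, self-contained sketch of that pattern and not the library's actual class. `ToyQuantMode` is hypothetical, the flag names `INT8_KV_CACHE` and `FP8_KV_CACHE` and all flag values are assumptions, and `_any` is assumed to test whether any of the given bits is set. Only `NVFP4_KV_CACHE`, `FP8_QDQ`, and the method names come from the excerpt above.

```python
from enum import IntFlag, auto


class ToyQuantMode(IntFlag):
    """Hypothetical stand-in for the real quantization-mode flag class."""

    INT8_KV_CACHE = auto()   # assumed flag name
    FP8_KV_CACHE = auto()    # assumed flag name
    NVFP4_KV_CACHE = auto()  # name taken from the excerpt
    FP8_QDQ = auto()         # name taken from the excerpt

    def _any(self, bits):
        # Assumed behavior: true if at least one of the given bits is set.
        return (self & bits) != 0

    def has_int8_kv_cache(self):
        return self._any(ToyQuantMode.INT8_KV_CACHE)

    def has_fp8_kv_cache(self):
        return self._any(ToyQuantMode.FP8_KV_CACHE)

    def has_fp4_kv_cache(self):
        return self._any(ToyQuantMode.NVFP4_KV_CACHE)

    def has_kv_cache_quant(self):
        # Same shape as the method shown above: OR of the three predicates.
        return (self.has_int8_kv_cache() or self.has_fp8_kv_cache()
                or self.has_fp4_kv_cache())


# A mode that combines a non-KV-cache flag with an FP8 KV cache flag.
mixed = ToyQuantMode.FP8_QDQ | ToyQuantMode.FP8_KV_CACHE
# A mode with only a non-KV-cache flag set.
qdq_only = ToyQuantMode.FP8_QDQ
# A mode with no flags set.
empty = ToyQuantMode(0)

mixed_result = mixed.has_kv_cache_quant()
qdq_only_result = qdq_only.has_kv_cache_quant()
empty_result = empty.has_kv_cache_quant()
```

In this sketch `mixed_result` is true while `qdq_only_result` and `empty_result` are false, which is the behavior callers would rely on when they only need to know whether the KV cache is quantized at all, regardless of the format.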

gpt_attention · Function · 0.80

get_cache_size_per_token · Method · 0.80

quantize · Function · 0.80

forward · Method · 0.80

setup · Method · 0.80

decode · Method · 0.80

has_int8_kv_cache · Method · 0.95

has_fp8_kv_cache · Method · 0.95

has_fp4_kv_cache · Method · 0.95

no test coverage detected