github.com/NVIDIA/TensorRT-LLM / has_fp8_kv_cache
Method: has_fp8_kv_cache(self)
tensorrt_llm/quantization/mode.py:167–168
Source
from the content-addressed store, hash-verified

165          return self._any(self.INT8_KV_CACHE)
166
167      def has_fp8_kv_cache(self):
168          return self._any(self.FP8_KV_CACHE)
169
170      def has_fp4_kv_cache(self):
171          return self._any(self.NVFP4_KV_CACHE)
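
The listing shows that each `has_*_kv_cache` predicate delegates to `_any` with a single flag constant. The body of `_any` is not shown on this page, so the following is only a minimal sketch of the pattern, assuming the flags are bit flags and that `_any` returns true when at least one requested bit is set. `SketchQuantMode` is a hypothetical stand-in class, and its flag values are illustrative rather than the values used in `mode.py`.

```python
from enum import IntFlag, auto


class SketchQuantMode(IntFlag):
    """Hypothetical stand-in; not the real class from mode.py."""

    # Flag names mirror the listing; the numeric values are illustrative.
    INT8_KV_CACHE = auto()
    FP8_KV_CACHE = auto()
    NVFP4_KV_CACHE = auto()

    def _any(self, bits):
        # Assumed behavior: true if at least one of the given bits is set.
        return (self & bits) != 0

    def has_fp8_kv_cache(self):
        return self._any(self.FP8_KV_CACHE)

    def has_fp4_kv_cache(self):
        return self._any(self.NVFP4_KV_CACHE)


if __name__ == "__main__":
    mode = SketchQuantMode.FP8_KV_CACHE
    print(mode.has_fp8_kv_cache())  # True
    print(mode.has_fp4_kv_cache())  # False
```

Under these assumptions a mode can carry several flags at once, for example `SketchQuantMode.FP8_KV_CACHE | SketchQuantMode.INT8_KV_CACHE`, and each predicate only inspects its own bit.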
Callers (15)
has_kv_cache_quant · Method · 0.95
to_dict · Method · 0.95
create_builder_config · Method · 0.80
build · Function · 0.80
update_quant_config · Method · 0.80
forward · Method · 0.80
update_quant_config · Method · 0.80
_single_request_preprocess_inputs · Method · 0.80
get_cache_size_per_token · Method · 0.80
get_cache_size_per_token · Method · 0.80
__init__ · Method · 0.80
_create_kv_cache_manager · Function · 0.80
Calls (1)
_any · Method · 0.95
Tested by
No test coverage detected.