hub / github.com/NVIDIA/TensorRT-LLM / test_CachedModelLoader

Function test_CachedModelLoader

tests/unittest/llmapi/test_llm_utils.py:42–54
test_CachedModelLoader()

Source from the content-addressed store, hash-verified

def test_CachedModelLoader():
    # CachedModelLoader enables engine caching and multi-gpu building
    args = TrtLlmArgs(
        model=llama_model_path,
        kv_cache_config=KvCacheConfig(free_gpu_memory_fraction=0.4),
        enable_build_cache=True)
    stats = LlmBuildStats()
    model_loader = CachedModelLoader(args, llm_build_stats=stats)
    engine_dir, _ = model_loader()
    assert engine_dir
    assert engine_dir.exists() and engine_dir.is_dir()
    model_format = get_model_format(engine_dir, trust_remote_code=True)
    assert model_format is _ModelFormatKind.TLLM_ENGINE
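
The test relies on engine caching: calling the loader yields an engine directory that must exist on disk. As a rough illustration of that general idea only, here is a minimal, self-contained sketch of a build cache keyed by a hash of the build arguments. It is not TensorRT-LLM's implementation; `cache_key`, `load_or_build`, the `stats` dictionary, and the on-disk layout are all hypothetical names chosen for this example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_key(build_args: dict) -> str:
    """Hash the build arguments into a stable key (hypothetical scheme)."""
    payload = json.dumps(build_args, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


def load_or_build(cache_root: Path, build_args: dict, stats: dict) -> Path:
    """Return an engine directory, building only on a cache miss."""
    engine_dir = cache_root / cache_key(build_args)
    if engine_dir.is_dir():
        # Cache hit: reuse the directory produced by an earlier build.
        stats["cache_hits"] = stats.get("cache_hits", 0) + 1
        return engine_dir
    engine_dir.mkdir(parents=True)
    # Stand-in for the real, expensive engine build (assumption).
    (engine_dir / "config.json").write_text(json.dumps(build_args))
    stats["builds"] = stats.get("builds", 0) + 1
    return engine_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        stats = {}
        args = {"model": "llama", "free_gpu_memory_fraction": 0.4}
        first = load_or_build(Path(tmp), args, stats)
        second = load_or_build(Path(tmp), args, stats)
        # Same arguments map to the same directory; only one build ran.
        assert first == second and first.is_dir()
        print(stats)
```

Keying the cache on the serialized arguments means any change to the build configuration produces a new directory, while a repeated call with identical arguments skips the build.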

Callers

nothing calls this directly

Calls 5

TrtLlmArgs (Class) · 0.85
KvCacheConfig (Class) · 0.85
LlmBuildStats (Class) · 0.85
CachedModelLoader (Class) · 0.85
get_model_format (Function) · 0.85

Tested by

no test coverage detected