hub / github.com/InternLM/lmdeploy / pipe_quant_42

Function pipe_quant_42

tests/test_lmdeploy/test_quant_policy.py:58–76

Create pipeline with quant_policy=QuantPolicy.TURBO_QUANT. This fixture has class scope so large model instances are released before later FP8 accuracy tests allocate their own pipelines.

(model_id)

Source from the content-addressed store, hash-verified

@pytest.fixture(scope='class')
def pipe_quant_42(model_id):
    """Create pipeline with quant_policy=QuantPolicy.TURBO_QUANT.

    This fixture has class scope so large model instances are released before later FP8 accuracy tests allocate their
    own pipelines.
    """
    engine_config = PytorchEngineConfig(
        tp=1,
        cache_max_entry_count=0.05,
        quant_policy=QuantPolicy.TURBO_QUANT,  # K=4bit, V=2bit mixed precision
    )
    pipe = pipeline(model_id, backend_config=engine_config, log_level='INFO')
    yield pipe
    # Cleanup
    pipe.close()
    del pipe
    gc.collect()
    if torch.cuda.is_available() and torch.cuda.device_count() > 0:
        torch.cuda.empty_cache()
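
The setup / yield / teardown ordering that this fixture relies on can be illustrated without a GPU or a model download. The sketch below is not part of the lmdeploy test suite: FakePipe and pipe_lifecycle are hypothetical stand-ins, and contextlib.contextmanager is used here only to mimic how a generator-style fixture runs the code after yield once the consumer is finished. Unlike pytest, a context manager raises the consumer's exception inside the generator, so the sketch wraps the teardown in try/finally.

```python
import gc
from contextlib import contextmanager


class FakePipe:
    """Hypothetical stand-in for the object returned by lmdeploy's pipeline()."""

    def __init__(self, model_id):
        self.model_id = model_id
        self.closed = False

    def close(self):
        # The real pipeline releases engine resources here; we only set a flag.
        self.closed = True


events = []  # records the order in which the phases run


@contextmanager
def pipe_lifecycle(model_id):
    """Mimic the fixture: build the pipe, hand it out, then clean up."""
    pipe = FakePipe(model_id)
    events.append('setup')
    try:
        yield pipe
    finally:
        # Same cleanup order as the fixture: close first, then collect garbage.
        pipe.close()
        events.append('teardown')
        gc.collect()


with pipe_lifecycle('demo-model') as pipe:
    events.append('test')
    assert not pipe.closed  # the pipe is still open while it is in use

print(events)
```

With scope='class', pytest runs the equivalent of the with-body once for all tests in a class and performs the teardown only after the last of them, which is what lets the memory be freed before later test classes build their own pipelines.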

Callers

nothing calls this directly

Calls 4

PytorchEngineConfig · Class · 0.90
pipeline · Function · 0.90
close · Method · 0.45
device_count · Method · 0.45

Tested by

no test coverage detected