hub / github.com/InternLM/lmdeploy / pipeline

Function pipeline

lmdeploy/api.py:15–80

Create a pipeline for inference. The model_path argument may be one of: i) a local directory of a turbomind model, either converted with the ``lmdeploy convert`` command or downloaded from a repo of type ii) or iii); ii) the model_id of an lmdeploy-quantized model hosted on huggingface.co; iii) the model_id of any other model hosted on huggingface.co.

(model_path: str,
             backend_config: TurbomindEngineConfig | PytorchEngineConfig | None = None,
             chat_template_config: ChatTemplateConfig | None = None,
             log_level: str = 'WARNING',
             max_log_len: int | None = None,
             trust_remote_code: bool = False,
             speculative_config: SpeculativeConfig | None = None,
             **kwargs)
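
As a rough illustration of how these parameters fit together, the sketch below collects only the arguments a caller actually changes and forwards them to `pipeline`. The helper names `build_pipeline_kwargs` and `create_pipeline` are hypothetical and not part of lmdeploy. The import is deferred so the argument handling can be inspected without the library installed; anything left unset falls back to the defaults in the signature above.

```python
from typing import Any, Dict, Optional


def build_pipeline_kwargs(
    backend_config: Optional[Any] = None,
    chat_template_config: Optional[Any] = None,
    log_level: str = "WARNING",
    max_log_len: Optional[int] = None,
    trust_remote_code: bool = False,
    speculative_config: Optional[Any] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: keep only arguments that differ from the signature defaults."""
    candidates = {
        "backend_config": (backend_config, None),
        "chat_template_config": (chat_template_config, None),
        "log_level": (log_level, "WARNING"),
        "max_log_len": (max_log_len, None),
        "trust_remote_code": (trust_remote_code, False),
        "speculative_config": (speculative_config, None),
    }
    return {
        name: value
        for name, (value, default) in candidates.items()
        if value != default
    }


def create_pipeline(model_path: str, **overrides: Any):
    """Hypothetical wrapper around lmdeploy.pipeline; requires lmdeploy to be installed."""
    # Deferred import: the module stays importable on machines without lmdeploy.
    from lmdeploy import pipeline

    return pipeline(model_path, **build_pipeline_kwargs(**overrides))
```

With lmdeploy installed, `create_pipeline('internlm/internlm-chat-7b', log_level='INFO')` would forward only `log_level`, leaving every other parameter at its default.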

Source from the content-addressed store, hash-verified

def pipeline(model_path: str,
             backend_config: TurbomindEngineConfig | PytorchEngineConfig | None = None,
             chat_template_config: ChatTemplateConfig | None = None,
             log_level: str = 'WARNING',
             max_log_len: int | None = None,
             trust_remote_code: bool = False,
             speculative_config: SpeculativeConfig | None = None,
             **kwargs):
    """Create a pipeline for inference.

    Args:
        model_path: the path of a model. It could be one of the following options:

            - i) A local directory path of a turbomind model which is
              converted by ``lmdeploy convert`` command or download from
              ii) and iii).
            - ii) The model_id of a lmdeploy-quantized model hosted
              inside a model repo on huggingface.co, such as
              ``InternLM/internlm-chat-20b-4bit``,
              ``lmdeploy/llama2-chat-70b-4bit``, etc.
            - iii) The model_id of a model hosted inside a model repo
              on huggingface.co, such as ``internlm/internlm-chat-7b``,
              ``Qwen/Qwen-7B-Chat``, ``baichuan-inc/Baichuan2-7B-Chat``
              and so on.
        backend_config: backend config instance. Default to None.
        chat_template_config: chat template configuration. Default to None.
        log_level: set log level whose value among [``CRITICAL``, ``ERROR``,
            ``WARNING``, ``INFO``, ``DEBUG``]
        max_log_len: Max number of prompt characters or prompt tokens
            being printed in log.
        trust_remote_code: whether to trust remote code from model repositories.
        speculative_config: speculative decoding configuration.
        **kwargs: additional keyword arguments passed to the pipeline.

    Returns:
        Pipeline: a pipeline instance for inference.

    Examples:

        .. code-block:: python

            # LLM
            import lmdeploy
            pipe = lmdeploy.pipeline('internlm/internlm-chat-7b')
            response = pipe(['hi','say this is a test'])
            print(response)

            # VLM
            from lmdeploy.vl import load_image
            from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
            pipe = pipeline('liuhaotian/llava-v1.5-7b',
                            backend_config=TurbomindEngineConfig(session_len=8192),
                            chat_template_config=ChatTemplateConfig(model_name='vicuna'))
            im = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg')
            response = pipe([('describe this image', [im])])
            print(response)
    """ # noqa E501
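
The docstring distinguishes a local model directory from a huggingface.co model_id and restricts `log_level` to five names. A minimal sketch of checking both before calling `pipeline` might look like the following. The function names are hypothetical, the directory test is only a stand-in for whatever resolution lmdeploy performs internally, and accepting lower-case level names is an assumption rather than documented behavior.

```python
import os

# The five level names listed in the docstring.
ALLOWED_LOG_LEVELS = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG")


def classify_model_path(model_path: str) -> str:
    """Hypothetical check: 'local' for an existing directory (option i),
    otherwise 'hub' for a huggingface.co model_id (options ii and iii)."""
    if os.path.isdir(model_path):
        return "local"
    return "hub"


def normalize_log_level(log_level: str) -> str:
    """Hypothetical check: upper-case the name and reject anything not in the docstring's list."""
    level = log_level.upper()
    if level not in ALLOWED_LOG_LEVELS:
        raise ValueError(
            f"log_level must be one of {ALLOWED_LOG_LEVELS}, got {log_level!r}"
        )
    return level
```

This sketch cannot tell option ii) from option iii), since both are plain model_ids; that distinction depends on the contents of the hosted repo.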

Callers 15

build_pipe · Function · 0.90

run_pipeline_chat_test · Function · 0.90

run_pipeline_mllm_test · Function · 0.90

passkey_retrival_worker · Function · 0.90

init_pipeline · Function · 0.90

test_backend_config_validate_turbomind · Function · 0.90

__init__ · Method · 0.90

run_smoke_infer · Function · 0.90

main · Function · 0.90

pipe_no_quant · Function · 0.90

pipe_quant_42 · Function · 0.90

Calls 1

Pipeline · Class · 0.85

Tested by 14

passkey_retrival_worker · Function · 0.72

init_pipeline · Function · 0.72

test_backend_config_validate_turbomind · Function · 0.72

run_smoke_infer · Function · 0.72

main · Function · 0.72

pipe_no_quant · Function · 0.72

pipe_quant_42 · Function · 0.72

pipe · Method · 0.72

pipe_no_quant · Method · 0.72

pipe_quant_fp8 · Method · 0.72

test_guided_matrix · Function · 0.72