hub / github.com/InternLM/lmdeploy / EngineOutput

Class EngineOutput

lmdeploy/messages.py:657–676

Engine output from the turbomind/pytorch engine. Args: status: the response type. token_ids: the newly generated token ids in each iteration. logprobs: the top logprobs for each output position. cache_block_ids: cache blocks sent back for migration in Disaggregated LLM Serving when the Prefill Engine is done. req_metrics: request metrics information.

Source from the content-addressed store, hash-verified

@dataclass
class EngineOutput:
    """Engine output from turbomind/pytorch engine.

    Args:
        status: the response type.
        token_ids: the newly generated token ids in each iteration.
        logprobs: the top logprobs for each output
            position.
        cache_block_ids: send cache blocks back for migration in
            Disaggregated LLM Serving when Prefill Engine is Done.
        req_metrics: request metrics information
    """
    status: ResponseType
    token_ids: list[int]
    logprobs: list[dict[int, float]] = None
    logits: torch.Tensor = None
    last_hidden_state: torch.Tensor = None
    cache_block_ids: list[int] | None = None
    req_metrics: RequestMetrics | None = None
    routed_experts: torch.Tensor = None
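Because the docstring describes token_ids as the ids newly generated in each iteration, a consumer would typically concatenate them across outputs. The following is a minimal sketch of that pattern, not lmdeploy's own code: the EngineOutput class here is a local mirror of the fields listed above with tensors typed as Any so the example runs without torch, the ResponseType enum and its SUCCESS/FINISH members are stand-ins assumed for illustration, and collect_token_ids is a hypothetical helper.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Any


class ResponseType(Enum):
    """Stand-in for lmdeploy's ResponseType; member names are assumed here."""
    SUCCESS = 0
    FINISH = 1


@dataclass
class EngineOutput:
    """Local mirror of the fields listed above (tensor fields typed as Any)."""
    status: ResponseType
    token_ids: list[int]
    logprobs: list[dict[int, float]] | None = None
    logits: Any = None
    last_hidden_state: Any = None
    cache_block_ids: list[int] | None = None
    req_metrics: Any = None
    routed_experts: Any = None


def collect_token_ids(outputs):
    """Hypothetical helper: concatenate per-iteration token ids until FINISH."""
    collected: list[int] = []
    for out in outputs:
        collected.extend(out.token_ids)
        if out.status is ResponseType.FINISH:
            break
    return collected


# Three simulated iterations; only status and token_ids are required.
stream = [
    EngineOutput(ResponseType.SUCCESS, [11, 12]),
    EngineOutput(ResponseType.SUCCESS, [13]),
    EngineOutput(ResponseType.FINISH, [14], cache_block_ids=[0, 1]),
]
all_ids = collect_token_ids(stream)
print(all_ids)
```

Only status and token_ids lack defaults, so every other field can be omitted at construction time and will be None unless the engine fills it in.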

Callers 10

async_stream_infer · Method · 0.90

_get_error_output · Method · 0.90

generate · Method · 0.90

async_stream_infer · Method · 0.90

_stream_task_wrapper · Method · 0.90

get · Method · 0.90

_collective_rpc_streaming_async · Method · 0.90

_collective_rpc_async · Method · 0.90

_async_test_ray_get_stream_task_result_after_drop_is_idempotent · Function · 0.90

Calls

no outgoing calls

Tested by 3

_collective_rpc_streaming_async · Method · 0.72

_collective_rpc_async · Method · 0.72

_async_test_ray_get_stream_task_result_after_drop_is_idempotent · Function · 0.72