hub / github.com/InternLM/lmdeploy / EngineOutput

Class EngineOutput

lmdeploy/messages.py:657–676

Engine output from turbomind/pytorch engine.

Source from the content-addressed store, hash-verified

@dataclass
class EngineOutput:
    """Engine output from turbomind/pytorch engine.

    Args:
        status: the response type.
        token_ids: the newly generated token ids in each iteration.
        logprobs: the top logprobs for each output
            position.
        cache_block_ids: send cache blocks back for migration in
            Disaggregated LLM Serving when Prefill Engine is Done.
        req_metrics: request metrics information
    """
    status: ResponseType
    token_ids: list[int]
    logprobs: list[dict[int, float]] = None
    logits: torch.Tensor = None
    last_hidden_state: torch.Tensor = None
    cache_block_ids: list[int] | None = None
    req_metrics: RequestMetrics | None = None
    routed_experts: torch.Tensor = None
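Because token_ids holds only the tokens generated in the current iteration, a consumer of a streaming engine has to accumulate them itself. The following is a minimal sketch of that pattern, not lmdeploy's own code: it defines a local stand-in for EngineOutput so that it runs without lmdeploy or torch installed, replaces the torch.Tensor and RequestMetrics types with Any, and assumes a ResponseType enum with SUCCESS and FINISH members. The helper collect_tokens is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, Iterable, Optional


class ResponseType(Enum):
    """Hypothetical stand-in for lmdeploy's ResponseType; member names are assumed."""
    SUCCESS = auto()
    FINISH = auto()


@dataclass
class EngineOutput:
    """Local mirror of the fields listed above (tensor types replaced by Any)."""
    status: ResponseType
    token_ids: list[int]
    logprobs: Optional[list[dict[int, float]]] = None
    logits: Any = None
    last_hidden_state: Any = None
    cache_block_ids: Optional[list[int]] = None
    req_metrics: Any = None
    routed_experts: Any = None


def collect_tokens(outputs: Iterable[EngineOutput]) -> list[int]:
    """Concatenate per-iteration token_ids, stopping after a FINISH status.

    Hypothetical helper for illustration only.
    """
    tokens: list[int] = []
    for out in outputs:
        tokens.extend(out.token_ids)
        if out.status is ResponseType.FINISH:
            break
    return tokens


# Simulated stream: each item carries only the newly generated tokens.
stream = [
    EngineOutput(ResponseType.SUCCESS, [101, 102]),
    EngineOutput(ResponseType.SUCCESS, [103]),
    EngineOutput(ResponseType.FINISH, [104], cache_block_ids=[7, 8]),
]
all_tokens = collect_tokens(stream)
```

Only status and token_ids are required; every other field defaults to None, so a consumer should check for None before reading logprobs, logits, or cache_block_ids.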

Callers 10

async_stream_infer · Method · 0.90
_get_error_output · Method · 0.90
generate · Method · 0.90
async_stream_infer · Method · 0.90
async_stream_infer · Method · 0.90
_stream_task_wrapper · Method · 0.90
get · Method · 0.90
_collective_rpc_async · Method · 0.90

Calls

no outgoing calls