hub / github.com/NVIDIA/TensorRT-LLM / enqueue

Method enqueue

tensorrt_llm/python_plugin.py:399–416
(self, input_desc, output_desc, inputs, outputs, workspace, stream)

Source from the content-addressed store, hash-verified

        )

    def enqueue(self, input_desc, output_desc, inputs, outputs, workspace, stream):
        torch_stream = torch.cuda.ExternalStream(stream_ptr=stream)
        self.workspace = workspace
        self.current_stream = stream

        with torch.cuda.stream(torch_stream):
            self.forward(
                tuple(
                    TensorWrapper.from_trt_desc(input_desc[i], inputs[i])
                    for i in range(len(input_desc))
                ),
                tuple(
                    TensorWrapper.from_trt_desc(output_desc[i], outputs[i])
                    for i in range(len(output_desc))
                ),
            )

        self.current_stream = -1

    def __call__(self, *args: Union[Sequence[TensorWrapper], Sequence[torch.Tensor]]):
        is_trtllm = True
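
The control flow of enqueue can be traced without a GPU. The sketch below is a torch-free imitation, not TensorRT-LLM code: FakeDesc, FakeTensorWrapper, SketchPlugin and _stream_scope are hypothetical stand-ins introduced only for illustration, and the pointer and stream values are made up. It follows the same steps as the source: store the workspace and stream, make the stream current for the duration of forward, pair each descriptor with its buffer pointer, then reset current_stream to -1.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FakeDesc:
    """Hypothetical stand-in for a TensorRT tensor descriptor."""
    shape: Tuple[int, ...]
    dtype: str


@dataclass(frozen=True)
class FakeTensorWrapper:
    """Hypothetical stand-in for TensorWrapper: a raw pointer plus metadata."""
    ptr: int
    shape: Tuple[int, ...]
    dtype: str

    @classmethod
    def from_trt_desc(cls, desc: FakeDesc, ptr: int) -> "FakeTensorWrapper":
        return cls(ptr=ptr, shape=desc.shape, dtype=desc.dtype)


class SketchPlugin:
    """Imitates the control flow of enqueue without torch or TensorRT."""

    def __init__(self) -> None:
        self.workspace = None
        self.current_stream = -1
        self.active_streams: List[int] = []  # stack of streams currently in scope
        self.seen = []  # what forward observed on each call

    @contextmanager
    def _stream_scope(self, stream: int):
        # Stand-in for `with torch.cuda.stream(...)`: the stream is current
        # inside the block and removed again on exit.
        self.active_streams.append(stream)
        try:
            yield
        finally:
            self.active_streams.pop()

    def forward(self, inputs, outputs) -> None:
        # Record the state visible to forward so the flow can be inspected.
        self.seen.append((self.current_stream, self.active_streams[-1], inputs, outputs))

    def enqueue(self, input_desc, output_desc, inputs, outputs, workspace, stream):
        self.workspace = workspace
        self.current_stream = stream
        with self._stream_scope(stream):
            self.forward(
                tuple(FakeTensorWrapper.from_trt_desc(d, p) for d, p in zip(input_desc, inputs)),
                tuple(FakeTensorWrapper.from_trt_desc(d, p) for d, p in zip(output_desc, outputs)),
            )
        # As in the source, this reset is reached only if forward returns normally.
        self.current_stream = -1


# Usage with made-up pointer and stream values.
plugin = SketchPlugin()
desc = FakeDesc(shape=(2, 3), dtype="float16")
plugin.enqueue([desc], [desc], [0x1000], [0x2000], workspace=0x3000, stream=7)
```

One detail the sketch makes visible: the reset to -1 sits after the with block rather than in a finally clause, so an exception raised by forward would leave current_stream holding the last stream value.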

Callers

nothing calls this directly

Calls 2

forward · Method · 0.95
from_trt_desc · Method · 0.80

Tested by

no test coverage detected