hub / github.com/NVIDIA/TensorRT-LLM / mark_output

Method mark_output

tensorrt_llm/functional.py:299–317

Mark a tensor as a network output. When a tensor is marked as an output, its content can be obtained after the execution of the TensorRT engine. The user is responsible for allocating buffers to store the output tensors when preparing the execution of the TensorRT engine.

(self,
 name: Optional[str] = None,
 dtype: Optional[Union[str, trt.DataType]] = None)

Source from the content-addressed store, hash-verified

def mark_output(self,
                name: Optional[str] = None,
                dtype: Optional[Union[str, trt.DataType]] = None):
    '''
    Mark a tensor as a network output.

    When a tensor is marked as an output, its content can be obtained after
    the execution of the TensorRT engine. The user is responsible for
    allocating buffers to store the output tensors when preparing the
    execution of the TensorRT engine.
    '''
    if name is None:
        name = self.name

    if isinstance(dtype, str):
        dtype = str_dtype_to_trt(dtype)

    assert dtype is None or isinstance(dtype, trt.DataType)
    default_net()._mark_output(self, name, dtype)
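The control flow above (default the name, normalize a string dtype, validate, then delegate to the active network) can be traced without TensorRT installed. The following is a minimal sketch under stated assumptions, not the library's implementation: `DataType`, `Network`, and the module-level `_NETWORK` are hypothetical stand-ins for `trt.DataType` and the object returned by `default_net()`, and this `str_dtype_to_trt` only mimics the real helper of the same name.

```python
from enum import Enum
from typing import Dict, Optional, Tuple, Union


class DataType(Enum):
    """Hypothetical stand-in for trt.DataType (not the real enum)."""
    FLOAT = "float32"
    HALF = "float16"
    INT32 = "int32"


def str_dtype_to_trt(dtype: str) -> DataType:
    """Stand-in for the real helper: map a dtype string to an enum member."""
    return DataType(dtype)


class Network:
    """Hypothetical stand-in for the network returned by default_net()."""

    def __init__(self) -> None:
        # Maps output name -> (tensor, requested dtype or None).
        self.outputs: Dict[str, Tuple["Tensor", Optional[DataType]]] = {}

    def _mark_output(self, tensor: "Tensor", name: str,
                     dtype: Optional[DataType]) -> None:
        self.outputs[name] = (tensor, dtype)


_NETWORK = Network()


def default_net() -> Network:
    """Return the single network used by this sketch."""
    return _NETWORK


class Tensor:
    def __init__(self, name: str) -> None:
        self.name = name

    def mark_output(self,
                    name: Optional[str] = None,
                    dtype: Optional[Union[str, DataType]] = None) -> None:
        # Fall back to the tensor's own name when none is given.
        if name is None:
            name = self.name
        # Accept dtype strings and convert them to the enum form.
        if isinstance(dtype, str):
            dtype = str_dtype_to_trt(dtype)
        # After normalization, dtype is either absent or an enum member.
        assert dtype is None or isinstance(dtype, DataType)
        default_net()._mark_output(self, name, dtype)


logits = Tensor("logits")
logits.mark_output()                    # registered under "logits", no dtype
logits.mark_output("probs", "float16")  # string converted to DataType.HALF
```

In this sketch, any `dtype` that is neither `None`, a string, nor a `DataType` member trips the assertion, which mirrors the type check in the original method.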

Callers 15

build_bert · Function · 0.80

_construct_execution · Method · 0.80

_sq_gemm · Method · 0.80

test_smooth_quant_rms_norm · Method · 0.80

_fp8_rowwise_gemm · Method · 0.80

test_linear_smooth_quant · Method · 0.80

test_mlp_smooth_quant · Method · 0.80

test_smooth_quant_layer_norm_layer · Method · 0.80

test_linear_weight_only_linear · Method · 0.80

_construct_execution · Method · 0.80

test_quantize_per_tensor · Method · 0.80

Calls 3

str_dtype_to_trt · Function · 0.85

default_net · Function · 0.85

_mark_output · Method · 0.80

Tested by 15

_construct_execution · Method · 0.64

_sq_gemm · Method · 0.64

test_smooth_quant_rms_norm · Method · 0.64

_fp8_rowwise_gemm · Method · 0.64

test_linear_smooth_quant · Method · 0.64

test_mlp_smooth_quant · Method · 0.64

test_smooth_quant_layer_norm_layer · Method · 0.64

test_linear_weight_only_linear · Method · 0.64

_construct_execution · Method · 0.64

test_quantize_per_tensor · Method · 0.64

test_quantize_per_channel · Method · 0.64