hub / github.com/NVIDIA/TensorRT-LLM / constant

Function constant

tensorrt_llm/functional.py:1184–1217

Add a constant layer. TensorRT graphs encapsulate constant values in the form of constant layers (tensorrt.IConstantLayer). This function creates such a layer from a NumPy array of values. After TensorRT compiles the network, those weights are stored in the serialized TensorRT engine.

constant(ndarray: np.ndarray,
         as_dtype: trt.DataType | None = None,
         as_shape=None) -> Tensor
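
The docstring does not describe `as_shape`, but the function body in the source listing suggests it replaces `ndarray.shape` as the layer's dimensions, and that the element count handed to TensorRT is the product of those dimensions. The sketch below mirrors that arithmetic in plain NumPy. `element_count` and `resolve_shape` are hypothetical names, and the size check is an added assumption that the original function does not perform; the sketch also ignores `as_dtype`.

```python
import numpy as np


def element_count(shape):
    """Multiply the dimension sizes, like the loop over trt_shape in constant()."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def resolve_shape(ndarray, as_shape=None):
    """Hypothetical helper: choose the layer shape and count its elements.

    The bounds check is an assumption added for illustration; constant()
    itself does not validate as_shape against the array.
    """
    shape = tuple(ndarray.shape) if as_shape is None else tuple(as_shape)
    count = element_count(shape)
    if count > ndarray.size:
        raise ValueError(
            f"shape {shape} needs {count} elements, array has {ndarray.size}")
    return shape, count


values = np.arange(12, dtype=np.float32)

# No override: the shape comes from the array itself.
print(resolve_shape(values))

# Override: the same 12-element buffer described as a 3x4 constant.
print(resolve_shape(values, as_shape=(3, 4)))

# A 0-d array has an empty shape, so the running product stays at 1.
print(resolve_shape(np.array(1.0, dtype=np.float32)))
```

Because the count is derived from the requested shape rather than from the array, a caller passing `as_shape` is presumably responsible for making sure the underlying buffer is large enough.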

Source

def constant(ndarray: np.ndarray,
             as_dtype: trt.DataType | None = None,
             as_shape=None) -> Tensor:
    '''
    Add a constant layer.

    TensorRT graphs encapsulate constant values in the form of constant layers
    (tensorrt.IConstantLayer). That function creates such a layer from a Numpy
    array of values. After compilation of the network by TensorRT, those
    weights are stored in the serialized TensorRT engine.

    Parameters:
        ndarray : numpy.ndarray
            The array of values (weights) encapsulated by this constant layer.

    Returns:
        The tensor produced by the inserted layer.
    '''
    trt_dtype = np_dtype_to_trt(ndarray.dtype) if as_dtype is None else as_dtype
    trt_shape = trt.Dims(
        ndarray.shape) if as_shape is None else trt.Dims(as_shape)
    trt_count = 1
    for i in range(len(trt_shape)):
        trt_count *= trt_shape[i]
    weights = trt.Weights(trt_dtype, ndarray.ctypes.data, trt_count)
    # Prevent underlying numpy array from going out of scope
    default_net().register_ndarray(ndarray)
    layer = default_trtnet().add_constant(trt_shape, weights)
    if not default_net().strongly_typed:
        layer.set_output_type(0, trt_dtype)
    tensor = _create_tensor(layer.get_output(0), layer)
    # TODO: remove this WAR after https://nvbugs/4359151 fixed.
    set_np_weight(default_trtnet(), layer.name, ndarray)
    return tensor


# TODO: TensorRT uses sizes of the output dimensions.
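
The comment "Prevent underlying numpy array from going out of scope" hints at why `register_ndarray` is called: the weights object is built from a raw address (`ndarray.ctypes.data`), which by itself does not keep the array alive. The following is a minimal sketch of that idea only. `NetworkSketch` and `make_pointer` are hypothetical stand-ins, not TensorRT-LLM classes, and the assumption that registration amounts to storing a reference is an inference from the comment rather than something the listing shows.

```python
import gc
import weakref

import numpy as np


class NetworkSketch:
    """Hypothetical stand-in for the network; models only register_ndarray."""

    def __init__(self):
        self._registered = []

    def register_ndarray(self, ndarray):
        # Holding a reference keeps the array's buffer allocated.
        self._registered.append(ndarray)


def make_pointer(net, register):
    """Create a temporary array and return its raw address plus a weak reference."""
    arr = np.arange(4, dtype=np.float32)
    if register:
        net.register_ndarray(arr)
    # The integer address carries no ownership of the buffer.
    return arr.ctypes.data, weakref.ref(arr)


net = NetworkSketch()
_, kept = make_pointer(net, register=True)
_, dropped = make_pointer(net, register=False)
gc.collect()

# The registered array is still reachable through the network object.
kept_alive = kept() is not None
# The unregistered array has no remaining owner; on CPython it is
# typically freed as soon as make_pointer returns.
dropped_alive = dropped() is not None

print(kept_alive, dropped_alive)
```

In the unregistered case the address returned by `make_pointer` may already point at freed memory, which is the situation the registration step in `constant` appears designed to avoid.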

Callers 15

_run_matmul · Method · 0.90

_validate_draft_tokens · Function · 0.90

_get_prefix_match_indices · Function · 0.90

_get_draft_token_indices · Function · 0.90

_get_draft_token_array · Function · 0.90

_get_mask · Function · 0.90

_get_indices_for_gather_beams · Function · 0.90

_gather_beams · Function · 0.90

_beam_search_candidates · Function · 0.90

gemm_swiglu · Function · 0.85

slice · Function · 0.85

pad · Function · 0.85

Calls 7

np_dtype_to_trt · Function · 0.85

default_net · Function · 0.85

default_trtnet · Function · 0.85

_create_tensor · Function · 0.85

set_np_weight · Function · 0.85

register_ndarray · Method · 0.80

get_output · Method · 0.45

Tested by 1

_run_matmul · Method · 0.72