hub / github.com/NVIDIA/TensorRT-LLM / constant

Function constant

tensorrt_llm/functional.py:1184–1217

Add a constant layer. TensorRT graphs encapsulate constant values in the form of constant layers (tensorrt.IConstantLayer). This function creates such a layer from a NumPy array of values. After compilation of the network by TensorRT, those weights are stored in the serialized TensorRT engine.

(ndarray: np.ndarray,
             as_dtype: trt.DataType | None = None,
             as_shape=None)
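Both optional parameters override values that would otherwise be read from the array itself. In the source listing, the shape falls back to `ndarray.shape` when `as_shape` is `None`, and the element count handed to TensorRT is the product of that shape's dimensions. A minimal sketch of that fallback in plain Python follows; `resolve_shape_and_count` is a hypothetical helper written for illustration and is not part of TensorRT-LLM.

```python
import math
from typing import Optional, Sequence, Tuple


def resolve_shape_and_count(
        array_shape: Sequence[int],
        as_shape: Optional[Sequence[int]] = None
) -> Tuple[Tuple[int, ...], int]:
    """Hypothetical helper mirroring the shape fallback in `constant`."""
    # Use the caller-supplied shape when given, otherwise the array's own shape.
    shape = tuple(array_shape) if as_shape is None else tuple(as_shape)
    # The element count is the product of the dimensions (1 for a 0-d shape).
    count = math.prod(shape)
    return shape, count


print(resolve_shape_and_count((2, 3)))                 # ((2, 3), 6)
print(resolve_shape_and_count((6,), as_shape=(3, 2)))  # ((3, 2), 6)
```

Because the count is derived from the resolved shape rather than from the array, a caller who passes `as_shape` presumably needs to make sure the array's buffer holds at least that many elements.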

Source from the content-addressed store, hash-verified

def constant(ndarray: np.ndarray,
             as_dtype: trt.DataType | None = None,
             as_shape=None) -> Tensor:
    '''
    Add a constant layer.

    TensorRT graphs encapsulate constant values in the form of constant layers
    (tensorrt.IConstantLayer). That function creates such a layer from a Numpy
    array of values. After compilation of the network by TensorRT, those
    weights are stored in the serialized TensorRT engine.

    Parameters:
        ndarray : numpy.ndarray
            The array of values (weights) encapsulated by this constant layer.

    Returns:
        The tensor produced by the inserted layer.
    '''
    trt_dtype = np_dtype_to_trt(ndarray.dtype) if as_dtype is None else as_dtype
    trt_shape = trt.Dims(
        ndarray.shape) if as_shape is None else trt.Dims(as_shape)
    trt_count = 1
    for i in range(len(trt_shape)):
        trt_count *= trt_shape[i]
    weights = trt.Weights(trt_dtype, ndarray.ctypes.data, trt_count)
    # Prevent underlying numpy array from going out of scope
    default_net().register_ndarray(ndarray)
    layer = default_trtnet().add_constant(trt_shape, weights)
    if not default_net().strongly_typed:
        layer.set_output_type(0, trt_dtype)
    tensor = _create_tensor(layer.get_output(0), layer)
    # TODO: remove this WAR after https://nvbugs/4359151 fixed.
    set_np_weight(default_trtnet(), layer.name, ndarray)
    return tensor


# TODO: TensorRT uses sizes of the output dimensions.
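The listing hands TensorRT a raw address (`ndarray.ctypes.data`) and then registers the array with the network so that it does not go out of scope. The sketch below illustrates why that registration matters, using only the Python standard library. `RawWeights`, `Network`, `register_buffer`, and `make_weights` are hypothetical stand-ins invented for this example; they are not the TensorRT or TensorRT-LLM API.

```python
import array
import ctypes
from dataclasses import dataclass, field
from typing import List


@dataclass
class RawWeights:
    """Hypothetical weights handle: a raw address plus an element count."""
    address: int
    count: int


@dataclass
class Network:
    """Hypothetical owner of the buffers that raw addresses point into."""
    _buffers: List[array.array] = field(default_factory=list)

    def register_buffer(self, buf: array.array) -> None:
        # Holding a reference keeps the buffer's memory allocated.
        self._buffers.append(buf)


def make_weights(net: Network, values: List[float]) -> RawWeights:
    buf = array.array("f", values)      # contiguous C-float storage
    address, count = buf.buffer_info()  # raw address and element count
    net.register_buffer(buf)            # without this, `buf` is freed on return
    return RawWeights(address, count)


net = Network()
w = make_weights(net, [1.0, 2.0, 3.0])
# Read the values back through the raw address. This is only safe because
# `net` still holds a reference to the underlying buffer.
view = (ctypes.c_float * w.count).from_address(w.address)
print(list(view))  # [1.0, 2.0, 3.0]
```

If the `register_buffer` call were dropped, the address stored in `RawWeights` would dangle as soon as `make_weights` returned, which is the failure mode the comment in the listing guards against.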

Callers 15

_run_matmul · Method · 0.90
_validate_draft_tokens · Function · 0.90
_get_draft_token_indices · Function · 0.90
_get_draft_token_array · Function · 0.90
_get_mask · Function · 0.90
_gather_beams · Function · 0.90
_beam_search_candidates · Function · 0.90
gemm_swiglu · Function · 0.85
slice · Function · 0.85
pad · Function · 0.85

Calls 7

np_dtype_to_trt · Function · 0.85
default_net · Function · 0.85
default_trtnet · Function · 0.85
_create_tensor · Function · 0.85
set_np_weight · Function · 0.85
register_ndarray · Method · 0.80
get_output · Method · 0.45

Tested by 1

_run_matmul · Method · 0.72