hub / github.com/NVIDIA/TensorRT-LLM / slice

Function slice

tensorrt_llm/functional.py:1222–1324

Add an operation to extract a slice from a tensor. As described in the TensorRT documentation of the ISliceLayer, the slice layer has two variants: static and dynamic. For static slicing, this function takes the starts and sizes values in the different dimensions to slice at layer creation time via a sequence of integers.

(input: Tensor,
 starts: Union[Tensor, Sequence[int]],
 sizes: Union[Tensor, Sequence[int]],
 strides: Union[Tensor, Sequence[int]] = None,
 mode: trt.SampleMode = None,
 fill_value: Union[float, Tensor] = None)
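The static variant described above (copy sizes[d] elements starting at starts[d], with a stride of 1) can be modeled without TensorRT. The following is only a minimal pure-Python sketch for 2-D nested lists; the helper name slice_2d_reference is hypothetical and not part of TensorRT-LLM, and the bounds check is an assumption that loosely mirrors the strict-bounds mode mentioned in the docstring.

```python
from typing import List, Sequence


def slice_2d_reference(data: Sequence[Sequence[int]],
                       starts: Sequence[int],
                       sizes: Sequence[int]) -> List[List[int]]:
    """Pure-Python model of a static 2-D slice with stride 1 (illustrative only)."""
    rows, cols = len(data), len(data[0])
    # Assumption: reject windows that leave the input, similar in spirit to
    # a strict-bounds sampling mode.
    if starts[0] < 0 or starts[1] < 0:
        raise ValueError("starts must be non-negative")
    if starts[0] + sizes[0] > rows or starts[1] + sizes[1] > cols:
        raise ValueError("slice window exceeds the input bounds")
    # output[ii][jj] = input[starts[0] + ii][starts[1] + jj]
    return [[data[starts[0] + ii][starts[1] + jj] for jj in range(sizes[1])]
            for ii in range(sizes[0])]


example = [[0, 2, 4], [1, 3, 5]]
result = slice_2d_reference(example, starts=[1, 0], sizes=[1, 2])
print(result)  # [[1, 3]]
```

The printed result matches the docstring example: one row starting from the 2nd row, two columns starting from the 1st column.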

Source from the content-addressed store, hash-verified

# TODO: TensorRT uses sizes of the output dimensions.
# DL framework uses ends usually. Will change it to ends.
def slice(input: Tensor,
          starts: Union[Tensor, Sequence[int]],
          sizes: Union[Tensor, Sequence[int]],
          strides: Union[Tensor, Sequence[int]] = None,
          mode: trt.SampleMode = None,
          fill_value: Union[float, Tensor] = None) -> Tensor:
    '''
    Add an operation to extract a slice from a tensor.

    As described in the TensorRT documentation of the ISliceLayer, the slice
    layer has two variants: Static and dynamic.

    For static slicing, this function takes the starts and sizes values in the
    different dimensions to slice at layer creation time via a sequence of
    integers. For dynamic slicing, it accepts starts and sizes as
    `tensorrt.ITensor`s.

    The slice layer selects for each dimension a start location from within the
    input tensor, and copies elements to the output tensor using a stride of 1
    across the input tensor. Start and size tensors must be 1-D int32 shape
    tensors if not specified as a sequence of integers.

    As an example, on input = [[0, 2, 4], [1, 3, 5]], the call to

        slice(input, starts=[1, 0], sizes=[1, 2])

    will produce the tensor [[1, 3]] as output. The slice operator when
    executed by TensorRT will copy one row (because sizes[0] == 1) starting from
    the 2nd row (because starts[0] == 1) and two columns (sizes[1] == 2) starting
    from the 1st column (because starts[1] == 0).

    In pseudo-code the behavior of that operation can be described as follows
    for a 2D tensor (and can easily be extended to more dimensions):

        output = Tensor(shape=sizes)
        for ii in range(sizes[0]):
            for jj in range(sizes[1]):
                output[ii][jj] = input[starts[0]+ii][starts[1]+jj]

    Note that it is common in deep-learning frameworks to use ranges
    [start:end] for similar operations. It can be emulated by setting the sizes
    argument such that in each dimension [start:start+size] == [start:end] i.e.
    size = end-start.

    TensorRT supports different slice modes but this function restricts that
    choice to `mode == tensorrt.SampleMode.STRICT_BOUNDS`.

    Parameters:
        input : Tensor
            The input tensor on which the slicing is performed.

        starts : Union[Tensor, Sequence[int]]
            The starting points, in the input tensor, for each dimension.

        sizes : Union[Tensor, Sequence[int]]
            The number of elements in each dimension of the sliced tensor (output).

        strides : Union[Tensor, Sequence[int]]
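The docstring notes that a [start:end] range can be emulated by choosing size = end-start in each dimension. A minimal sketch of that conversion follows; the helper name sizes_from_ends is hypothetical (it is not a TensorRT-LLM function) and it assumes plain integer sequences, as in the static variant.

```python
from typing import List, Sequence


def sizes_from_ends(starts: Sequence[int], ends: Sequence[int]) -> List[int]:
    """Convert per-dimension [start:end] ranges into sizes (size = end - start)."""
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")
    sizes = [end - start for start, end in zip(starts, ends)]
    # A negative size would mean the range ends before it starts.
    if any(size < 0 for size in sizes):
        raise ValueError("each end must be >= its start")
    return sizes


# The range [1:2, 0:2] corresponds to starts=[1, 0] and sizes=[1, 2].
sizes = sizes_from_ends(starts=[1, 0], ends=[2, 2])
print(sizes)  # [1, 2]
```

The resulting list could then be passed as the sizes argument together with the same starts, which reproduces the [[1, 3]] example from the docstring.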

Callers 15

_validate_draft_tokens · Function · 0.90

_get_draft_token_array · Function · 0.90

warp_logits · Function · 0.90

unswizzle_data · Method · 0.85

_copy_to_cp · Function · 0.85

embedding · Function · 0.85

gegelu · Function · 0.85

split · Function · 0.85

allgather · Function · 0.85

rotate_every_two · Method · 0.85

rotate_half · Method · 0.85

apply_rotary_pos_emb · Method · 0.85

Calls 5

constant · Function · 0.85

default_trtnet · Function · 0.85

_create_tensor · Function · 0.85

ndim · Method · 0.45

get_output · Method · 0.45

Tested by 1

_copy_to_cp · Function · 0.68