
Function slice

tensorrt_llm/functional.py:1222–1324

Add an operation to extract a slice from a tensor. As described in the TensorRT documentation of the ISliceLayer, the slice layer has two variants: static and dynamic. For static slicing, this function takes the starts and sizes values in the different dimensions to slice at layer creation time via a sequence of integers.

(input: Tensor,
          starts: Union[Tensor, Sequence[int]],
          sizes: Union[Tensor, Sequence[int]],
          strides: Union[Tensor, Sequence[int]] = None,
          mode: trt.SampleMode = None,
          fill_value: Union[float, Tensor] = None)

Source from the content-addressed store, hash-verified

# TODO: TensorRT uses sizes of the output dimensions.
# DL framework uses ends usually. Will change it to ends.
def slice(input: Tensor,
          starts: Union[Tensor, Sequence[int]],
          sizes: Union[Tensor, Sequence[int]],
          strides: Union[Tensor, Sequence[int]] = None,
          mode: trt.SampleMode = None,
          fill_value: Union[float, Tensor] = None) -> Tensor:
    '''
    Add an operation to extract a slice from a tensor.

    As described in the TensorRT documentation of the ISliceLayer, the slice
    layer has two variants: Static and dynamic.

    For static slicing, this function takes the starts and sizes values in the
    different dimensions to slice at layer creation time via a sequence of
    integers. For dynamic slicing, it accepts starts and sizes as
    `tensorrt.ITensor`s.

    The slice layer selects for each dimension a start location from within the
    input tensor, and copies elements to the output tensor using a stride of 1
    across the input tensor. Start and size tensors must be 1-D int32 shape
    tensors if not specified as a sequence of integers.

    As an example, on input = [[0, 2, 4], [1, 3, 5]], the call to

        slice(input, starts=[1, 0], sizes=[1, 2])

    will produce the tensor [[1, 3]] as output. The slice operator when
    executed by TensorRT will copy one row (because sizes[0] == 1) starting from
    the 2nd row (because starts[0] == 1) and two columns (sizes[1] == 2) starting
    from the 1st column (because starts[1] == 0).

    In pseudo-code the behavior of that operation can be described as follows
    for a 2D tensor (and easily be extended to more dimensions):

        output = Tensor(shape=sizes)
        for ii in range(sizes[0]):
            for jj in range(sizes[1]):
                output[ii][jj] = input[starts[0]+ii][starts[1]+jj]

    Note that it is common in deep-learning frameworks to use ranges
    [start:end] for similar operations. It can be emulated by setting the sizes
    argument such that in each dimension [start:start+size] == [start:end] i.e.
    size = end-start.

    TensorRT supports different slice modes but that function restricts that
    choice to `mode == tensorrt.SampleMode.STRICT_BOUNDS`.

    Parameters:
        input : Tensor
            The input tensor on which the slicing is performed.

        starts : Union[Tensor, Sequence[int]]
            The starting points, in the input tensor, and each dimension.

        sizes : Union[Tensor, Sequence[int]]
            The number of elements in each dimension of the sliced tensor (output).

        strides : Union[Tensor, Sequence[int]]
Callers 15

_validate_draft_tokens · Function · 0.90
_get_draft_token_array · Function · 0.90
warp_logits · Function · 0.90
unswizzle_data · Method · 0.85
_copy_to_cp · Function · 0.85
embedding · Function · 0.85
gegelu · Function · 0.85
split · Function · 0.85
allgather · Function · 0.85
rotate_every_two · Method · 0.85
rotate_half · Method · 0.85
apply_rotary_pos_emb · Method · 0.85

Calls 5

constant · Function · 0.85
default_trtnet · Function · 0.85
_create_tensor · Function · 0.85
ndim · Method · 0.45
get_output · Method · 0.45

Tested by 1

_copy_to_cp · Function · 0.68