hub / github.com/NVIDIA/TensorRT-LLM / compaction

Function compaction

triton_kernels/compaction.py:10–48

Return compacted copies of *yv* and *yi* based on a per-row bitmask. Only the elements whose index appears among the active bits of *bitmask* are kept; the rest are replaced by *sentinel*. Kept elements preserve their original left-to-right order.

(yv, yi, bitmask, sentinel=-1)
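The per-row behaviour described above can be modelled without any dependencies. The sketch below is not code from the repository: `compact_row` and `mask_word` are hypothetical names, and it assumes that "compacted" means kept entries are packed at the front of the row with *sentinel* filling the tail, and that a row's mask is a single integer whose bit j marks index j as active.

```python
def compact_row(values, indices, mask_word, sentinel=-1):
    """Model one row: keep entries whose index bit is set in mask_word."""
    # Keep (value, index) pairs in their original left-to-right order.
    kept = [(v, i) for v, i in zip(values, indices) if (mask_word >> i) & 1]
    # Dropped entries free up slots at the end of the row.
    pad = len(values) - len(kept)
    out_values = [v for v, _ in kept] + [sentinel] * pad
    out_indices = [i for _, i in kept] + [sentinel] * pad
    return out_values, out_indices


# Bits 1 and 5 are active, so only the entries with index 5 and 1 survive.
row_values, row_indices = compact_row([0.1, 0.2, 0.3, 0.4], [3, 0, 5, 1], 0b100010)
print(row_values)   # [0.3, 0.4, -1, -1]
print(row_indices)  # [5, 1, -1, -1]
```

Output rows keep the input length K, which is why the function can return tensors of the same shape as its inputs.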

Source from the content-addressed store, hash-verified

def compaction(yv, yi, bitmask, sentinel=-1):
    """
    Return compacted copies of *yv* and *yi* based on a per-row bitmask.

    Only the elements whose index appears among the active bits of *bitmask*
    are kept; the rest are replaced by *sentinel*. Kept elements preserve
    their original left-to-right order.

    Parameters
    ----------
    yv : torch.Tensor, shape (B, K)
        Values tensor.
    yi : torch.Tensor, shape (B, K), dtype torch.long
        Integer indices (0 ≤ index < 32) associated with *yv*.
    bitmask : torch.Tensor, shape (B,) **or** (B, 32)
        Per-row mask of active indices. See the in-place version for details.
    sentinel : int, default -1
        Value written into dropped positions of the returned tensors.

    Returns
    -------
    (yv_out, yi_out) : Tuple[torch.Tensor, torch.Tensor], each shape (B, K)
        New tensors with the same dtype/device as the inputs.

    """

    n_rows, n_cols = yi.shape
    ret_yv = torch.empty_like(yv)
    ret_yi = torch.empty_like(yi)
    # Unwrap a Bitmatrix to its underlying storage tensor.
    if isinstance(bitmask, Bitmatrix):
        bitmask = bitmask.storage.data

    # Launch one kernel program per row.
    _masked_compaction[(n_rows, )](
        yv, yi, bitmask, bitmask.stride(0), bitmask.stride(1),  # inputs
        ret_yv, ret_yi,  # outputs
        sentinel,  # sentinel
        K=n_cols  # constants
    )
    return ret_yv, ret_yi


def compaction_torch(yv: torch.Tensor, yi: torch.Tensor, bitmask: torch.Tensor, sentinel=-1):
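For checking results on a machine without a GPU, the documented semantics can be approximated in plain PyTorch. This is a minimal sketch under stated assumptions, not the body of `compaction_torch` (which is not shown here): `compaction_sketch` is a hypothetical name, *bitmask* is assumed to be a `(B,)` int64 tensor in which bit j of row b marks index j as active, and kept entries are assumed to be packed at the front of each row.

```python
import torch


def compaction_sketch(yv, yi, bitmask, sentinel=-1):
    """Batched model of the documented semantics (assumed (B,) int64 bitmask)."""
    # keep[b, k] is True when bit yi[b, k] of bitmask[b] is set.
    keep = ((bitmask.unsqueeze(1) >> yi) & 1).bool()
    # A stable sort on the "dropped" flag moves kept entries to the front
    # without changing their relative order.
    perm = torch.sort((~keep).long(), dim=1, stable=True).indices
    keep_sorted = keep.gather(1, perm)
    # Reorder both tensors, then overwrite the dropped tail with the sentinel.
    yv_out = yv.gather(1, perm).masked_fill(~keep_sorted, sentinel)
    yi_out = yi.gather(1, perm).masked_fill(~keep_sorted, sentinel)
    return yv_out, yi_out


yv = torch.tensor([[10.0, 20.0, 30.0, 40.0], [50.0, 60.0, 70.0, 80.0]])
yi = torch.tensor([[3, 0, 5, 1], [0, 1, 2, 3]])
bitmask = torch.tensor([0b100010, 0b1])  # row 0: indices 1 and 5; row 1: index 0
yv_out, yi_out = compaction_sketch(yv, yi, bitmask)
```

The stable sort stands in for the prefix-sum style of scatter a GPU kernel would typically use; it is chosen here only because it is short and easy to verify by hand.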

Callers 2

forward · Method · 0.90
prune_routing_ep · Method · 0.90

Calls 1

stride · Method · 0.80

Tested by

no test coverage detected