hub / github.com/dask/dask / make_block_sorted_slices

Function make_block_sorted_slices

dask/array/slicing.py:1157–1208

Generate blockwise-sorted index pairs for shuffling an array.

(index, chunks)

Source from the content-addressed store, hash-verified

def make_block_sorted_slices(index, chunks):
    """Generate blockwise-sorted index pairs for shuffling an array.

    Parameters
    ----------
    index : ndarray
        An array of index positions.
    chunks : tuple
        Chunks from the original dask array

    Returns
    -------
    index2 : ndarray
        Same values as `index`, but each block has been sorted
    index3 : ndarray
        The location of the values of `index` in `index2`

    Examples
    --------
    >>> index = np.array([6, 0, 4, 2, 7, 1, 5, 3])
    >>> chunks = ((4, 4),)
    >>> a, b = make_block_sorted_slices(index, chunks)

    Notice that the first set of 4 items are sorted, and the
    second set of 4 items are sorted.

    >>> a
    array([0, 2, 4, 6, 1, 3, 5, 7])
    >>> b
    array([3, 0, 2, 1, 7, 4, 6, 5])
    """
    from dask.array.core import slices_from_chunks

    slices = slices_from_chunks(chunks)

    if len(slices[0]) > 1:
        slices = [slice_[0] for slice_ in slices]

    offsets = np.roll(np.cumsum(chunks[0]), 1)
    offsets[0] = 0

    index2 = np.empty_like(index)
    index3 = np.empty_like(index)

    for slice_, offset in zip(slices, offsets):
        a = index[slice_]
        b = np.sort(a)
        c = offset + np.argsort(b.take(np.argsort(a)))
        index2[slice_] = b
        index3[slice_] = c

    return index2, index3
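
The docstring says that `index3` gives the location of each value of `index` inside `index2`. The sketch below restates that relationship with plain Python lists, without NumPy or dask, so the per-block logic is easier to follow. It is an illustration only and not dask's implementation: the name `block_sorted_pairs` and the flat `block_sizes` argument (standing in for `chunks[0]`) are assumptions made for this example.

```python
from itertools import accumulate


def block_sorted_pairs(index, block_sizes):
    """Pure-Python sketch: sort each block of `index` and record where
    every original value ends up in the block-sorted result.

    `block_sorted_pairs` and `block_sizes` are hypothetical names used
    for illustration; they are not part of dask.
    """
    # Start offset of each block, e.g. (4, 4) -> [0, 4]
    offsets = [0] + list(accumulate(block_sizes))[:-1]

    sorted_index = []  # plays the role of index2
    positions = []     # plays the role of index3
    for offset, size in zip(offsets, block_sizes):
        block = index[offset:offset + size]
        # order[k] is the position within `block` of the k-th smallest value
        order = sorted(range(len(block)), key=block.__getitem__)
        # rank[i] is where block[i] lands once the block is sorted
        rank = [0] * len(block)
        for sorted_pos, original_pos in enumerate(order):
            rank[original_pos] = sorted_pos
        sorted_index.extend(block[i] for i in order)
        positions.extend(offset + r for r in rank)
    return sorted_index, positions


index = [6, 0, 4, 2, 7, 1, 5, 3]
a, b = block_sorted_pairs(index, (4, 4))
print(a)  # [0, 2, 4, 6, 1, 3, 5, 7]
print(b)  # [3, 0, 2, 1, 7, 4, 6, 5]

# Indexing the block-sorted values by the positions restores the original order.
restored = [a[i] for i in b]
print(restored == index)  # True
```

The outputs match the docstring example. The final check shows the property that makes the pair useful for shuffling: selecting with the block-sorted index first and then with the position index yields the same ordering as selecting with the original `index` directly.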

Callers 2

shuffle_slice · Function · 0.85

Calls 3

slices_from_chunks · Function · 0.90
take · Method · 0.80
cumsum · Method · 0.45

Tested by 1
