Function rechunk

dask/array/rechunk.py:270–390

Convert blocks in dask array x for new chunks.

(
    x,
    chunks="auto",
    threshold=None,
    block_size_limit=None,
    balance=False,
    method=None,
)

Source from the content-addressed store, hash-verified

def rechunk(
    x,
    chunks="auto",
    threshold=None,
    block_size_limit=None,
    balance=False,
    method=None,
):
    """
    Convert blocks in dask array x for new chunks.

    Parameters
    ----------
    x: dask array
        Array to be rechunked.
    chunks: int, tuple, dict or str, optional
        The new block dimensions to create. -1 indicates the full size of the
        corresponding dimension. Default is "auto" which automatically
        determines chunk sizes.
    threshold: int, optional
        The graph growth factor under which we don't bother introducing an
        intermediate step.
    block_size_limit: int, optional
        The maximum block size (in bytes) we want to produce
        Defaults to the configuration value ``array.chunk-size``
    balance : bool, default False
        If True, try to make each chunk to be the same size.

        This means ``balance=True`` will remove any small leftover chunks, so
        using ``x.rechunk(chunks=len(x) // N, balance=True)``
        will almost certainly result in ``N`` chunks.
    method: {'tasks', 'p2p'}, optional.
        Rechunking method to use.


    Examples
    --------
    >>> import dask.array as da
    >>> x = da.ones((1000, 1000), chunks=(100, 100))

    Specify uniform chunk sizes with a tuple

    >>> y = x.rechunk((1000, 10))

    Or chunk only specific dimensions with a dictionary

    >>> y = x.rechunk({0: 1000})

    Use the value ``-1`` to specify that you want a single chunk along a
    dimension or the value ``"auto"`` to specify that dask can freely rechunk a
    dimension to attain blocks of a uniform block size

    >>> y = x.rechunk({0: -1, 1: 'auto'}, block_size_limit=1e8)

    If a chunk size does not divide the dimension then rechunk will leave any
    unevenness to the last chunk.

    >>> x.rechunk(chunks=(400, -1)).chunks
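
The docstring's remarks about uneven last chunks and about `balance=True` can be illustrated with a little arithmetic. The following is a minimal pure-Python sketch and does not use dask: `uniform_chunks` and `balanced_chunks` are hypothetical helpers invented for this illustration, and the ceil-based balancing shown here is an assumption, not necessarily the strategy dask applies internally.

```python
import math


def uniform_chunks(dim_len, chunk_size):
    """Split one dimension of length dim_len into chunks of chunk_size.

    Hypothetical helper for illustration only; not dask's implementation.
    A chunk_size of -1 means a single chunk spanning the whole dimension.
    """
    if chunk_size == -1:
        return (dim_len,)
    full, remainder = divmod(dim_len, chunk_size)
    # Any unevenness is left to the last chunk.
    return (chunk_size,) * full + ((remainder,) if remainder else ())


def balanced_chunks(dim_len, n_chunks):
    """Split dim_len into roughly n_chunks chunks of near-equal size.

    Uses a simple ceil-based chunk size (an assumption made for this
    sketch). It avoids a tiny leftover chunk but does not always yield
    exactly n_chunks.
    """
    size = math.ceil(dim_len / n_chunks)
    return uniform_chunks(dim_len, size)


print(uniform_chunks(1000, 400))         # (400, 400, 200)
print(uniform_chunks(1000, -1))          # (1000,)
print(uniform_chunks(1001, 1001 // 10))  # ten chunks of 100 plus a leftover chunk of 1
print(balanced_chunks(1001, 10))         # nine chunks of 101 and a last chunk of 92
```

With a length of 1001 and a chunk size of `1001 // 10`, plain uniform chunking produces eleven chunks because of the one-element leftover. That leftover is the situation the `balance` parameter is described as removing.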

Callers 5

rechunk · Method · 0.90
test_rechunk_1d · Function · 0.90
test_rechunk_2d · Function · 0.90
test_rechunk_4d · Function · 0.90
test_rechunk_blockshape · Function · 0.90

Calls 9

validate_axis · Function · 0.90
normalize_chunks · Function · 0.90
all · Function · 0.85
_balance_chunksizes · Function · 0.85
_validate_rechunk · Function · 0.85
_choose_rechunk_method · Function · 0.85
plan_rechunk · Function · 0.85
_compute_rechunk · Function · 0.70
items · Method · 0.45

Tested by 4

test_rechunk_1d · Function · 0.72
test_rechunk_2d · Function · 0.72
test_rechunk_4d · Function · 0.72
test_rechunk_blockshape · Function · 0.72
