hub / github.com/pydata/xarray / shuffle_to_chunks

Method shuffle_to_chunks

xarray/core/groupby.py:715–764

Sort or "shuffle" the underlying object. "Shuffle" means the object is sorted so that all group members occur sequentially, in the same chunk. Multiple groups may occur in the same chunk. This method is particularly useful for chunked arrays (e.g. dask, cubed).

(self, chunks: T_Chunks = None)
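To make the description above concrete, here is a toy model of what "shuffling to chunks" means, written in plain Python. It is only a sketch of the idea and not xarray's or dask's implementation: the helper name `pack_groups_into_chunks` and the `max_chunk` parameter are hypothetical, and the packing rule (start a new chunk when the next group would not fit) is an assumption made for illustration.

```python
from itertools import groupby


def pack_groups_into_chunks(labels, values, max_chunk):
    """Toy model: sort by label, then pack whole groups into chunks.

    Hypothetical helper for illustration only; it does not mirror how
    xarray or dask choose chunk boundaries.
    """
    # Stable sort of positions by label, so members of a group become contiguous.
    order = sorted(range(len(labels)), key=lambda i: labels[i])

    chunks, current = [], []
    for _, positions in groupby(order, key=lambda i: labels[i]):
        group = [(labels[i], values[i]) for i in positions]
        # Close the current chunk if this group would not fit,
        # but never split a group across two chunks.
        if current and len(current) + len(group) > max_chunk:
            chunks.append(current)
            current = []
        current.extend(group)
    if current:
        chunks.append(current)
    return chunks


labels = [1, 2, 3, 1, 2, 3, 1, 2, 3, 0]
values = list(range(10))
chunks = pack_groups_into_chunks(labels, values, max_chunk=4)
for chunk in chunks:
    print(chunk)
```

In this toy run the first chunk holds groups 0 and 1 together, which shows that several groups may share a chunk, while no group is ever split between chunks.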

Source

        return self._sizes

    def shuffle_to_chunks(self, chunks: T_Chunks = None) -> T_Xarray:
        """
        Sort or "shuffle" the underlying object.

        "Shuffle" means the object is sorted so that all group members occur sequentially,
        in the same chunk. Multiple groups may occur in the same chunk.
        This method is useful for chunked arrays (e.g. dask, cubed),
        particularly when you need to map a function that requires all members of a group
        to be present in a single chunk. For chunked array types, the order of appearance
        is not guaranteed, but will depend on the input chunking.

        Parameters
        ----------
        chunks : int, tuple of int, "auto" or mapping of hashable to int or tuple of int, optional
            How to adjust chunks along dimensions not present in the array being grouped by.

        Returns
        -------
        DataArray or Dataset
            The shuffled object, of the same type as the object being grouped.

        Examples
        --------
        >>> import dask.array
        >>> da = xr.DataArray(
        ...     dims="x",
        ...     data=dask.array.arange(10, chunks=3),
        ...     coords={"x": [1, 2, 3, 1, 2, 3, 1, 2, 3, 0]},
        ...     name="a",
        ... )
        >>> shuffled = da.groupby("x").shuffle_to_chunks()
        >>> shuffled
        <xarray.DataArray 'a' (x: 10)> Size: 80B
        dask.array<shuffle, shape=(10,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
        Coordinates:
          * x        (x) int64 80B 0 1 1 1 2 2 2 3 3 3

        >>> shuffled.groupby("x").quantile(q=0.5).compute()
        <xarray.DataArray 'a' (x: 4)> Size: 32B
        array([9., 3., 4., 5.])
        Coordinates:
          * x         (x) int64 32B 0 1 2 3
            quantile  float64 8B 0.5

        See Also
        --------
        dask.dataframe.DataFrame.shuffle
        dask.array.shuffle
        """
        self._raise_if_by_is_chunked()
        return self._shuffle_obj(chunks)

    def _shuffle_obj(self, chunks: T_Chunks) -> T_Xarray:
        from xarray.core.dataarray import DataArray
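The quantile values shown in the docstring example can be checked by hand. The sketch below uses only the Python standard library, not xarray or dask, and reuses the labels and data from that example (`dask.array.arange(10)` yields the integers 0 through 9). It groups the values by label and takes the median of each group, which is what `quantile(q=0.5)` computes.

```python
from collections import defaultdict
from statistics import median

# Same labels and data as the docstring example.
labels = [1, 2, 3, 1, 2, 3, 1, 2, 3, 0]
values = list(range(10))

# Collect the members of each group.
members = defaultdict(list)
for label, value in zip(labels, values):
    members[label].append(value)

# Median per group, ordered by label.
medians = {label: median(group) for label, group in sorted(members.items())}
print(medians)  # {0: 9, 1: 3, 2: 4, 3: 5}
```

These agree with `array([9., 3., 4., 5.])` in the docstring output. A per-group median needs every member of a group at once, which is why having each group in a single chunk matters for chunked arrays.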

Callers 8

test_groupby_drops_nans · Function · 0.45
test_groupby_bins · Method · 0.45
test_resample · Method · 0.45
test_multiple_groupers · Function · 0.45
test_shuffle_simple · Function · 0.45
test_shuffle_by · Function · 0.45

Calls 2

_shuffle_obj · Method · 0.95

Tested by 8

test_groupby_drops_nans · Function · 0.36
test_groupby_bins · Method · 0.36
test_resample · Method · 0.36
test_multiple_groupers · Function · 0.36
test_shuffle_simple · Function · 0.36
test_shuffle_by · Function · 0.36