Sort or "shuffle" the underlying object. "Shuffle" means the object is sorted so that all group members occur sequentially, in the same chunk. Multiple groups may occur in the same chunk. This method is particularly useful for chunked arrays (e.g. dask, cubed).
(self, chunks: T_Chunks = None)
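The idea in the summary above can be sketched in plain Python, with no xarray or dask involved. This is only an illustration of the end state (each group contiguous, no group split across chunks, several groups allowed per chunk). The helper name `shuffle_to_chunks_sketch`, its `chunk_size` argument, and the greedy packing policy are all assumptions made for this example. They are not the library's API and not how xarray or dask perform the shuffle.

```python
from collections import defaultdict


def shuffle_to_chunks_sketch(labels, values, chunk_size):
    """Hypothetical helper: group values by label, then pack whole groups
    into chunks so that no group is ever split across two chunks."""
    # Record the positions that belong to each group label.
    positions = defaultdict(list)
    for i, label in enumerate(labels):
        positions[label].append(i)

    chunks = []
    current = []
    for label in sorted(positions):
        members = [(label, values[i]) for i in positions[label]]
        # Assumed policy: close the current chunk if this group would
        # push it past the target size. A group is never split.
        if current and len(current) + len(members) > chunk_size:
            chunks.append(current)
            current = []
        current.extend(members)
    if current:
        chunks.append(current)
    return chunks


# Same labels and values as the docstring example: x coordinate and arange(10).
labels = [1, 2, 3, 1, 2, 3, 1, 2, 3, 0]
values = list(range(10))
chunks = shuffle_to_chunks_sketch(labels, values, chunk_size=4)
for chunk in chunks:
    print(chunk)
```

With these inputs the first chunk holds two groups (labels 0 and 1), which matches the statement that multiple groups may share a chunk. The real method decides chunk boundaries differently, and for chunked arrays the resulting order depends on the input chunking.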
        return self._sizes

    def shuffle_to_chunks(self, chunks: T_Chunks = None) -> T_Xarray:
        """
        Sort or "shuffle" the underlying object.

        "Shuffle" means the object is sorted so that all group members occur sequentially,
        in the same chunk. Multiple groups may occur in the same chunk.
        This method is most useful for chunked arrays (e.g. dask, cubed),
        particularly when you need to map a function that requires all members of a group
        to be present in a single chunk. For chunked array types, the order in which groups
        appear is not guaranteed and depends on the input chunking.

        Parameters
        ----------
        chunks : int, tuple of int, "auto" or mapping of hashable to int or tuple of int, optional
            How to adjust chunks along dimensions not present in the array being grouped by.

        Returns
        -------
        DataArray or Dataset
            The shuffled object, of the same type as the object being grouped.

        Examples
        --------
        >>> import dask.array
        >>> da = xr.DataArray(
        ...     dims="x",
        ...     data=dask.array.arange(10, chunks=3),
        ...     coords={"x": [1, 2, 3, 1, 2, 3, 1, 2, 3, 0]},
        ...     name="a",
        ... )
        >>> shuffled = da.groupby("x").shuffle_to_chunks()
        >>> shuffled
        <xarray.DataArray 'a' (x: 10)> Size: 80B
        dask.array<shuffle, shape=(10,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
        Coordinates:
          * x        (x) int64 80B 0 1 1 1 2 2 2 3 3 3

        >>> shuffled.groupby("x").quantile(q=0.5).compute()
        <xarray.DataArray 'a' (x: 4)> Size: 32B
        array([9., 3., 4., 5.])
        Coordinates:
          * x         (x) int64 32B 0 1 2 3
            quantile  float64 8B 0.5

        See Also
        --------
        dask.dataframe.DataFrame.shuffle
        dask.array.shuffle
        """
        self._raise_if_by_is_chunked()
        return self._shuffle_obj(chunks)

    def _shuffle_obj(self, chunks: T_Chunks) -> T_Xarray:
        from xarray.core.dataarray import DataArray