def fake_target_chunksize(
    data: DuckArray[Any],
    limit: int,
) -> tuple[int, np.dtype[Any]]:
    """
    The `normalize_chunks` algorithm takes a size `limit` in bytes, but will not
    work for object dtypes. So we rescale the `limit` to an appropriate one based
    on `float64` dtype, and pass that to `normalize_chunks`.

    Arguments
    ---------
    data : Variable or ChunkedArray
        The data for which we want to determine chunk sizes.
    limit : int
        The target chunk size in bytes. Passed to the chunk manager's `normalize_chunks` method.
    """

    # Short circuit for non-object dtypes
    from xarray.core.common import _contains_cftime_datetimes

    if not _contains_cftime_datetimes(data):
        return limit, data.dtype

    from xarray.core.formatting import first_n_items

    output_dtype = np.dtype(np.float64)

    nbytes_approx: int = sys.getsizeof(first_n_items(data, 1))  # type: ignore[no-untyped-call]

    f64_nbytes = output_dtype.itemsize

    limit = int(limit * (f64_nbytes / nbytes_approx))

    return limit, output_dtype


class ReprObject:
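The rescaling step in `fake_target_chunksize` can be illustrated in isolation. The sketch below is not part of xarray: `rescale_limit` is a hypothetical helper, the 128 MiB limit and 64-byte object size are made-up numbers, and `datetime` is used only as a stand-in for a Python object whose size is measured with `sys.getsizeof`. It shows that shrinking the byte limit by the ratio `f64_nbytes / nbytes_approx` lets a chunk of `float64` values hold roughly the same number of elements as the original limit would hold of the larger objects.

```python
import sys
from datetime import datetime

# Assumption: 8 bytes, i.e. the value of np.dtype(np.float64).itemsize.
F64_NBYTES = 8


def rescale_limit(limit: int, approx_item_nbytes: int) -> int:
    """Hypothetical helper mirroring the rescaling arithmetic above.

    Scales a byte limit so that float64 elements stand in for objects
    that occupy roughly `approx_item_nbytes` bytes each.
    """
    return int(limit * (F64_NBYTES / approx_item_nbytes))


# Made-up numbers: a 128 MiB target and objects of about 64 bytes each.
limit = 128 * 2**20
scaled = rescale_limit(limit, 64)

# The element count is preserved: the original limit divided by the object
# size equals the scaled limit divided by the float64 size.
assert limit // 64 == scaled // F64_NBYTES

# With a measured size instead of a made-up one (the value is platform dependent).
sample = datetime(2000, 1, 1)
print(rescale_limit(limit, sys.getsizeof(sample)))
```

When the measured size equals 8 bytes the limit is returned unchanged, which matches the intent of treating `float64` as the reference dtype.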