
Function fake_target_chunksize

xarray/namedarray/utils.py:267–300

The `normalize_chunks` algorithm takes a size `limit` in bytes, but will not work for object dtypes. So we rescale the `limit` to an appropriate one based on `float64` dtype, and pass that to `normalize_chunks`.
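As a rough illustration of the rescaling described above, the sketch below applies the same ratio outside xarray. `rescale_limit` and `FLOAT64_ITEMSIZE` are hypothetical names introduced here, and the per-element size is passed in explicitly rather than measured from real data.

```python
FLOAT64_ITEMSIZE = 8  # bytes per float64 element


def rescale_limit(limit: int, nbytes_approx: int) -> int:
    """Scale a byte limit by the ratio of float64 size to an approximate element size.

    Hypothetical helper: it mirrors only the arithmetic of the rescaling step.
    """
    return int(limit * (FLOAT64_ITEMSIZE / nbytes_approx))


# A 128 MiB budget for elements assumed to take about 64 bytes each
# becomes a 16 MiB budget when expressed in float64 terms.
rescaled = rescale_limit(128 * 1024**2, 64)
print(rescaled)  # 16777216
```

When the element size equals the `float64` item size, the limit is left unchanged.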

(
    data: DuckArray[Any],
    limit: int,
)

Source from the content-addressed store, hash-verified

def fake_target_chunksize(
    data: DuckArray[Any],
    limit: int,
) -> tuple[int, np.dtype[Any]]:
    """
    The `normalize_chunks` algorithm takes a size `limit` in bytes, but will not
    work for object dtypes. So we rescale the `limit` to an appropriate one based
    on `float64` dtype, and pass that to `normalize_chunks`.

    Arguments
    ---------
    data : Variable or ChunkedArray
        The data for which we want to determine chunk sizes.
    limit : int
        The target chunk size in bytes. Passed to the chunk manager's `normalize_chunks` method.
    """

    # Short circuit for non-object dtypes
    from xarray.core.common import _contains_cftime_datetimes

    if not _contains_cftime_datetimes(data):
        return limit, data.dtype

    from xarray.core.formatting import first_n_items

    output_dtype = np.dtype(np.float64)

    nbytes_approx: int = sys.getsizeof(first_n_items(data, 1))  # type: ignore[no-untyped-call]

    f64_nbytes = output_dtype.itemsize

    limit = int(limit * (f64_nbytes / nbytes_approx))

    return limit, output_dtype
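One way to read the final rescaling step: a chunker that budgets bytes as if every element were a `float64` should end up with about the same number of elements per chunk as the original limit would allow at the measured size. The sketch below checks that arithmetic in plain Python. `elements_per_chunk` is a hypothetical helper, and `sys.getsizeof` on a `datetime.datetime` is only a stand-in for whatever size the function measures from the first item of the data.

```python
import sys
from datetime import datetime


def elements_per_chunk(limit: int, itemsize: int) -> int:
    """Number of whole elements of `itemsize` bytes that fit in `limit` bytes."""
    return limit // itemsize


limit = 128 * 1024**2  # assumed 128 MiB target, for illustration only
f64_nbytes = 8

# Stand-in measurement; the exact value depends on the Python build.
nbytes_approx = sys.getsizeof(datetime(2000, 1, 1))

# Same arithmetic as the last step of the function above.
rescaled_limit = int(limit * (f64_nbytes / nbytes_approx))

direct = elements_per_chunk(limit, nbytes_approx)
via_float64 = elements_per_chunk(rescaled_limit, f64_nbytes)
print(direct, via_float64)
```

The two counts should agree up to rounding, which is the sense in which the rescaled limit is "appropriate" for a `float64`-based chunk calculation.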

Callers 3

_get_chunk · Function · 0.85

Calls 3

first_n_items · Function · 0.90
dtype · Method · 0.45

Tested by 2
