hub / github.com/huggingface/datasets / cast_to_python_objects

Function cast_to_python_objects

src/datasets/features/features.py:468–490

Casts NumPy/PyTorch/TensorFlow/pandas objects to Python lists, recursively. If `optimize_list_casting` is True, it avoids iterating over possibly long lists by first checking (recursively) whether the first element that is not None or empty (if it is a sequence) has to be cast. If that element needs casting, every element of the list is cast; otherwise the list is left as it is.

(obj: Any, only_1d_for_numpy=False, optimize_list_casting=True)


def cast_to_python_objects(obj: Any, only_1d_for_numpy=False, optimize_list_casting=True) -> Any:
    """
    Cast numpy/pytorch/tensorflow/pandas objects to python lists.
    It works recursively.

    If `optimize_list_casting` is True, To avoid iterating over possibly long lists, it first checks (recursively) if the first element that is not None or empty (if it is a sequence) has to be casted.
    If the first element needs to be casted, then all the elements of the list will be casted, otherwise they'll stay the same.
    This trick allows to cast objects that contain tokenizers outputs without iterating over every single token for example.

    Args:
        obj: the object (nested struct) to cast
        only_1d_for_numpy (bool, default ``False``): whether to keep the full multi-dim tensors as multi-dim numpy arrays, or convert them to
            nested lists of 1-dimensional numpy arrays. This can be useful to keep only 1-d arrays to instantiate Arrow arrays.
            Indeed Arrow only support converting 1-dimensional array values.
        optimize_list_casting (bool, default ``True``): whether to optimize list casting by checking the first non-null element to see if it needs to be casted
            and if it doesn't, not checking the rest of the list elements.

    Returns:
        casted_obj: the casted object
    """
    return _cast_to_python_objects(
        obj, only_1d_for_numpy=only_1d_for_numpy, optimize_list_casting=optimize_list_casting
    )[0]
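The first-element check described in the docstring can be illustrated with a small standalone sketch. This is not the library's implementation: `sketch_cast` and `_is_empty_sequence` are hypothetical names, the standard library's `array.array` stands in for a NumPy/PyTorch/TensorFlow array, and the `(value, changed)` return pair is only an assumption inferred from the `[0]` indexing in the wrapper above.

```python
from array import array
from typing import Any, Tuple


def _is_empty_sequence(item: Any) -> bool:
    # Hypothetical helper: empty lists/arrays are skipped when looking for a probe element.
    return isinstance(item, (list, array)) and len(item) == 0


def sketch_cast(obj: Any, optimize_list_casting: bool = True) -> Tuple[Any, bool]:
    """Simplified stand-in: returns (casted_obj, has_changed)."""
    # array.array plays the role of an array type that must become a Python list.
    if isinstance(obj, array):
        return obj.tolist(), True
    if isinstance(obj, dict):
        out, changed = {}, False
        for key, value in obj.items():
            out[key], value_changed = sketch_cast(value, optimize_list_casting)
            changed = changed or value_changed
        return (out if changed else obj), changed
    if isinstance(obj, list):
        # Probe: the first element that is neither None nor an empty sequence.
        probe = next(
            (x for x in obj if x is not None and not _is_empty_sequence(x)), None
        )
        if probe is None:
            return obj, False
        _, probe_changed = sketch_cast(probe, optimize_list_casting)
        if optimize_list_casting and not probe_changed:
            # The probe is already plain Python, so the rest of the list is not visited.
            return obj, False
        return [sketch_cast(x, optimize_list_casting)[0] for x in obj], True
    return obj, False


# Tokenizer-like output: the probe is a plain list, so the input comes back untouched.
tokens = [None, [], [101, 2023, 102], [101, 102]]
tokens_out, tokens_changed = sketch_cast(tokens)

# The probe is an array, so every element of the list is cast.
batch = {"ids": [None, array("i", [1, 2]), array("i", [3])]}
batch_out, batch_changed = sketch_cast(batch)

# Trade-off of the shortcut: an array hidden behind a plain first element is not
# cast unless the optimization is switched off.
mixed = [[1, 2], array("i", [3])]
mixed_fast, _ = sketch_cast(mixed, optimize_list_casting=True)
mixed_full, _ = sketch_cast(mixed, optimize_list_casting=False)
```

In this sketch the shortcut assumes a list is homogeneous: once the probe element turns out to be plain Python, the remaining elements are trusted rather than inspected, which is what keeps long token lists cheap to handle.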

Calls 1

_cast_to_python_objects · Function · 0.85