
Function read_audio

python/ray/data/read_api.py:724–821

Creates a :class:`~ray.data.Dataset` from audio files. The column names default to "amplitude" and "sample_rate".
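To make the two default columns concrete, the sketch below works out a clip's length from one row's values. It assumes `amplitude` is laid out as (channels, samples), which is how the `shape=(1, 191760)` entry in the function's docstring schema reads, and it uses 16 kHz purely as an illustrative sample rate. The helper name `clip_duration_seconds` is hypothetical and not part of Ray.

```python
from typing import Sequence


def clip_duration_seconds(amplitude_shape: Sequence[int], sample_rate: int) -> float:
    """Return clip length in seconds for a (channels, samples) amplitude shape."""
    if len(amplitude_shape) != 2:
        raise ValueError("expected a (channels, samples) shape")
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    # Assumption: the second axis counts samples per channel.
    _channels, num_samples = amplitude_shape
    return num_samples / sample_rate


# Shape copied from the docstring's schema; 16_000 Hz is an assumed rate.
duration = clip_duration_seconds((1, 191760), 16_000)
```

In practice the second argument would come from the row's own `sample_rate` value rather than a constant.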

(
    paths: Union[str, List[str]],
    *,
    filesystem: Optional["pyarrow.fs.FileSystem"] = None,
    arrow_open_stream_args: Optional[Dict[str, Any]] = None,
    partition_filter: Optional[PathPartitionFilter] = None,
    partitioning: Optional[Partitioning] = None,
    include_paths: bool = False,
    ignore_missing_paths: bool = False,
    file_extensions: Optional[List[str]] = AudioDatasource._FILE_EXTENSIONS,
    shuffle: Union[Literal["files"], None] = None,
    concurrency: Optional[int] = None,
    override_num_blocks: Optional[int] = None,
    num_cpus: Optional[float] = None,
    num_gpus: Optional[float] = None,
    memory: Optional[float] = None,
    ray_remote_args: Optional[Dict[str, Any]] = None,
)
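Everything after the bare `*` in the signature is keyword-only, so options are always passed by name. The following is a minimal sketch of assembling a few of them. The helper names `build_read_audio_kwargs` and `read_audio_dir` are hypothetical, and normalizing extensions to lowercase without a leading dot is an assumption about what `file_extensions` expects, not something the signature states.

```python
from typing import Any, Dict, List, Union


def build_read_audio_kwargs(extensions: List[str]) -> Dict[str, Any]:
    """Collect keyword arguments that appear in the read_audio signature."""
    return {
        "include_paths": True,          # adds a 'path' column per the docstring
        "ignore_missing_paths": True,   # skip paths that are not found
        # Assumption: extensions are given as bare lowercase suffixes.
        "file_extensions": [ext.lower().lstrip(".") for ext in extensions],
        "shuffle": "files",             # the only non-None value the type allows
    }


def read_audio_dir(paths: Union[str, List[str]], extensions: List[str]):
    """Call ray.data.read_audio with the options above (requires Ray)."""
    # Imported lazily so this sketch can be loaded without Ray installed.
    import ray

    return ray.data.read_audio(paths, **build_read_audio_kwargs(extensions))


kwargs = build_read_audio_kwargs([".FLAC", "wav"])
```

Keeping the option dictionary separate from the call makes it easy to inspect or log the settings before any files are read.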

Source from the content-addressed store, hash-verified

@PublicAPI(stability="alpha")
def read_audio(
    paths: Union[str, List[str]],
    *,
    filesystem: Optional["pyarrow.fs.FileSystem"] = None,
    arrow_open_stream_args: Optional[Dict[str, Any]] = None,
    partition_filter: Optional[PathPartitionFilter] = None,
    partitioning: Optional[Partitioning] = None,
    include_paths: bool = False,
    ignore_missing_paths: bool = False,
    file_extensions: Optional[List[str]] = AudioDatasource._FILE_EXTENSIONS,
    shuffle: Union[Literal["files"], None] = None,
    concurrency: Optional[int] = None,
    override_num_blocks: Optional[int] = None,
    num_cpus: Optional[float] = None,
    num_gpus: Optional[float] = None,
    memory: Optional[float] = None,
    ray_remote_args: Optional[Dict[str, Any]] = None,
):
    """Creates a :class:`~ray.data.Dataset` from audio files.

    The column names default to "amplitude" and "sample_rate".

    Examples:
        >>> import ray
        >>> path = "s3://anonymous@air-example-data-2/6G-audio-data-LibriSpeech-train-clean-100-flac/train-clean-100/5022/29411/5022-29411-0000.flac"
        >>> ds = ray.data.read_audio(path)
        >>> ds.schema()
        Column       Type
        ------       ----
        amplitude    ArrowTensorTypeV2(shape=(1, 191760), dtype=float)
        sample_rate  int64

    Args:
        paths: A single file or directory, or a list of file or directory paths.
            A list of paths can contain both files and directories.
        filesystem: The pyarrow filesystem
            implementation to read from. These filesystems are specified in the
            `pyarrow docs <https://arrow.apache.org/docs/python/api/\
            filesystems.html#filesystem-implementations>`_. Specify this parameter if
            you need to provide specific configurations to the filesystem. By default,
            the filesystem is automatically selected based on the scheme of the paths.
            For example, if the path begins with ``s3://``, the `S3FileSystem` is used.
        arrow_open_stream_args: kwargs passed to
            `pyarrow.fs.FileSystem.open_input_file <https://arrow.apache.org/docs/\
                python/generated/pyarrow.fs.FileSystem.html\
                    #pyarrow.fs.FileSystem.open_input_file>`_
            when opening input files to read.
        partition_filter: A
            :class:`~ray.data.datasource.partitioning.PathPartitionFilter`. Use
            with a custom callback to read only selected partitions of a dataset.
        partitioning: A :class:`~ray.data.datasource.partitioning.Partitioning` object
            that describes how paths are organized. Defaults to ``None``.
        include_paths: If ``True``, include the path to each audio file. File paths are
            stored in the ``'path'`` column.
        ignore_missing_paths: If True, ignores any file/directory paths in ``paths``
            that are not found. Defaults to False.
        file_extensions: A list of file extensions to filter files by.
        shuffle: If ``"files"``, randomly shuffle input files order before read.
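The `file_extensions` and `shuffle` arguments both act on the list of input files before anything is read. The sketch below illustrates that idea in plain Python. It is not Ray's implementation: the function name `select_files`, the case-insensitive suffix match, and the `seed` parameter are all assumptions made for illustration.

```python
import random
from typing import List, Optional


def select_files(
    paths: List[str],
    file_extensions: Optional[List[str]] = None,
    shuffle: Optional[str] = None,
    seed: Optional[int] = None,
) -> List[str]:
    """Filter paths by extension, then optionally shuffle their order."""
    if file_extensions is None:
        selected = list(paths)  # no filter: keep every path
    else:
        # Assumption: matching ignores case and extensions carry no leading dot.
        suffixes = tuple("." + ext.lower().lstrip(".") for ext in file_extensions)
        selected = [p for p in paths if p.lower().endswith(suffixes)]
    if shuffle == "files":
        # A seeded Random keeps the illustration reproducible.
        random.Random(seed).shuffle(selected)
    return selected


candidates = ["a.flac", "notes.txt", "B.WAV", "c.mp3"]
kept = select_files(candidates, file_extensions=["flac", "wav"])
shuffled = select_files(candidates, shuffle="files", seed=0)
```

Filtering first and shuffling second means the shuffle only permutes files that will actually be read.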

Callers

nothing calls this directly

Calls 2

AudioDatasource (Class) · 0.90
read_datasource (Function) · 0.85

Tested by

no test coverage detected
