
@PublicAPI
def from_items(
    items: List[Any],
    *,
    parallelism: int = -1,
    override_num_blocks: Optional[int] = None,
) -> MaterializedDataset:
    """Create a :class:`~ray.data.Dataset` from a list of local Python objects.

    Use this method to create small datasets from data that fits in memory. The column
    name defaults to "item".

    Examples:

        >>> import ray
        >>> ds = ray.data.from_items([1, 2, 3, 4, 5])
        >>> ds  # doctest: +ELLIPSIS
        shape: (5, 1)
        ╭───────╮
        │ item  │
        │ ---   │
        │ int64 │
        ╞═══════╡
        │ 1     │
        │ 2     │
        │ 3     │
        │ 4     │
        │ 5     │
        ╰───────╯
        (Showing 5 of 5 rows)
        >>> ds.schema()
        Column  Type
        ------  ----
        item    int64

    Args:
        items: List of local Python objects.
        parallelism: This argument is deprecated. Use ``override_num_blocks`` argument.
        override_num_blocks: Override the number of output blocks from all read tasks.
            By default, the number of output blocks is dynamically decided based on
            input data size and available resources. You shouldn't manually set this
            value in most cases.

    Returns:
        A :class:`~ray.data.Dataset` holding the items.
    """
    import builtins

    parallelism = _get_num_output_blocks(parallelism, override_num_blocks)
    if parallelism == 0:
        raise ValueError(f"parallelism must be -1 or > 0, got: {parallelism}")

    detected_parallelism, _, _ = _autodetect_parallelism(
        parallelism,
        ray.util.get_current_placement_group(),
        DataContext.get_current(),
    )
    # Truncate parallelism to number of items to avoid empty blocks.
    detected_parallelism = min(len(items), detected_parallelism)
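
The truncation step at the end of the excerpt caps the block count at the number of items so that no block ends up empty. The sketch below illustrates that idea in plain Python, without Ray. The helper `split_into_blocks` and its `requested_blocks` parameter are hypothetical names introduced for illustration, and the even-split strategy is an assumption, not necessarily what `from_items` does in the part of the function not shown here.

```python
from typing import Any, List


def split_into_blocks(items: List[Any], requested_blocks: int) -> List[List[Any]]:
    """Hypothetical helper: split ``items`` into at most ``requested_blocks`` non-empty chunks."""
    if requested_blocks <= 0:
        raise ValueError(f"requested_blocks must be > 0, got: {requested_blocks}")

    # Same truncation idea as in the excerpt: never use more blocks than items.
    num_blocks = min(len(items), requested_blocks)
    if num_blocks == 0:
        # Empty input: this sketch simply returns no blocks.
        return []

    # Assumed strategy: spread the remainder over the first blocks so that
    # block sizes differ by at most one.
    block_size, remainder = divmod(len(items), num_blocks)
    blocks: List[List[Any]] = []
    start = 0
    for i in range(num_blocks):
        end = start + block_size + (1 if i < remainder else 0)
        blocks.append(items[start:end])
        start = end
    return blocks


if __name__ == "__main__":
    # Requesting 8 blocks for 5 items is truncated to 5 single-item blocks.
    print(split_into_blocks([1, 2, 3, 4, 5], 8))
    # Requesting 2 blocks yields blocks of sizes 3 and 2.
    print(split_into_blocks([1, 2, 3, 4, 5], 2))
```

Without the `min(...)` cap, a request for 8 blocks over 5 items would leave 3 blocks with nothing in them, which is the case the comment in the excerpt is guarding against.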