
Function from_items

python/ray/data/read_api.py:166–254

Create a :class:`~ray.data.Dataset` from a list of local Python objects. Use this method to create small datasets from data that fits in memory. The column name defaults to "item".
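To illustrate the default column name, here is a minimal pure-Python sketch of how a bare value could be wrapped into a one-column row keyed by "item". The helper `to_row` is hypothetical and not part of Ray. The sketch also assumes that mapping-like items keep their own keys as columns, which goes beyond what the description above states.

```python
from collections.abc import Mapping
from typing import Any, Dict


def to_row(item: Any) -> Dict[str, Any]:
    """Hypothetical helper: turn one Python object into a row dict."""
    # Assumption: mapping-like items already name their own columns.
    if isinstance(item, Mapping):
        return dict(item)
    # Anything else lands in the default "item" column.
    return {"item": item}


rows = [to_row(x) for x in [1, 2, 3]]
dict_rows = [to_row(x) for x in [{"a": 1}, {"a": 2}]]
```

Under these assumptions, plain values end up in a single "item" column, while dictionaries contribute one column per key.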

(
    items: List[Any],
    *,
    parallelism: int = -1,
    override_num_blocks: Optional[int] = None,
)
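The signature carries both a deprecated `parallelism` argument and its replacement `override_num_blocks`. The following is a rough sketch of how two such arguments might be reconciled into a single value. The function `resolve_num_blocks` is hypothetical, and the precedence rule (the deprecated argument wins when passed explicitly) is an assumption, not a statement about Ray's `_get_num_output_blocks`.

```python
import warnings
from typing import Optional


def resolve_num_blocks(
    parallelism: int = -1, override_num_blocks: Optional[int] = None
) -> int:
    """Hypothetical resolver; -1 means "decide automatically"."""
    if parallelism != -1:
        # Assumption: the deprecated argument still wins, but emits a warning.
        warnings.warn(
            "parallelism is deprecated; use override_num_blocks",
            DeprecationWarning,
            stacklevel=2,
        )
        requested = parallelism
    elif override_num_blocks is not None:
        requested = override_num_blocks
    else:
        requested = -1
    # Mirrors the zero check in the source listing: 0 blocks is never valid.
    if requested == 0:
        raise ValueError(f"parallelism must be -1 or > 0, got: {requested}")
    return requested
```

Keeping -1 as the "auto" sentinel lets callers omit both arguments and leave the block count to be detected later.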

Source from the content-addressed store, hash-verified

@PublicAPI
def from_items(
    items: List[Any],
    *,
    parallelism: int = -1,
    override_num_blocks: Optional[int] = None,
) -> MaterializedDataset:
    """Create a :class:`~ray.data.Dataset` from a list of local Python objects.

    Use this method to create small datasets from data that fits in memory. The column
    name defaults to "item".

    Examples:

        >>> import ray
        >>> ds = ray.data.from_items([1, 2, 3, 4, 5])
        >>> ds  # doctest: +ELLIPSIS
        shape: (5, 1)
        ╭───────╮
        │ item  │
        │ ---   │
        │ int64 │
        ╞═══════╡
        │ 1     │
        │ 2     │
        │ 3     │
        │ 4     │
        │ 5     │
        ╰───────╯
        (Showing 5 of 5 rows)
        >>> ds.schema()
        Column  Type
        ------  ----
        item    int64

    Args:
        items: List of local Python objects.
        parallelism: This argument is deprecated. Use ``override_num_blocks`` argument.
        override_num_blocks: Override the number of output blocks from all read tasks.
            By default, the number of output blocks is dynamically decided based on
            input data size and available resources. You shouldn't manually set this
            value in most cases.

    Returns:
        A :class:`~ray.data.Dataset` holding the items.
    """
    import builtins

    parallelism = _get_num_output_blocks(parallelism, override_num_blocks)
    if parallelism == 0:
        raise ValueError(f"parallelism must be -1 or > 0, got: {parallelism}")

    detected_parallelism, _, _ = _autodetect_parallelism(
        parallelism,
        ray.util.get_current_placement_group(),
        DataContext.get_current(),
    )
    # Truncate parallelism to number of items to avoid empty blocks.
    detected_parallelism = min(len(items), detected_parallelism)
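The listing ends by truncating the block count to the number of items so that no block is empty. As a minimal sketch of what that enables, the following pure-Python helper splits a list into contiguous, near-equal blocks. `split_into_blocks` and its remainder policy are illustrative assumptions, not Ray's code, and the real function builds Ray blocks rather than plain lists.

```python
from typing import Any, List


def split_into_blocks(items: List[Any], requested_blocks: int) -> List[List[Any]]:
    """Hypothetical helper: split items into contiguous, near-equal blocks."""
    # Truncate to the number of items so that no block is empty.
    num_blocks = min(len(items), requested_blocks)
    if num_blocks <= 0:
        return []
    size, remainder = divmod(len(items), num_blocks)
    blocks = []
    start = 0
    for i in range(num_blocks):
        # The first `remainder` blocks each take one extra item.
        end = start + size + (1 if i < remainder else 0)
        blocks.append(items[start:end])
        start = end
    return blocks


blocks = split_into_blocks([1, 2, 3, 4, 5], 8)
uneven = split_into_blocks([1, 2, 3, 4, 5], 2)
```

Requesting 8 blocks for 5 items yields 5 single-item blocks, and requesting 2 yields blocks of 3 and 2 items, with the original order preserved in both cases.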

Callers 1

from_tf · Function · 0.70

Calls 15

add · Method · 0.95
build · Method · 0.95
_autodetect_parallelism · Function · 0.90
FromItems · Class · 0.90
DatasetStats · Class · 0.90
LogicalPlan · Class · 0.90
MaterializedDataset · Class · 0.90
_get_num_output_blocks · Function · 0.85
from_block · Method · 0.80
put · Method · 0.65
copy · Method · 0.65

Tested by

no test coverage detected
