hub / github.com/ray-project/ray / range

Function range

python/ray/data/read_api.py:258–312

Creates a :class:`~ray.data.Dataset` from a range of integers [0..n). This function allows for easy creation of synthetic datasets for testing or benchmarking :ref:`Ray Data <data>`. The column name defaults to "id".

(
    n: int,
    *,
    parallelism: int = -1,
    concurrency: Optional[int] = None,
    override_num_blocks: Optional[int] = None,
)

Source from the content-addressed store, hash-verified

@PublicAPI
def range(
    n: int,
    *,
    parallelism: int = -1,
    concurrency: Optional[int] = None,
    override_num_blocks: Optional[int] = None,
) -> Dataset:
    """Creates a :class:`~ray.data.Dataset` from a range of integers [0..n).

    This function allows for easy creation of synthetic datasets for testing or
    benchmarking :ref:`Ray Data <data>`. The column name defaults to "id".

    Examples:

        >>> import ray
        >>> ds = ray.data.range(10000)
        >>> ds  # doctest: +ELLIPSIS
        shape: (10000, 1)
        ╭───────╮
        │ id    │
        │ ---   │
        │ int64 │
        ╰───────╯
        (Dataset isn't materialized)
        >>> ds.map(lambda row: {"id": row["id"] * 2}).take(4)
        [{'id': 0}, {'id': 2}, {'id': 4}, {'id': 6}]

    Args:
        n: The upper bound of the range of integers.
        parallelism: This argument is deprecated. Use ``override_num_blocks`` argument.
        concurrency: The maximum number of Ray tasks to run concurrently. Set this
            to control number of tasks to run concurrently. This doesn't change the
            total number of tasks run or the total number of output blocks. By default,
            concurrency is dynamically decided based on the available resources.
        override_num_blocks: Override the number of output blocks from all read tasks.
            By default, the number of output blocks is dynamically decided based on
            input data size and available resources. You shouldn't manually set this
            value in most cases.

    Returns:
        A :class:`~ray.data.Dataset` producing the integers from the range 0 to n.

    .. seealso::

        :meth:`~ray.data.range_tensor`
                    Call this method for creating synthetic datasets of tensor data.

    """
    datasource = RangeDatasource(n=n, block_format="arrow", column_name="id")
    return read_datasource(
        datasource,
        parallelism=parallelism,
        concurrency=concurrency,
        override_num_blocks=override_num_blocks,
    )
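The docstring describes a dataset of rows with a single "id" column covering [0..n), divided into some number of output blocks. The plain-Python sketch below models that idea without Ray, to make the row shape and the meaning of a block count concrete. It is an illustration only: `split_range_into_blocks` is a hypothetical helper, and the even contiguous split is an assumption, not a description of how `RangeDatasource` or `read_datasource` actually partition data.

```python
from typing import Dict, List

Row = Dict[str, int]


def split_range_into_blocks(n: int, num_blocks: int) -> List[List[Row]]:
    """Partition the integers [0, n) into `num_blocks` contiguous blocks.

    Hypothetical helper for illustration; it is not part of Ray.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")

    # Spread any remainder over the first `extra` blocks so sizes differ by at most 1.
    base, extra = divmod(n, num_blocks)
    blocks: List[List[Row]] = []
    start = 0
    for i in range(num_blocks):
        size = base + (1 if i < extra else 0)
        # Each row is a dict with the single column "id", as in the docstring.
        blocks.append([{"id": value} for value in range(start, start + size)])
        start += size
    return blocks


blocks = split_range_into_blocks(10, 3)
block_sizes = [len(block) for block in blocks]

# Flatten the blocks back into rows and mimic the docstring's map(...).take(4).
rows = [row for block in blocks for row in block]
doubled = [{"id": row["id"] * 2} for row in rows][:4]

print(block_sizes)  # [4, 3, 3]
print(doubled)      # [{'id': 0}, {'id': 2}, {'id': 4}, {'id': 6}]
```

In this model the block count only changes how the rows are grouped; the rows themselves, and therefore the result of the map, stay the same. That mirrors the docstring's note that `concurrency` limits how many tasks run at once without changing the total number of output blocks.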

Callers 15

__init__ · Method · 0.70
split · Method · 0.70
split_proportionately · Method · 0.70
_format_stats · Function · 0.70
_bind · Method · 0.50
call_with_retry · Function · 0.50
recover_args · Function · 0.50
get_rayllm_testing_model · Function · 0.50
testing_multiple_models · Function · 0.50
test_sort_bundles · Function · 0.50

Calls 2

RangeDatasource · Class · 0.90
read_datasource · Function · 0.85
