Method filter

python/ray/data/dataset.py:1551–1753

Filter out rows that don't satisfy the given predicate. You can use a function, a callable class, or an expression to perform the transformation. For functions, Ray Data uses stateless Ray tasks. For classes, Ray Data uses stateful Ray actors. For more information, see Stateful Transforms.

(
    self,
    fn: Optional[UserDefinedFunction[Dict[str, Any], bool]] = None,
    expr: Optional[Union[str, Expr]] = None,
    *,
    compute: Union[str, ComputeStrategy] = None,
    fn_args: Optional[Iterable[Any]] = None,
    fn_kwargs: Optional[Dict[str, Any]] = None,
    fn_constructor_args: Optional[Iterable[Any]] = None,
    fn_constructor_kwargs: Optional[Dict[str, Any]] = None,
    num_cpus: Optional[float] = None,
    num_gpus: Optional[float] = None,
    memory: Optional[float] = None,
    concurrency: Optional[Union[int, Tuple[int, int], Tuple[int, int, int]]] = None,
    ray_remote_args_fn: Optional[Callable[[], Dict[str, Any]]] = None,
    **ray_remote_args,
)
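
The three predicate forms named in the description (function, callable class, expression) might look roughly like the sketch below. It is an illustration under stated assumptions, not code from the Ray repository: `is_small`, `InRange`, and `filter_with_ray` are hypothetical names, and the use of `ray.data.ActorPoolStrategy(size=2)` for the class form is an assumption that does not appear in this excerpt. Ray is imported lazily so the predicates themselves can be checked without a cluster.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def is_small(row: Row) -> bool:
    """Plain function predicate; Ray Data would run it in stateless tasks."""
    return row["id"] <= 4


class InRange:
    """Callable-class predicate; Ray Data would run it in stateful actors."""

    def __init__(self, low: int, high: int) -> None:
        # Constructor state is built once per instance, not once per row.
        self.low = low
        self.high = high

    def __call__(self, row: Row) -> bool:
        return self.low < row["id"] < self.high


def filter_with_ray() -> Dict[str, List[Row]]:
    """Apply all three predicate forms to a small dataset (requires Ray)."""
    import ray  # lazy import: only needed when this function is called
    from ray.data.expressions import col

    ds = ray.data.range(100)
    return {
        "function": ds.filter(is_small).take_all(),
        "class": ds.filter(
            InRange,
            fn_constructor_args=(10, 20),
            # Assumption: an actor pool of size 2 is acceptable here.
            compute=ray.data.ActorPoolStrategy(size=2),
        ).take_all(),
        "expression": ds.filter(
            expr=(col("id") > 10) & (col("id") < 20)
        ).take_all(),
    }


# The predicates are ordinary callables, so they can be exercised directly.
rows = [{"id": i} for i in range(100)]
small_ids = [r["id"] for r in rows if is_small(r)]
in_range = InRange(10, 20)
ranged_ids = [r["id"] for r in rows if in_range(r)]
```

Calling `filter_with_ray()` needs a working Ray installation. The module-level checks at the bottom only exercise the predicates in plain Python.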

Source from the content-addressed store, hash-verified

@PublicAPI(api_group=BT_API_GROUP)
def filter(
    self,
    fn: Optional[UserDefinedFunction[Dict[str, Any], bool]] = None,
    expr: Optional[Union[str, Expr]] = None,
    *,
    compute: Union[str, ComputeStrategy] = None,
    fn_args: Optional[Iterable[Any]] = None,
    fn_kwargs: Optional[Dict[str, Any]] = None,
    fn_constructor_args: Optional[Iterable[Any]] = None,
    fn_constructor_kwargs: Optional[Dict[str, Any]] = None,
    num_cpus: Optional[float] = None,
    num_gpus: Optional[float] = None,
    memory: Optional[float] = None,
    concurrency: Optional[Union[int, Tuple[int, int], Tuple[int, int, int]]] = None,
    ray_remote_args_fn: Optional[Callable[[], Dict[str, Any]]] = None,
    **ray_remote_args,
) -> "Dataset":
    """Filter out rows that don't satisfy the given predicate.

    You can use either a function or a callable class or an expression to
    perform the transformation.
    For functions, Ray Data uses stateless Ray tasks. For classes, Ray Data uses
    stateful Ray actors. For more information, see
    :ref:`Stateful Transforms <stateful_transforms>`.

    .. tip::
        If you use the `expr` parameter with a predicate expression, Ray Data
        optimizes your filter with native Arrow interfaces.

    .. deprecated::
        String expressions are deprecated and will be removed in a future version.
        Use predicate expressions from `ray.data.expressions` instead.

    Examples:

        >>> import ray
        >>> from ray.data.expressions import col
        >>> ds = ray.data.range(100)
        >>> # String expressions (deprecated - will warn)
        >>> ds.filter(expr="id <= 4").take_all()
        [{'id': 0}, {'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]
        >>> # Using predicate expressions (preferred)
        >>> ds.filter(expr=(col("id") > 10) & (col("id") < 20)).take_all()
        [{'id': 11}, {'id': 12}, {'id': 13}, {'id': 14}, {'id': 15}, {'id': 16}, {'id': 17}, {'id': 18}, {'id': 19}]

    Time complexity: O(dataset size / parallelism)

    Args:
        fn: The predicate to apply to each row, or a class type
            that can be instantiated to create such a callable.
        expr: An expression that represents a predicate (boolean condition) for filtering.
            Can be either a string expression (deprecated) or a predicate expression
            from `ray.data.expressions`.
        compute: The compute strategy to use for the map operation.

            * If ``compute`` is not specified for a function, will use ``ray.data.TaskPoolStrategy()`` to launch concurrent tasks based on the available resources and number of input blocks.

            * Use ``ray.data.TaskPoolStrategy(size=n)`` to launch at most ``n`` concurrent Ray tasks.
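
As a rough illustration of the `compute` and `fn_kwargs` parameters documented in the listing above, the sketch below caps a function-based filter at a fixed number of concurrent tasks. The names `at_most` and `filter_with_task_pool` are hypothetical, and the pool size of 2 is an arbitrary choice for the example. Only `TaskPoolStrategy(size=n)` and the `filter` parameters shown in the signature are taken from the source.

```python
from typing import Any, Dict, List


def at_most(row: Dict[str, Any], threshold: int) -> bool:
    """Row predicate whose extra argument is supplied through fn_kwargs."""
    return row["id"] <= threshold


def filter_with_task_pool(max_tasks: int = 2) -> List[Dict[str, Any]]:
    """Filter with at most `max_tasks` concurrent Ray tasks (requires Ray)."""
    import ray  # lazy import: only needed when this function is called

    ds = ray.data.range(100)
    return ds.filter(
        at_most,
        fn_kwargs={"threshold": 4},
        compute=ray.data.TaskPoolStrategy(size=max_tasks),
    ).take_all()


# Plain-Python check of the predicate, independent of Ray.
sample = [{"id": i} for i in range(8)]
kept = [row for row in sample if at_most(row, threshold=4)]
```

`filter_with_task_pool()` is defined but not called here, since it needs Ray installed. The final lines only confirm the predicate's behavior on a small in-memory sample.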

Callers 15

_get_vpc_id_of_sg · Function · 0.45
_get_subnets_or_die · Function · 0.45
_get_security_groups · Function · 0.45
_get_key · Function · 0.45
non_terminated_nodes · Method · 0.45
create_node · Method · 0.45
_get_node · Method · 0.45
d · Function · 0.45
l · Function · 0.45

Calls 7

get_compute_strategy · Function · 0.90
Filter · Class · 0.90
LogicalPlan · Class · 0.90
_from_parent · Method · 0.80
sum · Function · 0.50

Tested by

no test coverage detected