hub / github.com/pathwaycom/pathway / deduplicate

Function deduplicate

python/pathway/stdlib/stateful/deduplicate.py:9–31

Deduplicates rows in `table` on the `col` column using an acceptor function. It keeps the rows that were accepted by the acceptor function. The acceptor takes two arguments: the current value and the previously accepted value.

(
    table: pw.Table[TSchema],
    *,
    col: pw.ColumnReference,
    instance: pw.ColumnExpression | None = None,
    acceptor: Callable[[T, T], bool],
)
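To make the acceptor contract concrete, here is a minimal pure-Python sketch of the idea. It does not use Pathway and is not its implementation: `accepted_values` and `changed_by_at_least_two` are hypothetical names introduced for illustration, and the sketch assumes the first value is accepted unconditionally because it has no predecessor.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def accepted_values(values: Iterable[T], acceptor: Callable[[T, T], bool]) -> List[T]:
    """Hypothetical helper: replay `values` in order and collect the accepted ones."""
    accepted: List[T] = []
    for current in values:
        # Assumption: the first value has no predecessor, so it is accepted as-is.
        # Later values are compared against the most recently accepted value.
        if not accepted or acceptor(current, accepted[-1]):
            accepted.append(current)
    return accepted


def changed_by_at_least_two(current: int, previous: int) -> bool:
    # Accept a value only if it differs from the last accepted one by 2 or more.
    return abs(current - previous) >= 2


readings = [1, 2, 3, 3, 5, 4, 8]
print(accepted_values(readings, changed_by_at_least_two))  # prints [1, 3, 5, 8]
```

Note that each comparison is made against the last accepted value, not the last seen value: 3 is accepted because it differs from 1 by two, even though it differs from the rejected 2 by only one.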

Source from the content-addressed store, hash-verified

def deduplicate(
    table: pw.Table[TSchema],
    *,
    col: pw.ColumnReference,
    instance: pw.ColumnExpression | None = None,
    acceptor: Callable[[T, T], bool],
) -> pw.Table[TSchema]:
    """Deduplicates rows in `table` on `col` column using acceptor function.

    It keeps rows that were accepted by the acceptor function.
    Acceptor operates on two arguments - current value and the previous accepted value.

    Args:
        table (pw.Table[TSchema]): table to deduplicate
        col (pw.ColumnReference): column used for deduplication
        acceptor (Callable[[T, T], bool]): callback telling whether two values are different
        instance (pw.ColumnExpression, optional): Group column for which deduplication will be performed separately.
            Defaults to None.

    Returns:
        pw.Table[TSchema]:
    """
    return table.deduplicate(value=col, instance=instance, acceptor=acceptor)
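The docstring says that `instance` names a group column for which deduplication is performed separately. One way to picture this is a separate "previously accepted value" per group. The sketch below is plain Python under that assumption, not Pathway code: `accepted_per_instance` and the `(instance, value)` row layout are hypothetical, and a group's first row is assumed to be accepted unconditionally.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
Row = Tuple[Hashable, T]  # hypothetical layout: (instance key, value)


def accepted_per_instance(
    rows: Iterable[Row], acceptor: Callable[[T, T], bool]
) -> List[Row]:
    """Hypothetical helper: deduplicate each instance independently."""
    last_accepted: Dict[Hashable, T] = {}
    kept: List[Row] = []
    for instance, value in rows:
        # Assumption: the first row seen for an instance is always accepted.
        # Otherwise the acceptor compares against that instance's own history.
        if instance not in last_accepted or acceptor(value, last_accepted[instance]):
            last_accepted[instance] = value
            kept.append((instance, value))
    return kept


rows = [("a", 10), ("b", 10), ("a", 10), ("b", 12), ("a", 11)]
kept = accepted_per_instance(rows, lambda current, previous: current != previous)
print(kept)  # prints [('a', 10), ('b', 10), ('b', 12), ('a', 11)]
```

The second `("a", 10)` row is dropped because it repeats the last accepted value for instance `a`, while `("b", 10)` is kept because instance `b` has its own, still empty, history.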

Callers

nothing calls this directly

Calls 1

deduplicate (Method) · 0.80

Tested by

no test coverage detected