hub / github.com/pathwaycom/pathway / join_outer

Function join_outer

python/pathway/internals/joins.py:1482–1566

Outer-joins two tables or join results.

(
    left: Joinable,
    right: Joinable,
    *on: expr.ColumnExpression,
    id: expr.ColumnReference | None = None,
    left_instance: expr.ColumnReference | None = None,
    right_instance: expr.ColumnReference | None = None,
    left_exactly_once: bool = False,
    right_exactly_once: bool = False,
)
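To make the outer-join semantics behind this signature concrete, here is a minimal pure-Python sketch. It is not Pathway's implementation: the helper `outer_join` and the right-hand rows are hypothetical, chosen only for illustration, while the left-hand rows reuse the `a`/`b` values from the docstring example. Matched rows are paired as in an inner join, and unmatched rows from either side are kept with `None` standing in for the missing side.

```python
from collections import defaultdict


def outer_join(left_rows, right_rows, left_key, right_key):
    """Full outer join of two lists of dicts on left_key == right_key.

    Hypothetical helper for illustration only, not part of Pathway.
    Returns a list of (left_row_or_None, right_row_or_None) pairs.
    """
    # Index right rows by join key so each left row can find its partners.
    right_index = defaultdict(list)
    for pos, row in enumerate(right_rows):
        right_index[row[right_key]].append(pos)

    matched_right = set()  # positions of right rows that found a partner
    pairs = []
    for lrow in left_rows:
        positions = right_index.get(lrow[left_key], [])
        if positions:
            # Matched rows behave exactly as in an inner join.
            for pos in positions:
                matched_right.add(pos)
                pairs.append((lrow, right_rows[pos]))
        else:
            # Unmatched left row: the right side is filled with None.
            pairs.append((lrow, None))

    # Unmatched right rows: the left side is filled with None.
    for pos, rrow in enumerate(right_rows):
        if pos not in matched_right:
            pairs.append((None, rrow))
    return pairs


# Left rows mirror t1 from the docstring; right rows are made up.
left = [
    {"a": 11, "b": 111},
    {"a": 12, "b": 112},
    {"a": 13, "b": 113},
    {"a": 13, "b": 114},
]
right = [
    {"c": 11, "d": 211},
    {"c": 12, "d": 212},
    {"c": 14, "d": 213},
]

pairs = outer_join(left, right, "a", "c")
# Project to (a, c), similar in spirit to calling .select() on a join result.
selected = [(l["a"] if l else None, r["c"] if r else None) for l, r in pairs]
```

Here `selected` holds two matched pairs, two left-only rows whose `c` is `None`, and one right-only row whose `a` is `None`.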

Source from the content-addressed store, hash-verified

def join_outer(
    left: Joinable,
    right: Joinable,
    *on: expr.ColumnExpression,
    id: expr.ColumnReference | None = None,
    left_instance: expr.ColumnReference | None = None,
    right_instance: expr.ColumnReference | None = None,
    left_exactly_once: bool = False,
    right_exactly_once: bool = False,
) -> JoinResult:
    """Outer-joins two tables or join results.

    Args:
        self: the left side of the join, ``Table`` or ``JoinResult``.
        other: the right side of the join, ``Table`` or ``JoinResult``.
        *on: Columns to join, syntax `self.col1 == other.col2`
        id: optional id column of the result
        left_instance/right_instance: optional arguments describing partitioning of the data into separate
            instances
        left_exactly_once: if you can guarantee that each row on the left side of the join will be
            joined at most once, then you can set this parameter to ``True``. Then each row after
            getting a match is removed from the join state. As a result, less memory is needed.
            Works only for append-only tables.
        right_exactly_once: if you can guarantee that each row on the right side of the join will be
            joined at most once, then you can set this parameter to ``True``. Then each row after
            getting a match is removed from the join state. As a result, less memory is needed.
            Works only for append-only tables.

    Remarks: args cannot contain id column from either of tables, \
    as the result table has id column with auto-generated ids; \
    it can be selected by assigning it to a column with defined \
    name (passed in kwargs)

    Behavior:
    - for rows from the left side that were not matched with the right side,
      missing values on the right are replaced with `None`
    - for rows from the right side that were not matched with the left side,
      missing values on the left are replaced with `None`
    - for rows that were matched the behavior is the same as that of an inner join.

    Returns:
        JoinResult: an object on which `.select()` may be called to extract relevant
        columns from the result of the join.

    Example:

    >>> import pathway as pw
    >>> t1 = pw.debug.table_from_markdown(
    ...     '''
    ...        | a  | b
    ...      1 | 11 | 111
    ...      2 | 12 | 112
    ...      3 | 13 | 113
    ...      4 | 13 | 114
    ...     '''
    ... )
    >>> t2 = pw.debug.table_from_markdown(
    ...     '''

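The docstring's description of `left_exactly_once` / `right_exactly_once` (a matched row is removed from the join state, so less memory is needed) can be illustrated with a toy streaming join. This is a sketch under stated assumptions, not Pathway's engine: the class `ToyJoinState` and its methods are hypothetical, it handles append-only input only, and it tracks just the buffered state and the matched pairs, not the `None`-padded rows of the outer join.

```python
from collections import defaultdict


class ToyJoinState:
    """Toy symmetric hash join over append-only streams.

    Hypothetical illustration only; it does not mirror Pathway internals.
    """

    def __init__(self, left_exactly_once=False, right_exactly_once=False):
        self.left_exactly_once = left_exactly_once
        self.right_exactly_once = right_exactly_once
        self.left_state = defaultdict(list)   # key -> buffered left rows
        self.right_state = defaultdict(list)  # key -> buffered right rows

    def add_left(self, key, row):
        return self._add(key, row, self.left_state, self.right_state,
                         self.left_exactly_once, self.right_exactly_once, True)

    def add_right(self, key, row):
        return self._add(key, row, self.right_state, self.left_state,
                         self.right_exactly_once, self.left_exactly_once, False)

    def _add(self, key, row, own, other, own_once, other_once, is_left):
        partners = other.get(key, [])
        # Always emit pairs as (left_row, right_row).
        matches = [(row, p) if is_left else (p, row) for p in partners]
        if matches and other_once:
            # Partners on an "exactly once" side have had their match: drop them.
            other.pop(key, None)
        if not (matches and own_once):
            # Buffer the new row unless it is "exactly once" and already matched.
            own[key].append(row)
        return matches

    def buffered_rows(self):
        left = sum(len(rows) for rows in self.left_state.values())
        right = sum(len(rows) for rows in self.right_state.values())
        return left + right


# Without the flags, both rows stay buffered after they match.
plain = ToyJoinState()
plain.add_left(1, "L1")
plain_matches = plain.add_right(1, "R1")

# With both flags set, matched rows are evicted from the state.
once = ToyJoinState(left_exactly_once=True, right_exactly_once=True)
once.add_left(1, "L1")
once_matches = once.add_right(1, "R1")
```

Both variants emit the same matched pair, but `plain.buffered_rows()` stays at 2 while `once.buffered_rows()` drops to 0. The eviction is only safe when each row really does match at most once and rows are never retracted, which is consistent with the docstring's note that the flags work only for append-only tables.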
Callers

nothing calls this directly

Calls 1

join_outer · Method · 0.80

Tested by

no test coverage detected