hub / github.com/pathwaycom/pathway / join_outer

Method join_outer

python/pathway/internals/joins.py:457–541

Outer-joins two tables or join results.

(
        self,
        other: Joinable,
        *on: expr.ColumnExpression,
        id: expr.ColumnReference | None = None,
        left_instance: expr.ColumnReference | None = None,
        right_instance: expr.ColumnReference | None = None,
        left_exactly_once: bool = False,
        right_exactly_once: bool = False,
    )
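To make the outer-join semantics concrete, here is a minimal pure-Python sketch that models them on lists of dicts. It is not Pathway's implementation and does not call Pathway at all. The helper `outer_join` and the sample rows are hypothetical and exist only for illustration. Unmatched rows on either side are padded with `None`, and matched rows are combined as in an inner join.

```python
def outer_join(left, right, left_key, right_key):
    """Toy model of an outer equi-join over lists of dicts.

    Hypothetical helper for illustration only; it mimics the documented
    behavior (None-padding of unmatched rows), not Pathway's engine.
    """
    left_cols = list(left[0].keys()) if left else []
    right_cols = list(right[0].keys()) if right else []
    result = []
    matched_right = set()

    for l_row in left:
        hits = [j for j, r_row in enumerate(right)
                if r_row[right_key] == l_row[left_key]]
        if hits:
            # Matched rows behave like an inner join.
            for j in hits:
                matched_right.add(j)
                result.append({**l_row, **right[j]})
        else:
            # Unmatched left row: right-side columns become None.
            result.append({**l_row, **{c: None for c in right_cols}})

    for j, r_row in enumerate(right):
        if j not in matched_right:
            # Unmatched right row: left-side columns become None.
            result.append({**{c: None for c in left_cols}, **r_row})

    return result


# Hypothetical sample data.
left = [{"a": 11, "b": 111}, {"a": 12, "b": 112}, {"a": 13, "b": 113}]
right = [{"c": 11, "d": 211}, {"c": 13, "d": 213}, {"c": 14, "d": 214}]

rows = outer_join(left, right, "a", "c")
for row in rows:
    print(row)
```

In Pathway itself the join condition is written as an expression such as `self.col1 == other.col2`, and columns are extracted from the returned `JoinResult` with `.select()`.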

Source from the content-addressed store, hash-verified

    @desugar(substitution={thisclass.left: "self", thisclass.right: "other"})
    @arg_handler(handler=join_kwargs_handler(allow_how=False, allow_id=True))
    def join_outer(
        self,
        other: Joinable,
        *on: expr.ColumnExpression,
        id: expr.ColumnReference | None = None,
        left_instance: expr.ColumnReference | None = None,
        right_instance: expr.ColumnReference | None = None,
        left_exactly_once: bool = False,
        right_exactly_once: bool = False,
    ) -> JoinResult:
        """Outer-joins two tables or join results.

        Args:
            other: the right side of the join, ``Table`` or ``JoinResult``.
            *on: Columns to join, syntax `self.col1 == other.col2`
            id: optional id column of the result
            instance: optional argument describing partitioning of the data into separate instances
            left_exactly_once: if you can guarantee that each row on the left side of the join will be
                joined at most once, then you can set this parameter to ``True``. Then each row after
                getting a match is removed from the join state. As a result, less memory is needed.
                Works only for append-only tables.
            right_exactly_once: if you can guarantee that each row on the right side of the join will be
                joined at most once, then you can set this parameter to ``True``. Then each row after
                getting a match is removed from the join state. As a result, less memory is needed.
                Works only for append-only tables.

        Remarks: args cannot contain id column from either of tables, \
        as the result table has id column with auto-generated ids; \
        it can be selected by assigning it to a column with defined \
        name (passed in kwargs)

        Behavior:
        - for rows from the left side that were not matched with the right side,
          missing values on the right are replaced with `None`
        - for rows from the right side that were not matched with the left side,
          missing values on the left are replaced with `None`
        - for rows that were matched the behavior is the same as that of an inner join.

        Returns:
            JoinResult: an object on which `.select()` may be called to extract relevant
            columns from the result of the join.

        Example:

        >>> import pathway as pw
        >>> t1 = pw.debug.table_from_markdown(
        ...     '''
        ...        | a  | b
        ...      1 | 11 | 111
        ...      2 | 12 | 112
        ...      3 | 13 | 113
        ...      4 | 13 | 114
        ...     '''
        ... )
        >>> t2 = pw.debug.table_from_markdown(
        ...     '''
        ...        | c  | d
        ...      1 | 11 | 211
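The docstring says that with `left_exactly_once` or `right_exactly_once` set, a row is removed from the join state once it gets a match, which reduces memory use on append-only tables. The sketch below is a toy model of that idea only, written under the assumption of a simple keyed, append-only stream. The class `ToyJoinState` and the helper `run` are hypothetical. The excerpt does not show how Pathway's engine manages its state, and the sketch does not model the `None` padding of an outer join.

```python
from collections import defaultdict


class ToyJoinState:
    """Hypothetical model of the rows an append-only streaming join retains."""

    def __init__(self, left_exactly_once=False, right_exactly_once=False):
        # Rows still waiting for (further) matches, per side and join key.
        self._rows = {"left": defaultdict(list), "right": defaultdict(list)}
        self._once = {"left": left_exactly_once, "right": right_exactly_once}

    def add(self, side, key, row):
        """Insert a row on `side`; return the (left, right) pairs it produces."""
        other = "right" if side == "left" else "left"
        partners = list(self._rows[other].get(key, []))
        if side == "left":
            pairs = [(row, p) for p in partners]
        else:
            pairs = [(p, row) for p in partners]

        # Partners that promised to match at most once can now be forgotten.
        if partners and self._once[other]:
            del self._rows[other][key]

        # The new row is kept unless it matched and promised to match once.
        if not (partners and self._once[side]):
            self._rows[side][key].append(row)
        return pairs

    def size(self):
        """Number of rows currently held in the join state."""
        return sum(len(rows)
                   for side in self._rows.values()
                   for rows in side.values())


def run(**flags):
    """Feed the same four rows through the model and report pairs and state size."""
    state = ToyJoinState(**flags)
    emitted = []
    emitted += state.add("left", 1, "l1")
    emitted += state.add("right", 1, "r1")
    emitted += state.add("right", 2, "r2")
    emitted += state.add("left", 2, "l2")
    return emitted, state.size()


pairs_default, size_default = run()
pairs_once, size_once = run(left_exactly_once=True, right_exactly_once=True)
print(size_default, size_once)
```

Both runs emit the same matched pairs. The difference is how many rows remain in the state afterwards, which is the memory saving the docstring describes. The model also shows why the guarantee matters: a row dropped after its first match can never produce a second one.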

Callers (15)

join_outer · Function · 0.80
_wrap · Function · 0.80
test_outer_join_01 · Function · 0.80
test_outer_join_02 · Function · 0.80
test_outer_join_03 · Function · 0.80
test_outer_join_04 · Function · 0.80

Calls (1)

_table_join · Method · 0.80

Tested by (15)

test_outer_join_01 · Function · 0.64
test_outer_join_02 · Function · 0.64
test_outer_join_03 · Function · 0.64
test_outer_join_04 · Function · 0.64
test_outer_join_id · Function · 0.64