hub / github.com/pathwaycom/pathway / groupby_reduce_majority

Function groupby_reduce_majority

python/pathway/stdlib/utils/col.py:309–350

Finds a majority in column_val for every group in column_group. Workaround for missing majority reducer.

(
    column_group: pw.ColumnReference, column_val: pw.ColumnReference
)

Source from the content-addressed store, hash-verified

@check_arg_types
@trace_user_frame
def groupby_reduce_majority(
    column_group: pw.ColumnReference, column_val: pw.ColumnReference
):
    """Finds a majority in column_val for every group in column_group.

    Workaround for missing majority reducer.

    Example:

    >>> import pathway as pw
    >>> table = pw.debug.table_from_markdown(
    ...     '''
    ...       | group | vote
    ...     0 | 1     | pizza
    ...     1 | 1     | pizza
    ...     2 | 1     | hotdog
    ...     3 | 2     | hotdog
    ...     4 | 2     | pasta
    ...     5 | 2     | pasta
    ...     6 | 2     | pasta
    ... ''')
    >>> result = pw.utils.col.groupby_reduce_majority(table.group, table.vote)
    >>> pw.debug.compute_and_print(result, include_id=False)
    group | majority
    1     | pizza
    2     | pasta
    """
    table = column_group.table
    column_val = table[column_val]  # in case its pw.this reference
    column_val_name = column_val.name
    column_group_name = column_group.name
    counts = table.groupby(column_group, column_val).reduce(
        column_group, column_val, _pw_special_count=pw.reducers.count()
    )
    res = counts.groupby(counts[column_group_name]).reduce(
        counts[column_group_name],
        majority=counts.ix(
            pw.reducers.argmax(counts._pw_special_count), context=pw.this
        )[column_val_name],
    )

    return res
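
The source above works in two stages: it first counts each (group, value) pair, then picks the value with the highest count in each group. Since it takes the argmax over counts, "majority" here means the most frequent value, not necessarily more than half of the votes. The following is a minimal plain-Python sketch of that two-stage idea, not Pathway code. The function name majority_per_group and the tie-breaking rule (the first-seen value wins) are assumptions made for illustration; the source does not say how Pathway resolves ties.

```python
from collections import Counter


def majority_per_group(rows):
    """Return {group: most frequent value} for an iterable of (group, value) pairs."""
    # Stage 1: count occurrences of every (group, value) pair,
    # analogous to groupby(column_group, column_val).reduce(count()).
    pair_counts = Counter(rows)

    # Stage 2: per group, keep the value with the highest count,
    # analogous to the second groupby with argmax over the counts.
    best = {}
    for (group, value), count in pair_counts.items():
        # Strict ">" keeps the first-seen value on ties (an assumption of this sketch).
        if group not in best or count > best[group][1]:
            best[group] = (value, count)
    return {group: value for group, (value, _) in best.items()}


votes = [
    (1, "pizza"), (1, "pizza"), (1, "hotdog"),
    (2, "hotdog"), (2, "pasta"), (2, "pasta"), (2, "pasta"),
]
print(majority_per_group(votes))  # {1: 'pizza', 2: 'pasta'}
```

With the same votes as the docstring example, the sketch yields pizza for group 1 and pasta for group 2, matching the documented output.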

Callers 2

Calls 4

count · Method · 0.80
reduce · Method · 0.45
groupby · Method · 0.45
ix · Method · 0.45

Tested by 1