def _basic_aggregators(column: str) -> List[AggregateFnV2]:
    """Generate default metrics for all columns.

    This function returns a list of aggregators that compute the following metrics:
    - count
    - missing_value_percentage
    - approximate_top_k (top 10 most frequent values)

    Args:
        column: The name of the column to compute metrics for.

    Returns:
        A list of AggregateFnV2 instances that can be used with Dataset.aggregate()
    """
    return [
        Count(on=column, ignore_nulls=False),
        MissingValuePercentage(on=column),
        ApproximateTopK(on=column, k=10),
    ]


def _default_dtype_aggregators() -> Dict[
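To make the three metrics listed in the `_basic_aggregators` docstring concrete, here is a minimal pure-Python sketch of what they describe for a single in-memory column. It is not the library's implementation: `basic_metrics_sketch` is a hypothetical helper, exact counting with `collections.Counter` stands in for the approximate top-k algorithm, and both the 0 to 100 percentage scale and the exclusion of nulls from the top-k list are assumptions made for illustration.

```python
from collections import Counter
from typing import Any, Dict, List, Optional


def basic_metrics_sketch(values: List[Optional[Any]], k: int = 10) -> Dict[str, Any]:
    """Compute count, missing-value percentage, and exact top-k for one column.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Count every row, nulls included (mirrors ignore_nulls=False in the source).
    total = len(values)

    # Share of rows that are None, expressed on an assumed 0-100 scale.
    missing = sum(1 for v in values if v is None)
    missing_pct = (missing / total * 100.0) if total else 0.0

    # Exact frequency counting replaces the approximate algorithm here.
    # Skipping nulls in the top-k list is an assumption of this sketch.
    top_k = Counter(v for v in values if v is not None).most_common(k)

    return {
        "count": total,
        "missing_value_percentage": missing_pct,
        "top_k": top_k,
    }


result = basic_metrics_sketch(["a", "b", "a", None, "a", "b", "c", None], k=2)
print(result)
```

Exact counting keeps one counter per distinct value, which is fine for a small list but grows with column cardinality. That memory cost is the usual reason to prefer an approximate top-k over a large distributed dataset.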