Presents a `seaborn` heatmap visualization of nullity correlation in the given DataFrame. Note that this visualization has no special support for large datasets. For those, try the dendrogram instead.
heatmap(df, inline=True, filter=None, n=0, p=0, sort=None, figsize=(20, 12), fontsize=16, labels=True, cmap='RdBu')
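The "nullity correlation" named in the summary above can be illustrated on a toy frame. This is a minimal sketch rather than part of the library: the DataFrame and its column names are made up for illustration, and the only call taken from the source is `df.isnull().corr()`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "a" and "b" are always missing together,
# while "c" is missing exactly where "a" is present.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": ["x", None, "y", None],
    "c": [np.nan, 2.0, np.nan, 4.0],
})

# Nullity correlation: the correlation between the boolean "is missing"
# indicators of each pair of columns, not between the values themselves.
nullity_corr = df.isnull().corr()
print(nullity_corr)
```

In this sketch a value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present where the other is missing.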
    # Module-level imports this excerpt relies on (assumed; the listing starts mid-file).
    # `nullity_filter` and `nullity_sort` are assumed to be defined elsewhere in this module.
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib import gridspec
    import seaborn as sns


    def heatmap(df, inline=True,
                filter=None, n=0, p=0, sort=None,
                figsize=(20, 12), fontsize=16, labels=True, cmap='RdBu'
                ):
        """
        Presents a `seaborn` heatmap visualization of nullity correlation in the given DataFrame.

        Note that this visualization has no special support for large datasets. For those, try the dendrogram instead.


        :param df: The DataFrame whose completeness is being heatmapped.
        :param filter: The filter to apply to the heatmap. Should be one of "top", "bottom", or None (default). See
        `nullity_filter()` for more information.
        :param n: The cap on the number of columns to include in the filtered DataFrame. See `nullity_filter()` for
        more information.
        :param p: The cap on the percentage fill of the columns in the filtered DataFrame. See `nullity_filter()` for
        more information.
        :param sort: The sort to apply to the heatmap. Should be one of "ascending", "descending", or None. See
        `nullity_sort()` for more information.
        :param figsize: The size of the figure to display. This is a `matplotlib` parameter which defaults to (20, 12).
        :param fontsize: The figure's font size.
        :param labels: Whether or not to label each matrix entry with its correlation (default is True).
        :param cmap: What `matplotlib` colormap to use. Defaults to `RdBu`.
        :param inline: Whether or not the figure is inline. If it is not, then instead of being plotted, the figure
        is returned.
        :return: If `inline` is False, the underlying `matplotlib.figure.Figure` object. Else, nothing.
        """
        # Apply filters and sorts.
        df = nullity_filter(df, filter=filter, n=n, p=p)
        df = nullity_sort(df, sort=sort)

        # Set up the figure.
        fig = plt.figure(figsize=figsize)
        gs = gridspec.GridSpec(1, 1)
        ax0 = plt.subplot(gs[0])

        # Pre-processing: remove completely filled or completely empty variables.
        # Select by position with `iloc`; plain `df[[...]]` would treat the integers as column labels.
        df = df.iloc[:, [i for i, v in enumerate(np.var(df.isnull(), axis='rows')) if v > 0]]

        # Create and mask the correlation matrix.
        corr_mat = df.isnull().corr()
        # corr_mat = corr_mat.replace(np.nan, 1)
        # corr_mat[np.isnan(corr_mat)] = 0
        # Boolean mask that hides the upper triangle (diagonal included).
        mask = np.zeros_like(corr_mat, dtype=bool)
        mask[np.triu_indices_from(mask)] = True

        # Set fontsize.
        # fontsize = _set_font_size(fig, df, fontsize)

        # Construct the base heatmap.
        if labels:
            sns.heatmap(corr_mat, mask=mask, cmap=cmap, ax=ax0, cbar=False,
                        annot=True, annot_kws={"size": fontsize - 2})
        else:
            sns.heatmap(corr_mat, mask=mask, cmap=cmap, ax=ax0, cbar=False)

        # Apply visual corrections and modifications.
        ax0.set_xticklabels(ax0.xaxis.get_majorticklabels(), rotation=45, ha='left', fontsize=fontsize)
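The pre-processing and masking steps in the listing above can be tried in isolation, without any plotting. The following is a rough sketch under stated assumptions: the toy DataFrame and the names `varying` and `visible` are hypothetical, and constant-nullity columns are detected with `nunique()` rather than the variance test used in the listing (for boolean indicators the two checks select the same columns).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only.
df = pd.DataFrame({
    "full": [1, 2, 3, 4],              # never missing
    "empty": [np.nan] * 4,             # always missing
    "x": [1.0, np.nan, 3.0, 4.0],
    "y": [np.nan, np.nan, 3.0, 4.0],
})

# Keep only columns whose nullity actually varies. A column that is
# entirely filled or entirely empty has a constant indicator, so its
# correlation with any other column is undefined.
nullity = df.isnull()
varying = nullity.loc[:, nullity.nunique() > 1]

corr_mat = varying.corr()

# Mask the upper triangle, diagonal included, so each pair of columns
# appears only once in the rendered heatmap.
mask = np.zeros_like(corr_mat, dtype=bool)
mask[np.triu_indices_from(mask)] = True

# Cells where the mask is True become NaN, mimicking what is hidden.
visible = corr_mat.mask(mask)
print(visible)
```

Only the strictly lower triangle survives the mask, which is why a heatmap over two varying columns would show a single cell.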