Presents a `matplotlib` matrix visualization of the nullity of the given DataFrame. Note that for the default `figsize` 250 is a soft display limit: specifying a number of records greater than approximately this value will cause certain records to show up in the sparkline but not in the matrix, which can be confusing.
(df,
filter=None, n=0, p=0, sort=None,
figsize=(25, 10), width_ratios=(15, 1), color=(0.25, 0.25, 0.25),
fontsize=16, labels=None, sparkline=True, inline=True,
freq=None)
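The two display limits mentioned in the docstring (roughly 250 rows at the default `figsize`, and 50 columns before labels are dropped) can be restated as a small check. This is only a sketch: `describe_display`, `SOFT_ROW_LIMIT`, and `MAX_LABELED_COLUMNS` are hypothetical names introduced here for illustration and are not part of the library, and the 250 figure is approximate.

```python
# Hypothetical helper, not part of the library. It only restates the limits
# described in the docstring of `matrix`.
SOFT_ROW_LIMIT = 250        # approximate, and only for the default figsize
MAX_LABELED_COLUMNS = 50    # above this, labels are dropped unless requested


def describe_display(n_rows, n_cols, labels=None):
    """Return (rows_may_be_hidden, show_labels) for a frame of this shape."""
    # Past the soft limit, some records may appear in the sparkline
    # but not in the matrix itself.
    rows_may_be_hidden = n_rows > SOFT_ROW_LIMIT
    if labels is None:
        # Unspecified: show labels only while they remain readable.
        show_labels = n_cols <= MAX_LABELED_COLUMNS
    else:
        # An explicit value always wins.
        show_labels = bool(labels)
    return rows_may_be_hidden, show_labels


print(describe_display(1000, 12))
print(describe_display(100, 80))
print(describe_display(100, 80, labels=True))
```

A check like this could be run before plotting to decide whether to sample rows or pass `labels=True` with a smaller `fontsize`.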
| 102 | |
| 103 | |
| 104 | def matrix(df, |
| 105 | filter=None, n=0, p=0, sort=None, |
| 106 | figsize=(25, 10), width_ratios=(15, 1), color=(0.25, 0.25, 0.25), |
| 107 | fontsize=16, labels=None, sparkline=True, inline=True, |
| 108 | freq=None): |
| 109 | """ |
| 110 | Presents a `matplotlib` matrix visualization of the nullity of the given DataFrame. |
| 111 | |
| 112 | Note that for the default `figsize` 250 is a soft display limit: specifying a number of records greater than |
| 113 | approximately this value will cause certain records to show up in the sparkline but not in the matrix, which can |
| 114 | be confusing. |
| 115 | |
| 116 | |
| 117 | The default vertical display will fit up to 50 columns. If more than 50 columns are specified and the `labels` |
| 118 | parameter is left unspecified, the visualization automatically drops the labels because they would not be |
| 119 | readable. You can override this behavior using `labels=True` and your own `fontsize` parameter. |
| 120 | |
| 121 | :param df: The DataFrame whose nullity is being visualized as a matrix. |
| 122 | :param filter: The filter to apply to the matrix. Should be one of "top", "bottom", or None (default). See |
| 123 | `nullity_filter()` for more information. |
| 124 | :param n: The cap on the number of columns to include in the filtered DataFrame. See `nullity_filter()` for |
| 125 | more information. |
| 126 | :param p: The cap on the percentage fill of the columns in the filtered DataFrame. See `nullity_filter()` for |
| 127 | more information. |
| 128 | :param sort: The sort to apply to the matrix. Should be one of "ascending", "descending", or None (default). See |
| 129 | `nullity_sort()` for more information. |
| 130 | :param figsize: The size of the figure to display. This is a `matplotlib` parameter. |
| 131 | For the vertical configuration this defaults to (25, 10); the horizontal configuration computes a sliding value |
| 132 | by default based on the number of columns that need to be displayed. |
| 133 | :param fontsize: The figure's font size. Defaults to 16. |
| 134 | :param labels: Whether or not to display the column names. Defaults to None, which shows the labels unless |
| 135 | more than 50 columns are displayed. Pass True or False to force the labels on or off. |
| 136 | :param sparkline: Whether or not to display the sparkline. Defaults to True. |
| 137 | :param width_ratios: The ratio of the width of the matrix to the width of the sparkline. |
| 138 | Defaults to `(15, 1)`. Does nothing if `sparkline=False`. |
| 139 | :param color: The color of the filled cells. Default is a medium dark gray: the RGB tuple `(0.25, 0.25, 0.25)`. |
| 140 | :return: If `inline` is False, the underlying `matplotlib.figure` object. Else, nothing. |
| 141 | """ |
| 142 | |
| 143 | # Apply filters and sorts. |
| 144 | df = nullity_filter(df, filter=filter, n=n, p=p) |
| 145 | df = nullity_sort(df, sort=sort) |
| 146 | |
| 147 | height = df.shape[0] |
| 148 | width = df.shape[1] |
| 149 | |
| 150 | # z is the color-mask array. |
| 151 | z = df.notnull().values |
| 152 | |
| 153 | # g is a (height x width x 3) array holding one RGB triple per cell. |
| 154 | g = np.zeros((height, width, 3)) |
| 155 | |
| 156 | # Apply the z color-mask to set the RGB of each pixel. |
| 157 | g[~z] = [1, 1, 1] |
| 158 | g[z] = color |
| 159 | |
| 160 | # Set up the matplotlib grid layout. |
| 161 | # If the sparkline is removed the layout is a unary subplot. |
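The color-mask step in the listing (build `z` from `notnull()`, then paint missing cells white and present cells with `color`) can be spelled out cell by cell. The sketch below is a dependency-free illustration and not the library's code: `nullity_pixels` is a hypothetical name, and it assumes missing values are represented as `None`, whereas the listing relies on pandas' own null detection.

```python
def nullity_pixels(rows, color=(0.25, 0.25, 0.25)):
    """Map every cell to an RGB triple: `color` if present, white if missing.

    Hypothetical helper for illustration; `None` stands in for a null value.
    """
    white = (1.0, 1.0, 1.0)
    fill = tuple(color)
    # One RGB triple per cell, giving a height x width x 3 nested structure.
    return [[white if cell is None else fill for cell in row] for row in rows]


rows = [
    [1, None, "a"],
    [None, 2.5, "b"],
]
pixels = nullity_pixels(rows)

for row in pixels:
    print(row)
```

The result has the same shape as `g` in the listing: one row per record, one entry per column, and three color channels per entry.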
nothing calls this directly
no test coverage detected