Set scales of near constant features to 1. The goal is to avoid division by very small or zero values. Near constant features are detected automatically by identifying scales close to machine precision, unless they are precomputed by the caller and passed with the `constant_mask` kwarg.
(scale, copy=True, constant_mask=None)
```python
def _handle_zeros_in_scale(scale, copy=True, constant_mask=None):
    """Set scales of near constant features to 1.

    The goal is to avoid division by very small or zero values.

    Near constant features are detected automatically by identifying
    scales close to machine precision unless they are precomputed by
    the caller and passed with the `constant_mask` kwarg.

    Typically for standard scaling, the scales are the standard
    deviation while near constant features are better detected on the
    computed variances which are closer to machine precision by
    construction.
    """
    # if we are fitting on 1D arrays, scale might be a scalar
    if np.isscalar(scale):
        if scale == 0.0:
            scale = 1.0
        return scale
    # scale is an array
    else:
        xp, _ = get_namespace(scale)
        if constant_mask is None:
            # Detect near constant values to avoid dividing by a very small
            # value that could lead to surprising results and numerical
            # stability issues.
            constant_mask = scale < 10 * xp.finfo(scale.dtype).eps

        if copy:
            # New array to avoid side-effects
            scale = xp.asarray(scale, copy=True)
        scale[constant_mask] = 1.0
        return scale


@validate_params(
```
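To see how the function above behaves in practice, here is a minimal NumPy-only sketch. The name `handle_zeros_in_scale_sketch` is hypothetical and not part of the library; it replaces the `get_namespace` array-API dispatch with plain NumPy calls, so it approximates the logic under that assumption rather than reproducing the original helper.

```python
import numpy as np


def handle_zeros_in_scale_sketch(scale, copy=True, constant_mask=None):
    """NumPy-only approximation of the helper above (hypothetical name)."""
    # Scalar case: only an exact zero is replaced.
    if np.isscalar(scale):
        return 1.0 if scale == 0.0 else scale
    # Array case: flag entries below 10 * machine epsilon unless the
    # caller already supplied a mask.
    if constant_mask is None:
        constant_mask = scale < 10 * np.finfo(scale.dtype).eps
    if copy:
        # Work on a copy so the caller's array is left untouched.
        scale = scale.copy()
    scale[constant_mask] = 1.0
    return scale


# The second column is constant, so its standard deviation is exactly 0.
X = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])
std = X.std(axis=0)

# Replace the zero scale with 1 before dividing.
safe_std = handle_zeros_in_scale_sketch(std)
X_scaled = (X - X.mean(axis=0)) / safe_std

# Alternative: detect constant features on the variances and pass the
# resulting mask explicitly, as the docstring suggests for standard scaling.
var = X.var(axis=0)
mask = var < 10 * np.finfo(var.dtype).eps
safe_std_from_var = handle_zeros_in_scale_sketch(np.sqrt(var), constant_mask=mask)
```

With `copy=True` the input `std` keeps its zero entry and only the returned array is modified. Passing `copy=False` would overwrite the caller's array in place, which saves an allocation when the original scales are no longer needed.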