Standardize a dataset along any axis. Center to the mean and scale component-wise to unit variance. Read more in the :ref:`User Guide <preprocessing_scaler>`.
scale(X, *, axis=0, with_mean=True, with_std=True, copy=True)
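The arithmetic behind this summary can be illustrated with a small NumPy sketch. This is not scikit-learn's implementation: `standardize_sketch` is a hypothetical helper written for this example, it handles dense input only, and its treatment of constant slices (leaving them unscaled) is an assumption of the sketch. It follows the documented behaviour that NaNs are ignored when computing statistics and that the standard deviation uses the biased estimator (`ddof=0`).

```python
import numpy as np


def standardize_sketch(X, axis=0, with_mean=True, with_std=True):
    """Hypothetical helper: dense-only illustration of the documented behaviour."""
    # Work in floating point so that integer input can be centered and scaled.
    X = np.asarray(X, dtype=float)
    if with_mean:
        # NaNs are skipped when computing the mean and remain NaN in the output.
        X = X - np.nanmean(X, axis=axis, keepdims=True)
    if with_std:
        # ddof=0 gives the biased (population) standard deviation.
        std = np.nanstd(X, axis=axis, ddof=0, keepdims=True)
        # Assumption of this sketch: constant slices are left unscaled
        # rather than divided by zero.
        std = np.where(std == 0.0, 1.0, std)
        X = X / std
    return X


X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
by_feature = standardize_sketch(X, axis=0)  # standardize each column
by_sample = standardize_sketch(X, axis=1)   # standardize each row
```

With `axis=0` every column ends up with zero mean and unit variance; with `axis=1` the same holds for every row.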
| 144 | prefer_skip_nested_validation=True, |
| 145 | ) |
| 146 | def scale(X, *, axis=0, with_mean=True, with_std=True, copy=True): |
| 147 | """Standardize a dataset along any axis. |
| 148 | |
| 149 | Center to the mean and scale component-wise to unit variance. |
| 150 | |
| 151 | Read more in the :ref:`User Guide <preprocessing_scaler>`. |
| 152 | |
| 153 | Parameters |
| 154 | ---------- |
| 155 | X : {array-like, sparse matrix} of shape (n_samples, n_features) |
| 156 | The data to center and scale. |
| 157 | |
| 158 | axis : {0, 1}, default=0 |
| 159 | Axis along which the means and standard deviations are computed. If 0, |
| 160 | standardize each feature independently; otherwise (if 1), standardize |
| 161 | each sample. |
| 162 | |
| 163 | with_mean : bool, default=True |
| 164 | If True, center the data before scaling. |
| 165 | |
| 166 | with_std : bool, default=True |
| 167 | If True, scale the data to unit variance (or equivalently, |
| 168 | unit standard deviation). |
| 169 | |
| 170 | copy : bool, default=True |
| 171 | If False, try to avoid a copy and scale in place. |
| 172 | This is not guaranteed to always work in place; e.g. if the data is |
| 173 | a numpy array with an int dtype, a copy will be returned even with |
| 174 | copy=False. |
| 175 | |
| 176 | Returns |
| 177 | ------- |
| 178 | X_tr : {ndarray, sparse matrix} of shape (n_samples, n_features) |
| 179 | The transformed data. |
| 180 | |
| 181 | See Also |
| 182 | -------- |
| 183 | StandardScaler : Performs scaling to unit variance using the Transformer |
| 184 | API (e.g. as part of a preprocessing |
| 185 | :class:`~sklearn.pipeline.Pipeline`). |
| 186 | |
| 187 | Notes |
| 188 | ----- |
| 189 | This implementation will refuse to center scipy.sparse matrices |
| 190 | since it would make them non-sparse and would potentially crash the |
| 191 | program with memory exhaustion problems. |
| 192 | |
| 193 | Instead, the caller is expected either to explicitly set |
| 194 | `with_mean=False` (in that case, only variance scaling will be |
| 195 | performed on the features of the CSC matrix) or to call `X.toarray()` |
| 196 | if they expect the materialized dense array to fit in memory. |
| 197 | |
| 198 | To avoid a memory copy, the caller should pass a CSC matrix. |
| 199 | |
| 200 | NaNs are treated as missing values: they are ignored when computing the |
| 201 | statistics and preserved during the data transformation. |
| 202 | |
| 203 | We use a biased estimator for the standard deviation, equivalent to |