Online computation of mean and std on X for later scaling. All of X is processed as a single batch. This is intended for cases when :meth:`fit` is not feasible due to a very large number of `n_samples` or because X is read from a continuous stream.
(self, X, y=None, sample_weight=None)
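The online update described above can be illustrated with a minimal pure-Python sketch for a single feature. It uses a mean-based form of the pairwise batch-merging update that is commonly attributed to Chan, Golub, and LeVeque. This is not scikit-learn's implementation: the helper names `combine_stats` and `running_mean_std` are hypothetical, and missing values and sample weights are ignored. The variance is the biased one (divided by the sample count).

```python
import math


def combine_stats(count_a, mean_a, m2_a, batch):
    """Merge running (count, mean, M2) with one new batch of floats.

    M2 is the sum of squared deviations from the mean.
    Hypothetical helper for illustration only.
    """
    count_b = len(batch)
    if count_b == 0:
        return count_a, mean_a, m2_a

    # Statistics of the new batch on its own.
    mean_b = sum(batch) / count_b
    m2_b = sum((x - mean_b) ** 2 for x in batch)

    # Pairwise merge of the two sets of statistics.
    count = count_a + count_b
    delta = mean_b - mean_a
    mean = mean_a + delta * count_b / count
    m2 = m2_a + m2_b + delta**2 * count_a * count_b / count
    return count, mean, m2


def running_mean_std(batches):
    """Return (mean, std) after consuming the batches one at a time."""
    count, mean, m2 = 0, 0.0, 0.0
    for batch in batches:
        count, mean, m2 = combine_stats(count, mean, m2, batch)
    # Biased (population) standard deviation: divide by count.
    std = math.sqrt(m2 / count) if count else float("nan")
    return mean, std
```

Feeding `[[1.0, 2.0], [3.0, 4.0, 5.0]]` to `running_mean_std` should give the same mean and standard deviation as a single pass over all five values, which is the property that makes batch-by-batch fitting possible.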

    @_fit_context(prefer_skip_nested_validation=True)
    def partial_fit(self, X, y=None, sample_weight=None):
        """Online computation of mean and std on X for later scaling.

        All of X is processed as a single batch. This is intended for cases
        when :meth:`fit` is not feasible due to very large number of
        `n_samples` or because X is read from a continuous stream.

        The algorithm for incremental mean and std is given in Equation 1.5a,b
        in Chan, Tony F., Gene H. Golub, and Randall J. LeVeque. "Algorithms
        for computing the sample variance: Analysis and recommendations."
        The American Statistician 37.3 (1983): 242-247:

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape (n_samples, n_features)
            The data used to compute the mean and standard deviation
            used for later scaling along the features axis.

        y : None
            Ignored.

        sample_weight : array-like of shape (n_samples,), default=None
            Individual weights for each sample.

            .. versionadded:: 0.24
               parameter *sample_weight* support to StandardScaler.

        Returns
        -------
        self : object
            Fitted scaler.
        """
        xp, _, X_device = get_namespace_and_device(X)
        first_call = not hasattr(self, "n_samples_seen_")
        X = validate_data(
            self,
            X,
            accept_sparse=("csr", "csc"),
            dtype=supported_float_dtypes(xp, X_device),
            ensure_all_finite="allow-nan",
            reset=first_call,
        )
        n_features = X.shape[1]

        callback_ctx = self._init_callback_context()
        callback_ctx.call_on_fit_task_begin(
            estimator=self, X=X, y=y, metadata={"sample_weight": sample_weight}
        )

        if sample_weight is not None:
            sample_weight = _check_sample_weight(sample_weight, X, dtype=X.dtype)

        # Even in the case of `with_mean=False`, we update the mean anyway
        # This is needed for the incremental computation of the var
        # See incr_mean_variance_axis and _incremental_mean_variance_axis

        # if n_samples_seen_ is an integer (i.e. no missing values), we need to
        # transform it to an array of shape (n_features,) required by