
Method partial_fit

sklearn/preprocessing/_data.py:931–1090

Online computation of mean and std on X for later scaling. All of X is processed as a single batch. This is intended for cases when :meth:`fit` is not feasible due to very large number of `n_samples` or because X is read from a continuous stream.

(self, X, y=None, sample_weight=None)
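A minimal usage sketch of the streaming pattern described above, assuming scikit-learn and NumPy are installed. The synthetic data, chunk count, and variable names are illustrative and not taken from the source. It feeds the same array to `partial_fit` in chunks and compares the resulting statistics with a single `fit` call.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a dataset too large to fit at once (assumption).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))

# Each call to partial_fit treats its argument as one batch and
# updates the running statistics stored on the estimator.
streaming = StandardScaler()
for chunk in np.array_split(X, 10):
    streaming.partial_fit(chunk)

# Reference: the same statistics computed in one pass.
full = StandardScaler().fit(X)

means_match = np.allclose(streaming.mean_, full.mean_)
vars_match = np.allclose(streaming.var_, full.var_)
n_seen = int(streaming.n_samples_seen_)
print(means_match, vars_match, n_seen)
```

After the loop, `streaming.transform` can be applied to new chunks in the same way as a scaler fitted with `fit`.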

Source from the content-addressed store, hash-verified

    @_fit_context(prefer_skip_nested_validation=True)
    def partial_fit(self, X, y=None, sample_weight=None):
        """Online computation of mean and std on X for later scaling.

        All of X is processed as a single batch. This is intended for cases
        when :meth:`fit` is not feasible due to very large number of
        `n_samples` or because X is read from a continuous stream.

        The algorithm for incremental mean and std is given in Equation 1.5a,b
        in Chan, Tony F., Gene H. Golub, and Randall J. LeVeque. "Algorithms
        for computing the sample variance: Analysis and recommendations."
        The American Statistician 37.3 (1983): 242-247:

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape (n_samples, n_features)
            The data used to compute the mean and standard deviation
            used for later scaling along the features axis.

        y : None
            Ignored.

        sample_weight : array-like of shape (n_samples,), default=None
            Individual weights for each sample.

            .. versionadded:: 0.24
               parameter *sample_weight* support to StandardScaler.

        Returns
        -------
        self : object
            Fitted scaler.
        """
        xp, _, X_device = get_namespace_and_device(X)
        first_call = not hasattr(self, "n_samples_seen_")
        X = validate_data(
            self,
            X,
            accept_sparse=("csr", "csc"),
            dtype=supported_float_dtypes(xp, X_device),
            ensure_all_finite="allow-nan",
            reset=first_call,
        )
        n_features = X.shape[1]

        callback_ctx = self._init_callback_context()
        callback_ctx.call_on_fit_task_begin(
            estimator=self, X=X, y=y, metadata={"sample_weight": sample_weight}
        )

        if sample_weight is not None:
            sample_weight = _check_sample_weight(sample_weight, X, dtype=X.dtype)

        # Even in the case of `with_mean=False`, we update the mean anyway
        # This is needed for the incremental computation of the var
        # See incr_mean_variance_axis and _incremental_mean_variance_axis

        # if n_samples_seen_ is an integer (i.e. no missing values), we need to
        # transform it to an array of shape (n_features,) required by
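The docstring above cites Chan, Golub, and LeVeque for the incremental mean and std update. As a rough illustration of that idea, the sketch below merges per-feature running statistics with each new batch using a mean-based form of the pairwise update. The function name `merge_batch` and its state layout are hypothetical, and this is not the library's implementation (which the listing delegates to helpers such as `incr_mean_variance_axis`). It ignores missing values and sample weights.

```python
import numpy as np

def merge_batch(n_a, mean_a, m2_a, batch):
    """Merge running stats (count, mean, sum of squared deviations) with a batch.

    Hypothetical helper for illustration only; statistics are per feature (axis 0).
    """
    batch = np.asarray(batch, dtype=float)
    n_b = batch.shape[0]
    mean_b = batch.mean(axis=0)
    m2_b = ((batch - mean_b) ** 2).sum(axis=0)

    n = n_a + n_b
    delta = mean_b - mean_a
    # Shift the old mean toward the batch mean in proportion to the batch size.
    mean = mean_a + delta * n_b / n
    # Combine the two sums of squared deviations plus a between-group term.
    m2 = m2_a + m2_b + delta**2 * n_a * n_b / n
    return n, mean, m2

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))

# Start from an empty state and fold in one chunk at a time.
n, mean, m2 = 0, np.zeros(4), np.zeros(4)
for chunk in np.array_split(X, 7):
    n, mean, m2 = merge_batch(n, mean, m2, chunk)

var = m2 / n  # population variance (ddof=0)
print(np.allclose(mean, X.mean(axis=0)), np.allclose(var, X.var(axis=0)))
```

Only the count, the mean, and the sum of squared deviations are carried between batches, so memory use does not grow with the number of samples seen.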

Calls 15

get_namespace_and_device · Function · 0.90
validate_data · Function · 0.90
supported_float_dtypes · Function · 0.90
_check_sample_weight · Function · 0.90
size · Function · 0.90
mean_variance_axis · Function · 0.90
incr_mean_variance_axis · Function · 0.90
_is_constant_feature · Function · 0.85
_handle_zeros_in_scale · Function · 0.85