hub / github.com/scikit-learn/scikit-learn / partial_fit

Method partial_fit

sklearn/preprocessing/_data.py:476–545

Online computation of min and max on X for later scaling. All of X is processed as a single batch. This is intended for cases when `fit` is not feasible because `n_samples` is very large or because X is read from a continuous stream.
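As a minimal sketch of the streaming use case described above, the example below feeds a synthetic array to `MinMaxScaler.partial_fit` in chunks and compares the result with a single `fit` over the full array. The data, the chunk size of 100 and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Synthetic data standing in for a dataset too large to load at once.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))

# Feed the data in chunks of 100 rows; each call updates the running statistics.
streaming = MinMaxScaler()
for start in range(0, X.shape[0], 100):
    streaming.partial_fit(X[start:start + 100])

# Reference: a single fit over the full array.
full = MinMaxScaler().fit(X)

same_min = np.allclose(streaming.data_min_, full.data_min_)
same_max = np.allclose(streaming.data_max_, full.data_max_)
same_output = np.allclose(streaming.transform(X), full.transform(X))
```

Because the per-feature minimum and maximum can be merged exactly across batches, the chunked scaler should end up with the same statistics as the one fitted in a single pass.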

(self, X, y=None)
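The source listing for this method shows two checks that run before the data is validated: a `ValueError` when the lower bound of `feature_range` is not smaller than the upper bound, and a `TypeError` for sparse input. It also passes `ensure_all_finite="allow-nan"`, so NaN entries are accepted. The sketch below exercises those paths; the helper `raised` and the sample arrays are hypothetical names introduced here.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_nan = np.array([[1.0, np.nan], [3.0, 4.0]])


def raised(scaler, data):
    """Return the exception raised by partial_fit, or None if it succeeds."""
    try:
        scaler.partial_fit(data)
    except (ValueError, TypeError) as exc:
        return exc
    return None


# Lower bound >= upper bound is rejected.
bad_range = raised(MinMaxScaler(feature_range=(1, 0)), X)

# Sparse matrices are rejected; the error message suggests MaxAbsScaler.
sparse_input = raised(MinMaxScaler(), sparse.csr_matrix(X))

# Dense input, with or without NaN entries, is accepted.
dense_ok = raised(MinMaxScaler(), X)
nan_ok = raised(MinMaxScaler(), X_nan)
```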

Source from the content-addressed store, hash-verified

474
475     @_fit_context(prefer_skip_nested_validation=True)
476     def partial_fit(self, X, y=None):
477         """Online computation of min and max on X for later scaling.
478
479         All of X is processed as a single batch. This is intended for cases
480         when :meth:`fit` is not feasible due to very large number of
481         `n_samples` or because X is read from a continuous stream.
482
483         Parameters
484         ----------
485         X : array-like of shape (n_samples, n_features)
486             The data used to compute the mean and standard deviation
487             used for later scaling along the features axis.
488
489         y : None
490             Ignored.
491
492         Returns
493         -------
494         self : object
495             Fitted scaler.
496         """
497         feature_range = self.feature_range
498         if feature_range[0] >= feature_range[1]:
499             raise ValueError(
500                 "Minimum of desired feature range must be smaller than maximum. Got %s."
501                 % str(feature_range)
502             )
503
504         if sparse.issparse(X):
505             raise TypeError(
506                 "MinMaxScaler does not support sparse input. "
507                 "Consider using MaxAbsScaler instead."
508             )
509
510         xp, _ = get_namespace(X)
511
512         first_pass = not hasattr(self, "n_samples_seen_")
513         X = validate_data(
514             self,
515             X,
516             reset=first_pass,
517             dtype=_array_api.supported_float_dtypes(xp, device=device(X)),
518             ensure_all_finite="allow-nan",
519         )
520
521         device_ = device(X)
522         feature_range = (
523             xp.asarray(feature_range[0], dtype=X.dtype, device=device_),
524             xp.asarray(feature_range[1], dtype=X.dtype, device=device_),
525         )
526
527         data_min = _array_api._nanmin(X, axis=0, xp=xp)
528         data_max = _array_api._nanmax(X, axis=0, xp=xp)
529
530         if first_pass:
531             self.n_samples_seen_ = X.shape[0]
532         else:
533             data_min = xp.minimum(self.data_min_, data_min)
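The excerpt ends at the point where the batch minimum is merged with the stored `data_min_`. The pure-Python sketch below illustrates that running-statistics pattern on lists of rows. It is not the library code: the class `RunningMinMax` and the helper `nan_min_max` are hypothetical, and the handling of the maximum and of the sample counter on later passes is an assumption made by symmetry with the visible minimum update. The sketch also assumes every column holds at least one non-NaN value in each batch.

```python
import math


def nan_min_max(batch):
    """Per-column (mins, maxs) of a list of rows, skipping NaN entries."""
    mins, maxs = [], []
    for j in range(len(batch[0])):
        col = [row[j] for row in batch if not math.isnan(row[j])]
        mins.append(min(col))
        maxs.append(max(col))
    return mins, maxs


class RunningMinMax:
    """Hypothetical stand-in that tracks per-feature min and max across batches."""

    def partial_fit(self, batch):
        # Same first-pass test as in the excerpt: no counter means nothing seen yet.
        first_pass = not hasattr(self, "n_samples_seen_")
        mins, maxs = nan_min_max(batch)
        if first_pass:
            self.n_samples_seen_ = len(batch)
        else:
            # Merge the batch statistics with the stored ones.
            mins = [min(old, new) for old, new in zip(self.data_min_, mins)]
            # Assumed by symmetry; this part is not visible in the excerpt.
            maxs = [max(old, new) for old, new in zip(self.data_max_, maxs)]
            self.n_samples_seen_ += len(batch)
        self.data_min_, self.data_max_ = mins, maxs
        return self


tracker = RunningMinMax()
tracker.partial_fit([[1.0, 5.0], [3.0, float("nan")]])
tracker.partial_fit([[0.5, 7.0], [2.0, 6.0]])
```

Only the merged minimum and maximum need to be stored between calls, which is why memory use does not grow with the number of batches.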

Calls 4

get_namespace · Function · 0.90
validate_data · Function · 0.90
device · Function · 0.90
_handle_zeros_in_scale · Function · 0.85
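The call to `_handle_zeros_in_scale` lies in the part of the method not shown in the excerpt. As a hedged illustration of why such a step is needed, the sketch below derives scaling parameters from a stored minimum and maximum using the formulas documented for `MinMaxScaler` (`scale_ = (max - min) / (data_max_ - data_min_)` and `min_ = min - data_min_ * scale_`). A constant feature has a zero data range, so the sketch replaces that range with 1 before dividing. The function names and the exact-zero test are assumptions for illustration, not the library's implementation.

```python
def scaling_params(data_min, data_max, feature_range=(0.0, 1.0)):
    """Return per-feature (scale, offset) lists for min-max scaling."""
    lo, hi = feature_range
    scale, offset = [], []
    for dmin, dmax in zip(data_min, data_max):
        data_range = dmax - dmin
        # Constant feature: use 1 instead of 0 to avoid division by zero.
        # (Simplified exact-zero check; an assumption of this sketch.)
        if data_range == 0.0:
            data_range = 1.0
        s = (hi - lo) / data_range
        scale.append(s)
        offset.append(lo - dmin * s)
    return scale, offset


def transform_row(row, scale, offset):
    """Apply x * scale + offset to each feature of one row."""
    return [x * s + o for x, s, o in zip(row, scale, offset)]


# First feature spans [0, 10]; second feature is constant at 5.
scale, offset = scaling_params([0.0, 5.0], [10.0, 5.0])
scaled = transform_row([5.0, 5.0], scale, offset)
```

With this guard, the constant feature maps to the lower bound of the feature range rather than producing a division error or NaN.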