Class StandardScaler

sklearn/preprocessing/_data.py:742–1187

Standardize features by removing the mean and scaling to unit variance. The standard score of a sample `x` is calculated as:

.. code-block:: text

    z = (x - u) / s

where `u` is the mean of the training samples or zero if `with_mean=False`, and `s` is the standard deviation of the training samples or one if `with_std=False`.
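The standard score described above can be checked by hand on a small array. The following is a minimal sketch, not part of the library source: the toy data and the variable names (`X_train`, `X_new`, `Z_manual`) are illustrative assumptions, and the manual computation assumes the population standard deviation (`ddof=0`), which is what the fitted `scale_` attribute is compared against here.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy training data (hypothetical): 4 samples, 2 features on different scales.
X_train = np.array([[1.0, 100.0],
                    [2.0, 200.0],
                    [3.0, 300.0],
                    [4.0, 400.0]])

scaler = StandardScaler()  # defaults: with_mean=True, with_std=True
Z = scaler.fit_transform(X_train)

# Recompute z = (x - u) / s per feature (column) with plain NumPy.
u = X_train.mean(axis=0)
s = X_train.std(axis=0)  # population standard deviation (ddof=0)
Z_manual = (X_train - u) / s

# Later data is scaled with the statistics stored during fit,
# not with statistics of the new data itself.
X_new = np.array([[2.5, 250.0]])
Z_new = scaler.transform(X_new)
```

Here `X_new` sits exactly at the training means, so it maps to zeros in both features; the stored statistics are available as `scaler.mean_` and `scaler.scale_`.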

Source from the content-addressed store, hash-verified

class StandardScaler(
    CallbackSupportMixin, OneToOneFeatureMixin, TransformerMixin, BaseEstimator
):
    """Standardize features by removing the mean and scaling to unit variance.

    The standard score of a sample `x` is calculated as:

    .. code-block:: text

        z = (x - u) / s

    where `u` is the mean of the training samples or zero if `with_mean=False`,
    and `s` is the standard deviation of the training samples or one if
    `with_std=False`.

    Centering and scaling happen independently on each feature by computing
    the relevant statistics on the samples in the training set. Mean and
    standard deviation are then stored to be used on later data using
    :meth:`transform`.

    Standardization of a dataset is a common requirement for many
    machine learning estimators: they might behave badly if the
    individual features do not more or less look like standard normally
    distributed data (e.g. Gaussian with 0 mean and unit variance).

    For instance many elements used in the objective function of
    a learning algorithm (such as the RBF kernel of Support Vector
    Machines or the L1 and L2 regularizers of linear models) assume that
    all features are centered around 0 and have variance in the same
    order. If a feature has a variance that is orders of magnitude larger
    than others, it might dominate the objective function and make the
    estimator unable to learn from other features correctly as expected.

    `StandardScaler` is sensitive to outliers, and the features may scale
    differently from each other in the presence of outliers. For an example
    visualization, refer to :ref:`Compare StandardScaler with other scalers
    <plot_all_scaling_standard_scaler_section>`.

    This scaler can also be applied to sparse CSR or CSC matrices by passing
    `with_mean=False` to avoid breaking the sparsity structure of the data.

    Read more in the :ref:`User Guide <preprocessing_scaler>`.

    Parameters
    ----------
    copy : bool, default=True
        If False, try to avoid a copy and do inplace scaling instead.
        This is not guaranteed to always work inplace; e.g. if the data is
        not a NumPy array or scipy.sparse CSR matrix, a copy may still be
        returned.

    with_mean : bool, default=True
        If True, center the data before scaling.
        This does not work (and will raise an exception) when attempted on
        sparse matrices, because centering them entails building a dense
        matrix which in common use cases is likely to be too large to fit in
        memory.
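The docstring's notes on sparse input and on `with_mean` can be illustrated with a small CSR matrix. This is a hedged sketch rather than library code: the matrix values and variable names are made up for illustration, and the `except ValueError` clause is an assumption, since the docstring only states that an exception is raised when centering sparse data.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import StandardScaler

# Hypothetical sparse input: most entries are zero.
X_dense = np.array([[0.0, 2.0],
                    [0.0, 0.0],
                    [3.0, 0.0],
                    [0.0, 4.0]])
X_sparse = sparse.csr_matrix(X_dense)

# with_mean=False scales each column without centering, so zeros stay zero
# and the sparsity structure is preserved.
scaler = StandardScaler(with_mean=False)
Z_sparse = scaler.fit_transform(X_sparse)

# Reference result: divide each column by its population standard deviation.
expected = X_dense / X_dense.std(axis=0)

# Centering a sparse matrix is refused. ValueError is assumed here;
# the docstring only says that an exception is raised.
try:
    StandardScaler(with_mean=True).fit(X_sparse)
    centering_raised = False
except ValueError:
    centering_raised = True
```

Skipping the centering step is what keeps the output sparse: subtracting a nonzero mean would turn every stored zero into a nonzero value.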

Callers 15

_synth_regression_dataset · Function · 0.90
_synth_classification_dataset · Function · 0.90
_cov · Function · 0.90
test_tuned_threshold_classifier_without_constraint_value · Function · 0.90
test_tuned_threshold_classifier_metric_with_parameter · Function · 0.90
test_tuned_threshold_classifier_with_string_targets · Function · 0.90
test_tuned_threshold_classifier_cv_zeros_sample_weights_equivalence · Function · 0.90
test_search_html_repr · Function · 0.90
test_kde_pipeline_gridsearch · Function · 0.90
_regression_dataset · Function · 0.90
check_transformer_general · Function · 0.90
check_transformer_data_not_an_array · Function · 0.90

Calls

no outgoing calls

Tested by 15

test_tuned_threshold_classifier_without_constraint_value · Function · 0.72
test_tuned_threshold_classifier_metric_with_parameter · Function · 0.72
test_tuned_threshold_classifier_with_string_targets · Function · 0.72
test_tuned_threshold_classifier_cv_zeros_sample_weights_equivalence · Function · 0.72
test_search_html_repr · Function · 0.72
test_kde_pipeline_gridsearch · Function · 0.72
test_features_html_with_pipeline · Function · 0.72
test_meta_estimator_output_features · Function · 0.72
test_show_arrow_pipeline · Function · 0.72
test_estimator_with_set_output · Function · 0.72
test__container_error_validation · Function · 0.72
test_pipeline_support · Function · 0.72
