hub / github.com/scikit-learn/scikit-learn / _is_constant_feature

Function _is_constant_feature

sklearn/preprocessing/_data.py:83–98

Detect if a feature is indistinguishable from a constant feature. The detection is based on its computed variance and on the theoretical error bounds of the '2 pass algorithm' for variance computation. See "Algorithms for computing the sample variance: analysis and recommendations", by Chan, Golub, and LeVeque.

(var, mean, n_samples)

Source from the content-addressed store, hash-verified

def _is_constant_feature(var, mean, n_samples):
    """Detect if a feature is indistinguishable from a constant feature.

    The detection is based on its computed variance and on the theoretical
    error bounds of the '2 pass algorithm' for variance computation.

    See "Algorithms for computing the sample variance: analysis and
    recommendations", by Chan, Golub, and LeVeque.
    """
    # In scikit-learn, variance is always computed using float64 accumulators.
    xp, _, device_ = get_namespace_and_device(var, mean)
    max_float_dtype = _max_precision_float_dtype(xp=xp, device=device_)
    eps = xp.finfo(max_float_dtype).eps

    upper_bound = n_samples * eps * var + (n_samples * mean * eps) ** 2
    return var <= upper_bound
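
The check above can be illustrated with a minimal standard-library sketch. It is not the scikit-learn implementation: the names `two_pass_variance` and `is_constant_feature_sketch` are hypothetical, the inputs are plain Python floats rather than arrays, and `eps` is assumed to be the float64 machine epsilon (`sys.float_info.epsilon`) instead of being looked up through the array namespace. Only the bound itself is taken from the source.

```python
import math
import sys


def two_pass_variance(values):
    """Return (variance, mean) of a sequence using two passes over the data."""
    n = len(values)
    mean = math.fsum(values) / n  # first pass: mean
    # second pass: mean of squared deviations (population variance)
    var = math.fsum((v - mean) ** 2 for v in values) / n
    return var, mean


def is_constant_feature_sketch(var, mean, n_samples, eps=sys.float_info.epsilon):
    """Hypothetical scalar version of the bound used in the source above."""
    # A variance at or below this bound cannot be told apart from rounding error.
    upper_bound = n_samples * eps * var + (n_samples * mean * eps) ** 2
    return var <= upper_bound


# A feature that repeats the same value is flagged as constant.
constant = [0.1] * 1000
var, mean = two_pass_variance(constant)
constant_flag = is_constant_feature_sketch(var, mean, len(constant))

# A feature with clear spread is not.
varying = [0.0, 1.0, 2.0, 3.0]
var, mean = two_pass_variance(varying)
varying_flag = is_constant_feature_sketch(var, mean, len(varying))

# The same variance and mean can fall on either side of the bound,
# because the bound grows with the number of samples.
few_samples_flag = is_constant_feature_sketch(1e-20, 1e3, 10)
many_samples_flag = is_constant_feature_sketch(1e-20, 1e3, 1_000_000)
```

Under these assumptions `constant_flag` and `many_samples_flag` are `True`, while `varying_flag` and `few_samples_flag` are `False`. The last pair shows why `n_samples` is part of the signature: the rounding error of the variance computation accumulates with the number of samples, so a tiny variance around a large mean is trusted for 10 samples but not for a million.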

Callers 2

partial_fit · Method · 0.85

_fit · Method · 0.85

Calls 2

get_namespace_and_device · Function · 0.90

_max_precision_float_dtype · Function · 0.90

Tested by

no test coverage detected
