hub / github.com/scikit-learn/scikit-learn / check_X_y

Function check_X_y

sklearn/utils/validation.py:1183–1348

Input validation for standard estimators. Checks X and y for consistent length, enforces X to be 2D and y 1D. By default, X is checked to be non-empty and containing only finite values. Standard input checks are also applied to y, such as checking that y does not have np.nan or np.inf targets.

(
    X,
    y,
    accept_sparse=False,
    *,
    accept_large_sparse=True,
    dtype="numeric",
    order=None,
    copy=False,
    force_writeable=False,
    ensure_all_finite=True,
    ensure_2d=True,
    allow_nd=False,
    multi_output=False,
    ensure_min_samples=1,
    ensure_min_features=1,
    y_numeric=False,
    estimator=None,
)
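A minimal usage sketch of the signature above, assuming scikit-learn and NumPy are installed. The helper `validation_error` and all variable names are illustrative and not part of scikit-learn; only default-valued and long-standing keyword arguments are used, so the snippet should not depend on a specific release.

```python
import numpy as np
from sklearn.utils.validation import check_X_y

X = [[1, 2], [3, 4], [5, 6]]
y = [0, 1, 0]

# Valid input: lists are converted to a 2D array for X and a 1D array for y.
X_checked, y_checked = check_X_y(X, y)


def validation_error(X, y, **kwargs):
    """Illustrative helper: return the ValueError message, or None if validation passes."""
    try:
        check_X_y(X, y, **kwargs)
    except ValueError as exc:
        return str(exc)
    return None


# X has 3 samples but y has 2: the consistent-length check fails.
length_error = validation_error(X, [0, 1])

# NaN in X is rejected by default.
nan_error = validation_error([[1.0, np.nan], [3.0, 4.0], [5.0, 6.0]], y)

# A 2D y is rejected unless multi_output=True is passed.
y_2d = [[0, 1], [1, 0], [0, 1]]
y_2d_error = validation_error(X, y_2d)
_, y_multi = check_X_y(X, y_2d, multi_output=True)
```

Each failing call raises `ValueError`, so a caller can treat all of these conditions uniformly.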

Source from the content-addressed store, hash-verified

def check_X_y(
    X,
    y,
    accept_sparse=False,
    *,
    accept_large_sparse=True,
    dtype="numeric",
    order=None,
    copy=False,
    force_writeable=False,
    ensure_all_finite=True,
    ensure_2d=True,
    allow_nd=False,
    multi_output=False,
    ensure_min_samples=1,
    ensure_min_features=1,
    y_numeric=False,
    estimator=None,
):
    """Input validation for standard estimators.

    Checks X and y for consistent length, enforces X to be 2D and y 1D. By
    default, X is checked to be non-empty and containing only finite values.
    Standard input checks are also applied to y, such as checking that y
    does not have np.nan or np.inf targets. For multi-label y, set
    multi_output=True to allow 2D and sparse y. If the dtype of X is
    object, attempt converting to float, raising on failure.

    Parameters
    ----------
    X : {ndarray, list, sparse matrix}
        Input data.

    y : {ndarray, list, sparse matrix}
        Labels.

    accept_sparse : str, bool or list of str, default=False
        String[s] representing allowed sparse matrix formats, such as 'csc',
        'csr', etc. If the input is sparse but not in the allowed format,
        it will be converted to the first listed format. True allows the input
        to be any format. False means that a sparse matrix input will
        raise an error.

    accept_large_sparse : bool, default=True
        If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by
        accept_sparse, accept_large_sparse will cause it to be accepted only
        if its indices are stored with a 32-bit dtype.

        .. versionadded:: 0.20

    dtype : 'numeric', type, list of type or None, default='numeric'
        Data type of result. If None, the dtype of the input is preserved.
        If "numeric", dtype is preserved unless array.dtype is object.
        If dtype is a list of types, conversion on the first type is only
        performed if the dtype of the input is not in the list.

    order : {'F', 'C'}, default=None
        Whether an array will be forced to be fortran or c-style. If
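The `accept_sparse` and `dtype` descriptions in the docstring above can be exercised with a short sketch. It assumes SciPy is available alongside scikit-learn; the variable names are illustrative, and the behavior shown is what the docstring describes rather than a guarantee for every release.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.utils.validation import check_X_y

y = [0, 1, 0]

# accept_sparse as a list: COO is not among the allowed formats,
# so the input is converted to the first listed format (CSR).
X_coo = sp.coo_matrix(np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]))
X_csr, _ = check_X_y(X_coo, y, accept_sparse=["csr", "csc"])
sparse_format = X_csr.format

# accept_sparse=False (the default): sparse input raises an error.
try:
    check_X_y(X_coo, y)
    sparse_rejected = False
except TypeError:
    sparse_rejected = True

# dtype as a list: conversion to the first type happens only when the
# input dtype is not already in the list.
allowed = [np.float64, np.float32]
X_int = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int64)
X_f32 = X_int.astype(np.float32)
converted, _ = check_X_y(X_int, y, dtype=allowed)   # int64 is not listed
preserved, _ = check_X_y(X_f32, y, dtype=allowed)   # float32 is listed
```

Passing a dtype list in this way lets an estimator accept both float precisions without forcing a copy of data that is already in an acceptable dtype.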

Callers 15

fit · Method · 0.90

test_check_array_min_samples_and_features_messages · Function · 0.90

test_check_X_y_informative_error · Function · 0.90

_estimate_mi · Function · 0.90

f_classif · Function · 0.90

r_regression · Function · 0.90

silhouette_score · Function · 0.90

silhouette_samples · Function · 0.90
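Since a `fit` method appears among the callers, here is a hedged sketch of how an estimator might call `check_X_y` at the start of `fit`. The class `MeanRegressor` is hypothetical and written only for illustration; it is not one of the callers listed above.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class MeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical estimator that always predicts the training mean of y."""

    def fit(self, X, y):
        # Validate both inputs together before touching the data.
        X, y = check_X_y(X, y, y_numeric=True)
        self.mean_ = float(np.mean(y))
        self.n_features_in_ = X.shape[1]
        return self

    def predict(self, X):
        check_is_fitted(self)
        X = check_array(X)
        return np.full(X.shape[0], self.mean_)


model = MeanRegressor().fit([[1], [2], [3]], [1.0, 2.0, 3.0])
predictions = model.predict([[5], [6]])
```

Validating in `fit` means malformed input fails early with a `ValueError` instead of surfacing later as an obscure NumPy error.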

Calls 4

_check_estimator_name · Function · 0.85

check_array · Function · 0.85

_check_y · Function · 0.85

check_consistent_length · Function · 0.85
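The callees listed above suggest a rough shape for the validation: check X, check y, then compare lengths. The sketch below approximates that flow using only public helpers (`check_array`, `column_or_1d`, `check_consistent_length`). It is an assumption-laden illustration, not the library's implementation: the private `_check_y` and `_check_estimator_name` are not called, the function name `check_X_y_sketch` is hypothetical, and most keyword arguments are omitted.

```python
import numpy as np
from sklearn.utils import check_array, check_consistent_length, column_or_1d


def check_X_y_sketch(X, y, multi_output=False):
    """Illustrative approximation of the X/y validation flow."""
    # X: 2D, numeric, non-empty, finite (check_array defaults).
    X = check_array(X)

    if multi_output:
        # Allow 2D (and CSR sparse) targets, still rejecting NaN/inf.
        y = check_array(y, accept_sparse="csr", ensure_2d=False, dtype=None)
    else:
        # Enforce a 1D target, then reuse check_array for the finiteness check.
        y = column_or_1d(y, warn=True)
        y = check_array(y, ensure_2d=False, dtype=None)

    # Both inputs must have the same number of samples.
    check_consistent_length(X, y)
    return X, y


X_ok, y_ok = check_X_y_sketch([[1, 2], [3, 4]], [0, 1])
```

Checking lengths last keeps the per-array error messages (wrong dimensionality, non-finite values) ahead of the cross-array one.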

Tested by 7

fit · Method · 0.72

test_check_array_min_samples_and_features_messages · Function · 0.72

test_check_X_y_informative_error · Function · 0.72
