
Method split

sklearn/model_selection/_split.py:377–412

Generate indices to split data into training and test set.

(self, X, y=None, groups=None)

Source from the content-addressed store, hash-verified

375	        self.random_state = random_state
376
377	    def split(self, X, y=None, groups=None):
378	        """Generate indices to split data into training and test set.
379
380	        Parameters
381	        ----------
382	        X : array-like of shape (n_samples, n_features)
383	            Training data, where `n_samples` is the number of samples
384	            and `n_features` is the number of features.
385
386	        y : array-like of shape (n_samples,), default=None
387	            The target variable for supervised learning problems.
388
389	        groups : array-like of shape (n_samples,), default=None
390	            Group labels for the samples used while splitting the dataset into
391	            train/test set.
392
393	        Yields
394	        ------
395	        train : ndarray
396	            The training set indices for that split.
397
398	        test : ndarray
399	            The testing set indices for that split.
400	        """
401	        X, y, groups = indexable(X, y, groups)
402	        n_samples = _num_samples(X)
403	        if self.n_splits > n_samples:
404	            raise ValueError(
405	                (
406	                    "Cannot have number of splits n_splits={0} greater"
407	                    " than the number of samples: n_samples={1}."
408	                ).format(self.n_splits, n_samples)
409	            )
410
411	        for train, test in super().split(X, y, groups):
412	            yield train, test
413
414	    def get_n_splits(self, X=None, y=None, groups=None):
415	        """Returns the number of splitting iterations as set with the `n_splits` param
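
The behaviour of the listing above can be tried out with a short usage sketch. It assumes that the method shown is the one `KFold` inherits from its K-fold base class in this file, which the page does not state explicitly; the arrays and variable names are made up for illustration. Because `split` contains a `yield`, it is a generator function, so the `n_splits > n_samples` check runs only once iteration starts.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12).reshape(6, 2)  # illustrative data: 6 samples, 2 features

# Without shuffling, KFold cuts the sample indices into contiguous test folds.
folds = [
    (train.tolist(), test.tolist())
    for train, test in KFold(n_splits=3).split(X)
]
for train, test in folds:
    print("train:", train, "test:", test)

# Asking for more splits than samples: calling split() only creates the
# generator; the ValueError surfaces on the first next().
gen = KFold(n_splits=5).split(np.zeros((3, 1)))
try:
    next(gen)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

In practice this means code that merely stores the result of `split(...)` will not see the error until something loops over it.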

Callers 15

cross_validate · Function · 0.45

cross_val_predict · Function · 0.45

_permutation_test_score · Function · 0.45

learning_curve · Function · 0.45

_incremental_fit_estimator · Function · 0.45

validation_curve · Function · 0.45

evaluate_candidates · Method · 0.45

_store · Method · 0.45

split · Method · 0.45
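
Callers such as `cross_validate` consume the `(train, test)` index pairs that `split` yields. The sketch below is a simplified illustration of that pattern, not the actual `cross_validate` implementation: it runs a hand-written loop over `cv.split(X, y)` and compares the per-fold scores with what `cross_validate` returns for the same splitter. The data, the choice of `DummyClassifier`, and the variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import KFold, cross_validate

X = np.arange(12).reshape(6, 2)       # illustrative data
y = np.array([0, 0, 0, 1, 1, 1])
cv = KFold(n_splits=3)                # deterministic: no shuffling

# Hand-written loop: fit on each training fold, score on the matching test fold.
manual_scores = []
for train, test in cv.split(X, y):
    model = DummyClassifier(strategy="most_frequent").fit(X[train], y[train])
    manual_scores.append(model.score(X[test], y[test]))

# The library helper, given the same splitter, returns one test score per fold.
result = cross_validate(DummyClassifier(strategy="most_frequent"), X, y, cv=cv)

print(manual_scores)
print(result["test_score"])
```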

Calls 3

indexable · Function · 0.90

_num_samples · Function · 0.90

format · Method · 0.80

Tested by 2

test_min_grad_norm · Function · 0.36

test_accessible_kl_divergence · Function · 0.36