
Method split

sklearn/model_selection/_split.py:849–889

Generate indices to split data into training and test set.

(self, X, y, groups=None)

Source

            yield test_folds == i

    def split(self, X, y, groups=None):
        """Generate indices to split data into training and test set.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training data, where `n_samples` is the number of samples
            and `n_features` is the number of features.

            Note that providing ``y`` is sufficient to generate the splits and
            hence ``np.zeros(n_samples)`` may be used as a placeholder for
            ``X`` instead of actual training data.

        y : array-like of shape (n_samples,)
            The target variable for supervised learning problems.
            Stratification is done based on the y labels.

        groups : array-like of shape (n_samples,), default=None
            Always ignored, exists for API compatibility.

        Yields
        ------
        train : ndarray
            The training set indices for that split.

        test : ndarray
            The testing set indices for that split.

        Notes
        -----
        Randomized CV splitters may return different results for each call of
        split. You can make the results identical by setting `random_state`
        to an integer.
        """
        if groups is not None:
            warnings.warn(
                f"The groups parameter is ignored by {self.__class__.__name__}",
                UserWarning,
            )
        y = check_array(y, input_name="y", ensure_2d=False, dtype=None)
        return super().split(X, y, groups)


class StratifiedGroupKFold(GroupsConsumerMixin, _BaseKFold):
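
The behaviour described in the docstring can be sketched with a small usage example. This assumes the method shown is `StratifiedKFold.split`; the listing does not show the enclosing class, so that is an inference from the surrounding code and the caller names. The toy labels and variable names are illustrative only. The example uses a placeholder for `X`, checks that the class ratio is preserved in each test fold, and checks that an integer `random_state` makes repeated calls identical.

```python
import warnings

import numpy as np
from sklearn.model_selection import StratifiedKFold  # assumed owner of split()

# Illustrative labels: six samples of class 0 and three of class 1.
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1])

# Per the docstring, only y drives the split, so zeros can stand in for X.
X_placeholder = np.zeros(len(y))

skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# split() yields (train_indices, test_indices) pairs, one per fold.
folds = list(skf.split(X_placeholder, y))
for train_idx, test_idx in folds:
    # Class counts in the test fold mirror the 2:1 ratio of the full data.
    print(test_idx, np.bincount(y[test_idx], minlength=2))

# With an integer random_state, a second call reproduces the same folds.
repeat = list(skf.split(X_placeholder, y))

# groups is accepted but ignored. The listing above emits a UserWarning in
# that case; other releases may stay silent, so the warning is suppressed
# here rather than relied on.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    with_groups = list(skf.split(X_placeholder, y, groups=np.arange(len(y))))
```

Passing `groups` does not change the folds, so code that needs group-aware splitting would have to use a different splitter rather than this method.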

Callers 6

test_cross_val_predict_unbalanced · Function · 0.95

test_grid_search_correct_score_results · Function · 0.95

test_kfold_valueerrors · Function · 0.95

test_shuffle_stratifiedkfold · Function · 0.95

test_encoding_multiclass · Function · 0.95

test_LogisticRegressionCV_on_folds · Function · 0.95

Calls 2

check_array · Function · 0.90

split · Method · 0.45

Tested by 6

test_cross_val_predict_unbalanced · Function · 0.76

test_grid_search_correct_score_results · Function · 0.76

test_kfold_valueerrors · Function · 0.76

test_shuffle_stratifiedkfold · Function · 0.76

test_encoding_multiclass · Function · 0.76

test_LogisticRegressionCV_on_folds · Function · 0.76