Generate indices to split data into training and test set.
(self, X, y, groups=None)
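
A minimal usage sketch of this `split` method. The enclosing class is not shown in this excerpt, so the example assumes it is scikit-learn's `StratifiedKFold`. The toy labels and the names `X_placeholder` and `folds` are illustrative only. It exercises the docstring's note that `y` alone determines the splits, so an array of zeros can stand in for `X`.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold  # assumed owner of split()

# Twelve samples in two classes with a 2:1 ratio (illustrative data).
y = np.array([0] * 8 + [1] * 4)

# Per the docstring, y is sufficient to generate the splits, so a
# zero-filled placeholder is passed instead of real training data.
X_placeholder = np.zeros(len(y))

# shuffle defaults to False, so repeated calls give the same folds.
skf = StratifiedKFold(n_splits=4)

# split() is a generator of (train_indices, test_indices) pairs.
folds = list(skf.split(X_placeholder, y))

for train_idx, test_idx in folds:
    # Count the labels in each test fold to check the class ratio.
    print(test_idx, np.bincount(y[test_idx]))
```

Because stratification preserves the class proportions of `y`, each of the four test folds here should contain two samples of class 0 and one of class 1. Passing a non-`None` `groups` argument would not change the folds; in the version listed here it only triggers a `UserWarning`.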
            yield test_folds == i

    def split(self, X, y, groups=None):
        """Generate indices to split data into training and test set.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training data, where `n_samples` is the number of samples
            and `n_features` is the number of features.

            Note that providing ``y`` is sufficient to generate the splits and
            hence ``np.zeros(n_samples)`` may be used as a placeholder for
            ``X`` instead of actual training data.

        y : array-like of shape (n_samples,)
            The target variable for supervised learning problems.
            Stratification is done based on the y labels.

        groups : array-like of shape (n_samples,), default=None
            Always ignored, exists for API compatibility.

        Yields
        ------
        train : ndarray
            The training set indices for that split.

        test : ndarray
            The testing set indices for that split.

        Notes
        -----
        Randomized CV splitters may return different results for each call of
        split. You can make the results identical by setting `random_state`
        to an integer.
        """
        if groups is not None:
            warnings.warn(
                f"The groups parameter is ignored by {self.__class__.__name__}",
                UserWarning,
            )
        y = check_array(y, input_name="y", ensure_2d=False, dtype=None)
        return super().split(X, y, groups)


class StratifiedGroupKFold(GroupsConsumerMixin, _BaseKFold):