hub / github.com/scikit-learn/scikit-learn / resample

Function resample

sklearn/utils/_indexing.py:428–612

Resample arrays or sparse matrices in a consistent way. The default strategy implements one step of the bootstrapping procedure.
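
A minimal sketch of one bootstrap step, assuming the public import path `sklearn.utils.resample`. The toy arrays `X` and `y` are invented for illustration and do not come from the library.

```python
import numpy as np
from sklearn.utils import resample

# Toy data (hypothetical): 5 samples, 2 features, one label per sample.
X = np.arange(10).reshape(5, 2)
y = np.array([0, 0, 1, 1, 1])

# Default call: replace=True and n_samples=None, so 5 rows are drawn
# with replacement. The same row indices are applied to X and y.
X_boot, y_boot = resample(X, y, random_state=0)

print(X_boot)
print(y_boot)
```

Because rows are drawn with replacement, some rows may repeat and others may be absent, but each row of `X_boot` keeps its original label in `y_boot`.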

(
    *arrays,
    replace=True,
    n_samples=None,
    random_state=None,
    stratify=None,
    sample_weight=None,
)
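
The keyword arguments in this signature can be combined to draw a subsample without replacement. The following is a rough sketch, assuming a hypothetical 1-D array `data`; it is meant to illustrate `replace=False` with `n_samples`, not to document every code path.

```python
import numpy as np
from sklearn.utils import resample

# Hypothetical input: ten distinct values.
data = np.arange(10)

# replace=False draws a slice of a random permutation, so no value repeats.
# n_samples must not exceed len(data) in this mode.
subset = resample(data, replace=False, n_samples=4, random_state=42)

print(subset)
```

Passing an integer `random_state` makes the draw reproducible across calls.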

Source from the content-addressed store, hash-verified

    prefer_skip_nested_validation=True,
)
def resample(
    *arrays,
    replace=True,
    n_samples=None,
    random_state=None,
    stratify=None,
    sample_weight=None,
):
    """Resample arrays or sparse matrices in a consistent way.

    The default strategy implements one step of the bootstrapping
    procedure.

    Parameters
    ----------
    *arrays : sequence of array-like of shape (n_samples,) or \
            (n_samples, n_outputs)
        Indexable data-structures can be arrays, lists, dataframes or scipy
        sparse matrices with consistent first dimension.

    replace : bool, default=True
        Implements resampling with replacement. It must be set to True
        whenever sampling with non-uniform weights: a few data points with very large
        weights are expected to be sampled several times with probability to preserve
        the distribution induced by the weights. If False, this will implement
        (sliced) random permutations.

    n_samples : int, default=None
        Number of samples to generate. If left to None this is
        automatically set to the first dimension of the arrays.
        If replace is False it should not be larger than the length of
        arrays.

    random_state : int, RandomState instance or None, default=None
        Determines random number generation for shuffling
        the data.
        Pass an int for reproducible results across multiple function calls.
        See :term:`Glossary <random_state>`.

    stratify : {array-like, sparse matrix} of shape (n_samples,) or \
            (n_samples, n_outputs), default=None
        If not None, data is split in a stratified fashion, using this as
        the class labels.

    sample_weight : array-like of shape (n_samples,), default=None
        Contains weight values to be associated with each sample. Values are
        normalized to sum to one and interpreted as probability for sampling
        each data point.

        .. versionadded:: 1.7

    Returns
    -------
    resampled_arrays : sequence of array-like of shape (n_samples,) or \
            (n_samples, n_outputs)
        Sequence of resampled copies of the collections. The original arrays
        are not impacted.

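
The `stratify` and `sample_weight` parameters described in the docstring can be illustrated with a short sketch. The first part calls `sklearn.utils.resample` with `stratify` on invented toy data. The second part is a plain NumPy illustration of what "normalized to sum to one and interpreted as probability" means. It is an assumption-laden stand-in, not the library's implementation, and it avoids passing `sample_weight` directly because that argument requires version 1.7 or later.

```python
import numpy as np
from sklearn.utils import resample

# Hypothetical imbalanced labels: eight samples of class 0, two of class 1.
y = np.array([0] * 8 + [1] * 2)
X = np.arange(10).reshape(-1, 1)

# Stratified draw: class proportions of `y` are preserved in the output.
X_s, y_s = resample(
    X, y, n_samples=5, replace=False, stratify=y, random_state=0
)
print(np.bincount(y_s))

# Illustration only (not sklearn code): weights normalized to probabilities,
# then used to draw row indices with replacement.
rng = np.random.default_rng(0)
weights = np.array([1.0, 1.0, 8.0])
probabilities = weights / weights.sum()
drawn = rng.choice(len(weights), size=1000, replace=True, p=probabilities)
freq = np.bincount(drawn, minlength=len(weights)) / drawn.size
print(freq)
```

In the weighted part, the heavily weighted index is drawn many times, which is why the docstring requires `replace=True` whenever the weights are non-uniform.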

Callers 12

split · Method · 0.90
test_resample · Function · 0.90
test_resample_weighted · Function · 0.90
test_resample_stratified · Function · 0.90
test_resample_stratified_replace · Function · 0.90
test_resample_stratify_2dy · Function · 0.90
test_notimplementederror · Function · 0.90
test_resample_stratify_sparse_error · Function · 0.90
fit · Method · 0.90
_dense_fit · Method · 0.90
_get_small_trainset · Method · 0.90
shuffle · Function · 0.85

Calls 7

check_random_state · Function · 0.90
check_consistent_length · Function · 0.90
_check_sample_weight · Function · 0.90
check_array · Function · 0.90
_approximate_mode · Function · 0.90
_safe_indexing · Function · 0.85
split · Method · 0.45

Tested by 7

test_resample · Function · 0.72
test_resample_weighted · Function · 0.72
test_resample_stratified · Function · 0.72
test_resample_stratified_replace · Function · 0.72
test_resample_stratify_2dy · Function · 0.72
test_notimplementederror · Function · 0.72
test_resample_stratify_sparse_error · Function · 0.72
