Method partial_fit

sklearn/multiclass.py:402–480

Partially fit the underlying estimators. Use this method when the full training set does not fit in memory: chunks of data can be passed in over several calls.

(self, X, y, classes=None, **partial_fit_params)
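A minimal usage sketch of the signature above, assuming `OneVsRestClassifier` wraps an estimator that itself implements `partial_fit` (here `SGDClassifier`). The synthetic dataset and the chunk size of 100 are illustrative choices, not taken from the source. `classes` is passed on the first call only, as the docstring describes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier

# Synthetic 3-class problem (illustrative data, not from the source).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, n_classes=3, random_state=0
)

# All labels that will ever appear across the chunks.
all_classes = np.unique(y)

ovr = OneVsRestClassifier(SGDClassifier(random_state=0))

chunk_size = 100  # hypothetical chunk size
for start in range(0, X.shape[0], chunk_size):
    stop = start + chunk_size
    if start == 0:
        # First call: declare the complete label set.
        ovr.partial_fit(X[start:stop], y[start:stop], classes=all_classes)
    else:
        # Later calls: classes may be omitted.
        ovr.partial_fit(X[start:stop], y[start:stop])

n_estimators = len(ovr.estimators_)  # one binary sub-estimator per class
predictions = ovr.predict(X[:5])
```

Each chunk may contain only a subset of the labels, provided every label was declared in `classes` on the first call.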

Source from the content-addressed store, hash-verified

400	        prefer_skip_nested_validation=False
401	    )
402	    def partial_fit(self, X, y, classes=None, **partial_fit_params):
403	        """Partially fit underlying estimators.
404
405	        Should be used when memory is inefficient to train all data.
406	        Chunks of data can be passed in several iterations.
407
408	        Parameters
409	        ----------
410	        X : {array-like, sparse matrix} of shape (n_samples, n_features)
411	            Data.
412
413	        y : {array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_classes)
414	            Multi-class targets. An indicator matrix turns on multilabel
415	            classification.
416
417	        classes : array, shape (n_classes, )
418	            Classes across all calls to partial_fit.
419	            Can be obtained via `np.unique(y_all)`, where y_all is the
420	            target vector of the entire dataset.
421	            This argument is only required in the first call of partial_fit
422	            and can be omitted in the subsequent calls.
423
424	        **partial_fit_params : dict
425	            Parameters passed to the ``estimator.partial_fit`` method of each
426	            sub-estimator.
427
428	            .. versionadded:: 1.4
429	                Only available if `enable_metadata_routing=True`. See
430	                :ref:`Metadata Routing User Guide <metadata_routing>` for more
431	                details.
432
433	        Returns
434	        -------
435	        self : object
436	            Instance of partially fitted estimator.
437	        """
438	        _raise_for_params(partial_fit_params, self, "partial_fit")
439
440	        routed_params = process_routing(
441	            self,
442	            "partial_fit",
443	            **partial_fit_params,
444	        )
445
446	        if _check_partial_fit_first_call(self, classes):
447	            self.estimators_ = [clone(self.estimator) for _ in range(self.n_classes_)]
448
449	            # A sparse LabelBinarizer, with sparse_output=True, has been
450	            # shown to outperform or match a dense label binarizer in all
451	            # cases and has also resulted in less or equal memory consumption
452	            # in the fit_ovr function overall.
453	            self.label_binarizer_ = LabelBinarizer(sparse_output=True)
454	            self.label_binarizer_.fit(self.classes_)
455
456	        if len(np.setdiff1d(y, self.classes_)):
457	            raise ValueError(
458	                (
459	                    "Mini-batch contains {0} while classes " + "must be subset of {1}"
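The listing is cut off at line 459. The two steps it does show, fitting a sparse `LabelBinarizer` on the declared classes and rejecting mini-batches that contain undeclared labels, can be sketched in isolation. This is a standalone illustration, not the library's code: `check_batch_labels` is a hypothetical helper and its error message is written for this example rather than copied from the truncated source.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer


def check_batch_labels(y_batch, known_classes):
    """Hypothetical helper: raise if the batch has labels outside known_classes."""
    unseen = np.setdiff1d(y_batch, known_classes)
    if len(unseen):
        raise ValueError(
            "Batch labels {0} are not among the declared classes {1}".format(
                unseen, known_classes
            )
        )


known_classes = np.array([0, 1, 2])

# Fit the binarizer on the declared classes once, as on the first call.
binarizer = LabelBinarizer(sparse_output=True)
binarizer.fit(known_classes)

# A valid mini-batch: every label was declared up front.
y_batch = np.array([0, 2, 2, 1])
check_batch_labels(y_batch, known_classes)

# One indicator column per class, stored as a SciPy sparse matrix.
Y = binarizer.transform(y_batch)
dense = Y.toarray()
```

Each column of `Y` is a binary target for one class, which is the form a one-vs-rest scheme needs for training one binary sub-estimator per class.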

Callers 3

test_ovr_partial_fit · Function · 0.95

test_ovr_partial_fit_exceptions · Function · 0.95

test_multiclass_estimator_attribute_error · Function · 0.95

Calls 10

_check_partial_fit_first_call · Function · 0.90

clone · Function · 0.90

LabelBinarizer · Class · 0.90

Parallel · Class · 0.90

delayed · Function · 0.90

_raise_for_params · Function · 0.85

process_routing · Function · 0.85

format · Method · 0.80

fit · Method · 0.45

transform · Method · 0.45

Tested by 3

test_ovr_partial_fit · Function · 0.76

test_ovr_partial_fit_exceptions · Function · 0.76

test_multiclass_estimator_attribute_error · Function · 0.76