
Function compression_matrix

dask/array/linalg.py:655–743

Randomly sample matrix to find most active subspace.

(
    data,
    q,
    iterator="power",
    n_power_iter=0,
    n_oversamples=10,
    seed=None,
    compute=False,
)
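The `q` and `n_oversamples` arguments together determine how many random sample columns are drawn. As a rough sketch, assuming the sampled subspace has size `q + n_oversamples` capped at the smaller matrix dimension (the helper `sampled_subspace_size` is hypothetical and is not dask's `compression_level`, which may apply further bounds):

```python
def sampled_subspace_size(shape, q, n_oversamples=10):
    """Hypothetical helper: estimate the number of sampled columns.

    Assumption: size = q + n_oversamples, never exceeding min(m, n).
    """
    m, n = shape
    return min(q + n_oversamples, m, n)


# A tall matrix with plenty of columns: oversampling adds 10 to q.
print(sampled_subspace_size((10_000, 500), q=20))  # 30

# A narrow matrix: the size is capped by the number of columns.
print(sampled_subspace_size((10_000, 25), q=20))   # 25
```

Under this assumption, raising `n_oversamples` only matters while `q + n_oversamples` stays below the smaller dimension of `data`.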

Source

def compression_matrix(
    data,
    q,
    iterator="power",
    n_power_iter=0,
    n_oversamples=10,
    seed=None,
    compute=False,
):
    """Randomly sample matrix to find most active subspace

    The compression matrix returned by this algorithm can be used to
    compute both the QR decomposition and the Singular Value
    Decomposition.

    Parameters
    ----------
    data: Array
    q: int
        Size of the desired subspace (the actual size will be bigger,
        because of oversampling, see ``da.linalg.compression_level``)
    iterator: {'power', 'QR'}, default='power'
        Define the technique used for iterations to cope with flat
        singular spectra or when the input matrix is very large.
    n_power_iter: int
        Number of power iterations, useful when the singular values
        decay slowly. Error decreases exponentially as `n_power_iter`
        increases. In practice, set `n_power_iter` <= 4.
    n_oversamples: int, default=10
        Number of oversamples used for generating the sampling matrix.
        This value increases the size of the subspace computed, which is more
        accurate at the cost of efficiency. Results are rarely sensitive to this choice
        though and in practice a value of 10 is very commonly high enough.
    compute : bool
        Whether or not to compute data at each use.
        Recomputing the input while performing several passes reduces memory
        pressure, but means that we have to compute the input multiple times.
        This is a good choice if the data is larger than memory and cheap to
        recreate.

    References
    ----------
    N. Halko, P. G. Martinsson, and J. A. Tropp.
    Finding structure with randomness: Probabilistic algorithms for
    constructing approximate matrix decompositions.
    SIAM Rev., Survey and Review section, Vol. 53, num. 2,
    pp. 217-288, June 2011
    https://arxiv.org/abs/0909.4061
    """
    if iterator not in ["power", "QR"]:
        raise ValueError(
            f"Iterator '{iterator}' not valid, must be one of ['power', 'QR']"
        )
    m, n = data.shape
    comp_level = compression_level(min(m, n), q, n_oversamples=n_oversamples)
    if isinstance(seed, RandomState):
        state = seed
    else:
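For orientation, the randomized range finder described in the Halko, Martinsson, and Tropp reference can be sketched in plain NumPy. This is an illustration under stated assumptions, not dask's implementation: the name `range_finder_sketch` is hypothetical, it works on real-valued in-memory arrays, it uses `np.linalg.qr` where the call list names `tsqr`, it assumes a subspace size of `q + n_oversamples` capped at the matrix dimensions, and it covers only the plain power-iteration variant.

```python
import numpy as np


def range_finder_sketch(data, q, n_power_iter=0, n_oversamples=10, seed=None):
    """Hypothetical NumPy sketch of a randomized range finder.

    Returns a (k, m) matrix with orthonormal rows, where k is the
    assumed subspace size min(q + n_oversamples, m, n).
    """
    m, n = data.shape
    k = min(q + n_oversamples, m, n)

    # Draw a Gaussian test matrix and sample the column space of `data`.
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal(size=(n, k))
    sample = data @ omega  # shape (m, k)

    # Optional power iterations sharpen the spectrum when singular
    # values decay slowly (real-valued input assumed, so .T suffices).
    for _ in range(n_power_iter):
        sample = data @ (data.T @ sample)

    # Orthonormalize the sampled columns; the transpose acts as the
    # compression matrix in this sketch.
    basis, _ = np.linalg.qr(sample)
    return basis.T


# Usage: an exactly rank-5 matrix is captured by a 15-row compression.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
comp = range_finder_sketch(low_rank, q=5, seed=0)
approx = comp.T @ (comp @ low_rank)  # project onto the sampled subspace
print(comp.shape)  # (15, 200)
print(np.abs(approx - low_rank).max())
```

Projecting the input onto the rows of the returned matrix should reproduce a low-rank input closely, which is why the same matrix can seed an approximate QR decomposition or SVD.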

Callers 1

svd_compressed · Function · 0.85

Calls 9

default_rng · Function · 0.90
wait · Function · 0.90
compression_level · Function · 0.85
min · Function · 0.85
tsqr · Function · 0.85
astype · Method · 0.45
standard_normal · Method · 0.45
dot · Method · 0.45
persist · Method · 0.45

Tested by

no test coverage detected
