hub / github.com/dask/dask / _custom_nanquantile

Function _custom_nanquantile

dask/array/reductions.py:1660–1742

(
    a,
    q,
    axis=None,
    method="linear",
    keepdims=False,
    weights=None,
    **kwargs,
)
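
In the fallback branch of the source listing, `weights` with fewer dimensions than `a` are expanded and broadcast to `a.shape` before being handed to `np.nanquantile`. The following is a minimal sketch of that reshaping step only. The array contents and the name `full_weights` are illustrative assumptions, not part of dask.

```python
import numpy as np

# Hypothetical inputs: a 3-D array reduced over its last axis,
# with one weight per element along that axis.
a = np.arange(24, dtype=float).reshape(2, 3, 4)
axis = (2,)
weights = np.array([1.0, 2.0, 3.0, 4.0])

# Insert a length-1 dimension for every axis that is NOT reduced,
# then broadcast the result to the full shape of `a`.
expand_at = tuple(i for i in range(a.ndim) if i not in axis)
full_weights = np.broadcast_to(np.expand_dims(weights, axis=expand_at), a.shape)

print(full_weights.shape)  # same shape as `a`
```

`np.broadcast_to` returns a read-only view, so no copy of the weights is made for each row.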

Source from the content-addressed store, hash-verified

def _custom_nanquantile(
    a,
    q,
    axis=None,
    method="linear",
    keepdims=False,
    weights=None,
    **kwargs,
):
    if (
        method != "linear"
        or len(axis) != 1
        or axis[0] != len(a.shape) - 1
        or len(a.shape) == 1
        or a.shape[-1] > 1000
        or q.ndim > 1
        or weights is not None
    ):
        # bail to nanquantile. Assumptions are pretty strict for now but we
        # do cover the xarray.quantile case.
        if weights is not None and weights.ndim != a.ndim:
            # np.nanquantile requires weights to have the same shape as a,
            # unlike np.quantile which broadcasts 1-D weights automatically.
            expand_at = [i for i in range(a.ndim) if i not in axis]
            weights = np.broadcast_to(np.expand_dims(weights, axis=expand_at), a.shape)
            # NumPy <2.0 doesn't support the weights parameter
            kwargs["weights"] = weights

        return np.nanquantile(
            a,
            q,
            axis=axis,
            method=method,
            keepdims=keepdims,
            **kwargs,
        )
    # nanquantile in NumPy is pretty slow if the quantile axis is slow because
    # each quantile has overhead.
    # This method works around this by calculating the quantile manually.
    # Steps:
    # 1. Sort the array along the quantile axis (this is the most expensive step)
    # 2. Calculate which positions are the quantile positions
    #    (respecting NaN values, so each quantile can have a different indexer)
    # 3. Get the neighboring values of the quantile positions
    # 4. Perform linear interpolation between the neighboring values
    #
    # The main advantage is that we get rid of the overhead, removing GIL blockage
    # and just generally making things faster.

    sorted_arr = np.sort(a, axis=-1)
    indexers = _span_indexers(a)
    nr_quantiles = len(indexers[0])

    is_scalar = q.ndim == 0
    if is_scalar:
        q = q.reshape((1,))

    quantiles = []
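
The four steps named in the source comments (sort, locate the quantile positions per row, gather the neighbouring values, interpolate linearly) can be sketched in plain NumPy. This is an illustrative sketch under stated assumptions, not dask's implementation: the function name `nanquantile_last_axis` is hypothetical, it uses `np.take_along_axis` in place of the `_span_indexers` helper (whose body is not shown here), and it always reduces over the last axis.

```python
import numpy as np


def nanquantile_last_axis(a, q):
    """Linear-interpolation quantiles along the last axis, ignoring NaNs.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a, dtype=float)
    q = np.atleast_1d(np.asarray(q, dtype=float))

    # 1. Sort along the quantile axis; np.sort places NaNs at the end.
    sorted_arr = np.sort(a, axis=-1)

    # 2. The quantile position depends on how many valid values each row has.
    n_valid = np.sum(~np.isnan(a), axis=-1, keepdims=True)  # shape (..., 1)
    last_valid = np.maximum(n_valid - 1, 0)  # guards all-NaN rows

    results = []
    for single_q in q:
        pos = single_q * last_valid  # fractional index per row
        lower = np.floor(pos).astype(np.intp)
        upper = np.minimum(lower + 1, last_valid)
        frac = pos - lower

        # 3. Gather the two neighbouring values for every row.
        lo_val = np.take_along_axis(sorted_arr, lower, axis=-1)
        hi_val = np.take_along_axis(sorted_arr, upper, axis=-1)

        # 4. Interpolate linearly between the neighbours.
        results.append((lo_val + frac * (hi_val - lo_val))[..., 0])

    # Quantile axis first, matching np.nanquantile with a 1-D q.
    return np.stack(results, axis=0)


rng = np.random.default_rng(0)
data = rng.normal(size=(4, 5, 7))
data[rng.random(data.shape) < 0.3] = np.nan
qs = np.array([0.1, 0.5, 0.9])
result = nanquantile_last_axis(data, qs)
```

For a 1-D `q`, the result should agree with `np.nanquantile(data, qs, axis=-1)` up to floating-point rounding. The sort is done once for all quantiles, which is where the saving over a per-quantile call comes from.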

Callers

nothing calls this directly

Calls 7

_span_indexers · Function · 0.85

reshape · Method · 0.80

ones · Method · 0.45

sum · Method · 0.45

astype · Method · 0.45

where · Method · 0.45

squeeze · Method · 0.45

Tested by

no test coverage detected
