Function topk

dask/array/reductions.py:1347–1403

Extract the k largest elements from ``a`` on the given axis, and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead, and return them sorted from smallest to largest. This performs best when ``k`` is much smaller than the chunk size. All results will be returned in a single chunk along the given axis.

(a, k, axis=-1, split_every=None)
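The sign convention for ``k`` described above can be stated in a few lines of plain Python. This is only a reference sketch of the documented result for a one-dimensional input, not how dask computes it; the name `topk_reference` is hypothetical and does not exist in dask.

```python
def topk_reference(values, k):
    """Plain-Python statement of the documented result (1-D input only).

    Hypothetical helper for illustration; not part of dask.
    """
    if k >= 0:
        # k largest values, sorted from largest to smallest
        return sorted(values, reverse=True)[:k]
    # -k smallest values, sorted from smallest to largest
    return sorted(values)[:-k]


print(topk_reference([5, 1, 3, 6], 2))   # [6, 5]
print(topk_reference([5, 1, 3, 6], -2))  # [1, 3]
```

The two printed results match the docstring examples for `d.topk(2)` and `d.topk(-2)`.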

Source from the content-addressed store, hash-verified

def topk(a, k, axis=-1, split_every=None):
    """Extract the k largest elements from a on the given axis,
    and return them sorted from largest to smallest.
    If k is negative, extract the -k smallest elements instead,
    and return them sorted from smallest to largest.

    This performs best when ``k`` is much smaller than the chunk size. All
    results will be returned in a single chunk along the given axis.

    Parameters
    ----------
    a: Array
        Data being sorted
    k: int
    axis: int, optional
    split_every: int >=2, optional
        See :func:`reduce`. This parameter becomes very important when k is
        on the same order of magnitude of the chunk size or more, as it
        prevents getting the whole or a significant portion of the input array
        in memory all at once, with a negative impact on network transfer
        too when running on distributed.

    Returns
    -------
    Selection of a with size abs(k) along the given axis.

    Examples
    --------
    >>> import numpy as np
    >>> import dask.array as da
    >>> x = np.array([5, 1, 3, 6])
    >>> d = da.from_array(x, chunks=2)
    >>> d.topk(2).compute()
    array([6, 5])
    >>> d.topk(-2).compute()
    array([1, 3])
    """
    axis = validate_axis(axis, a.ndim)

    # chunk and combine steps of the reduction, which recursively invoke
    # np.partition to pick the top/bottom k elements from the previous step.
    # The selection is not sorted internally.
    chunk_combine = partial(chunk.topk, k=k)
    # aggregate step of the reduction. Internally invokes the chunk/combine
    # function, then sorts the results internally.
    aggregate = partial(chunk.topk_aggregate, k=k)

    return reduction(
        a,
        chunk=chunk_combine,
        combine=chunk_combine,
        aggregate=aggregate,
        axis=axis,
        keepdims=True,
        dtype=a.dtype,
        split_every=split_every,
        output_size=abs(k),
    )
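The comments in the listing describe a three-step reduction: an unsorted `np.partition` selection per chunk, the same selection applied again when partial results are combined, and a final aggregate step that sorts. A minimal NumPy-only sketch of that idea for one-dimensional blocks might look like the following. It is not dask's code: the names `select_k`, `finish_k`, and `tree_topk` are hypothetical, and the grouping loop is only an assumed stand-in for what `split_every` controls inside `reduction`.

```python
import numpy as np


def select_k(block, k):
    """Keep the k largest (k > 0) or -k smallest (k < 0) values, unsorted."""
    n = abs(k)
    if n >= block.size:
        return block  # nothing to discard yet
    if k > 0:
        # np.partition places the n largest values in the last n slots
        return np.partition(block, -n)[-n:]
    # for negative k, the n smallest values end up in the first n slots
    return np.partition(block, n - 1)[:n]


def finish_k(block, k):
    """Aggregate step: select once more, then sort in the documented order."""
    selected = np.sort(select_k(block, k))
    return selected[::-1] if k > 0 else selected


def tree_topk(blocks, k, split_every=2):
    """Reduce 1-D blocks, merging at most split_every partial results at a time."""
    if split_every < 2:
        raise ValueError("split_every must be >= 2")
    partials = [select_k(b, k) for b in blocks]  # chunk step
    while len(partials) > split_every:  # combine steps
        partials = [
            select_k(np.concatenate(partials[i:i + split_every]), k)
            for i in range(0, len(partials), split_every)
        ]
    return finish_k(np.concatenate(partials), k)  # aggregate step


blocks = [np.array([5, 1]), np.array([3, 6]), np.array([4, 2]), np.array([9, 0])]
print(tree_topk(blocks, 2))   # [9 6]
print(tree_topk(blocks, -2))  # [0 1]
```

Because each combine step keeps at most `abs(k)` values per group, the largest array the sketch ever concatenates holds about `split_every * abs(k)` elements. This is the effect the docstring attributes to `split_every` when `k` approaches the chunk size.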

Callers 1

topk · Method · 0.90

Calls 2

validate_axis · Function · 0.90

reduction · Function · 0.90

Tested by

no test coverage detected
