hub / github.com/dask/dask / topk

Function topk

dask/array/reductions.py:1347–1403

Extract the k largest elements from a on the given axis and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead and return them sorted from smallest to largest. This performs best when ``k`` is much smaller than the chunk size. All results are returned in a single chunk along the given axis.
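The sign convention for k can be illustrated without Dask. The sketch below is a plain-Python stand-in for the one-dimensional case only, not the library's implementation; `topk_reference` is a hypothetical name introduced here for illustration.

```python
import heapq


def topk_reference(values, k):
    """Hypothetical 1-D reference for the sign convention of k."""
    if k >= 0:
        # k largest values, sorted from largest to smallest
        return heapq.nlargest(k, values)
    # -k smallest values, sorted from smallest to largest
    return heapq.nsmallest(-k, values)


print(topk_reference([5, 1, 3, 6], 2))   # [6, 5]
print(topk_reference([5, 1, 3, 6], -2))  # [1, 3]
```

The two results match the values shown in the function's own docstring example for the same input.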

(a, k, axis=-1, split_every=None)

Source from the content-addressed store, hash-verified

def topk(a, k, axis=-1, split_every=None):
    """Extract the k largest elements from a on the given axis,
    and return them sorted from largest to smallest.
    If k is negative, extract the -k smallest elements instead,
    and return them sorted from smallest to largest.

    This performs best when ``k`` is much smaller than the chunk size. All
    results will be returned in a single chunk along the given axis.

    Parameters
    ----------
    x: Array
        Data being sorted
    k: int
    axis: int, optional
    split_every: int >=2, optional
        See :func:`reduce`. This parameter becomes very important when k is
        on the same order of magnitude of the chunk size or more, as it
        prevents getting the whole or a significant portion of the input array
        in memory all at once, with a negative impact on network transfer
        too when running on distributed.

    Returns
    -------
    Selection of x with size abs(k) along the given axis.

    Examples
    --------
    >>> import dask.array as da
    >>> x = np.array([5, 1, 3, 6])
    >>> d = da.from_array(x, chunks=2)
    >>> d.topk(2).compute()
    array([6, 5])
    >>> d.topk(-2).compute()
    array([1, 3])
    """
    axis = validate_axis(axis, a.ndim)

    # chunk and combine steps of the reduction, which recursively invoke
    # np.partition to pick the top/bottom k elements from the previous step.
    # The selection is not sorted internally.
    chunk_combine = partial(chunk.topk, k=k)
    # aggregate step of the reduction. Internally invokes the chunk/combine
    # function, then sorts the results internally.
    aggregate = partial(chunk.topk_aggregate, k=k)

    return reduction(
        a,
        chunk=chunk_combine,
        combine=chunk_combine,
        aggregate=aggregate,
        axis=axis,
        keepdims=True,
        dtype=a.dtype,
        split_every=split_every,
        output_size=abs(k),
    )
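The comments in the source describe a three-step reduction: an unsorted np.partition selection per chunk, the same selection when partial results are combined, and a final select-then-sort aggregate. A minimal NumPy-only sketch of that idea follows, restricted to 1-D data and nonzero k. The helpers `select_k`, `finalize`, and `topk_tree` are hypothetical names introduced here; they are not Dask's `chunk.topk`, `chunk.topk_aggregate`, or `reduction`, and the way `split_every` groups partial results is a simplifying assumption.

```python
import numpy as np


def select_k(block, k):
    """Hypothetical chunk/combine step: keep abs(k) values, unsorted."""
    if abs(k) >= block.size:
        return block
    if k > 0:
        # after partitioning, the last k positions hold the k largest values
        return np.partition(block, -k)[-k:]
    # after partitioning, the first -k positions hold the -k smallest values
    return np.partition(block, -k - 1)[:-k]


def finalize(block, k):
    """Hypothetical aggregate step: select once more, then sort."""
    picked = np.sort(select_k(block, k))
    # descending for the largest values, ascending for the smallest
    return picked[::-1] if k > 0 else picked


def topk_tree(chunks, k, split_every=2):
    """Reduce 1-D chunks, merging at most split_every partial results at a time."""
    level = [select_k(c, k) for c in chunks]
    while len(level) > split_every:
        level = [
            select_k(np.concatenate(level[i:i + split_every]), k)
            for i in range(0, len(level), split_every)
        ]
    return finalize(np.concatenate(level), k)


x = np.array([5, 1, 3, 6, 9, 2, 8, 4])
chunks = np.split(x, 4)  # four chunks of two elements each
print(topk_tree(chunks, 3))   # [9 8 6]
print(topk_tree(chunks, -3))  # [1 2 3]
```

In this sketch each intermediate array holds at most split_every * abs(k) values, which is the reason the docstring stresses split_every when k approaches the chunk size: a small fan-in keeps any single step from concatenating a large share of the input.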

Callers 1

topk · Method · 0.90

Calls 2

validate_axis · Function · 0.90
reduction · Function · 0.90

Tested by

no test coverage detected
