hub / github.com/dask/dask / take

Function take

dask/array/slicing.py:582–648

Index array with an iterable of index. Handles a single index by a single list. Mimics ``np.take``.

(outname, inname, chunks, index, axis=0)
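For orientation, here is a minimal sketch of the plain `np.take` behaviour that the summary says this function mimics. It uses only NumPy and does not call the dask function; the array `source` and its contents are illustrative assumptions, chosen so that the index list matches the one in the docstring example.

```python
import numpy as np

# Illustrative data (an assumption): 80 elements, matching the total
# length of the (20, 20, 20, 20) chunks in the docstring example.
source = np.arange(80) * 10

# np.take picks elements along an axis in the order given by the index
# list, so an unsorted index produces an unsorted result.
picked = np.take(source, [5, 1, 47, 3], axis=0)

print(picked.tolist())
print(picked.shape)
```

The result has four elements along the indexed axis, which agrees with the `((4,),)` chunks shown in the docstring for the same index list.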

Source

def take(outname, inname, chunks, index, axis=0):
    """Index array with an iterable of index

    Handles a single index by a single list

    Mimics ``np.take``

    >>> from pprint import pprint
    >>> chunks, dsk = take('y', 'x', [(20, 20, 20, 20)], [5, 1, 47, 3], axis=0)
    >>> chunks
    ((4,),)

    When list is sorted we still try to preserve properly sized chunks.

    >>> chunks, dsk = take('y', 'x', [(20, 20, 20, 20)], [1, 3, 5, 47], axis=0)
    >>> chunks
    ((4,),)

    """

    if not np.isnan(chunks[axis]).any():
        from dask.array._shuffle import _shuffle
        from dask.array.utils import asarray_safe

        # verify if this is a full arange (the equivalent of `slice(None)`)
        full_length = sum(chunks[axis])
        if len(index) == full_length and index[0] == 0 and np.all(np.diff(index) == 1):
            # TODO: This should be a real no-op, but the call stack is
            # too deep to do this efficiently for now
            chunk_tuples = product(*(range(len(c)) for i, c in enumerate(chunks)))
            graph = {
                (outname,) + c: Alias((outname,) + c, (inname,) + c)
                for c in chunk_tuples
            }
            return tuple(chunks), graph

        average_chunk_size = int(full_length / len(chunks[axis]))

        indexer = []
        index = asarray_safe(index, like=index)
        for i in range(0, len(index), average_chunk_size):
            indexer.append(index[i : i + average_chunk_size].tolist())

        token = (
            outname.split("-")[-1]
            if "-" in outname
            else tokenize(outname, chunks, index, axis)
        )
        chunks, graph = _shuffle(chunks, indexer, axis, inname, outname, token)
        return chunks, graph
    elif len(chunks[axis]) == 1:
        slices = [slice(None)] * len(chunks)
        slices[axis] = list(index)
        slices = tuple(slices)
        chunk_tuples = product(*(range(len(c)) for i, c in enumerate(chunks)))
        dsk = {
            (outname,)
            + ct: Task((outname,) + ct, getitem, TaskRef((inname,) + ct), slices)
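The two preparatory steps in the first branch of the listing (the full-range check and the split of the index into pieces of the average chunk size) can be sketched in plain Python. This is a sketch under assumptions, not the library's code: the helper names `is_full_arange` and `partition_index` are hypothetical, lists stand in for NumPy arrays, and the later `_shuffle` step is not reproduced.

```python
def is_full_arange(index, axis_chunks):
    """Return True when index is exactly 0, 1, ..., n-1 for the whole axis.

    Mirrors the condition in the listing using plain Python instead of
    np.diff / np.all. The `full_length > 0` guard is an addition of this
    sketch so that an empty index does not raise on `index[0]`.
    """
    full_length = sum(axis_chunks)
    return (
        full_length > 0
        and len(index) == full_length
        and index[0] == 0
        and all(b - a == 1 for a, b in zip(index, index[1:]))
    )


def partition_index(index, axis_chunks):
    """Split index into consecutive pieces of the average chunk size."""
    full_length = sum(axis_chunks)
    # Same arithmetic as the listing: total axis length over chunk count.
    average_chunk_size = int(full_length / len(axis_chunks))
    return [
        list(index[i : i + average_chunk_size])
        for i in range(0, len(index), average_chunk_size)
    ]


axis_chunks = (20, 20, 20, 20)

# A short index fits inside one piece of size 20.
print(partition_index([5, 1, 47, 3], axis_chunks))

# A longer index is cut into pieces of at most 20 entries each.
print([len(piece) for piece in partition_index(list(range(50)), axis_chunks)])

# Only an index that covers the whole axis in order counts as a full range.
print(is_full_arange(list(range(80)), axis_chunks))
print(is_full_arange([1, 3, 5, 47], axis_chunks))
```

In the listing, a full-range index short-circuits to a graph of `Alias` entries, while any other index is partitioned this way and the resulting pieces are passed to `_shuffle` as `indexer`.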

Callers 4

slice_wrap_lists · Function · 0.70
bag_range · Function · 0.50
safe_take · Function · 0.50
create_merge_tree · Function · 0.50

Calls 11

Alias · Class · 0.90
asarray_safe · Function · 0.90
_shuffle · Function · 0.90
Task · Class · 0.90
TaskRef · Class · 0.90
diff · Method · 0.80
split · Method · 0.80
sum · Function · 0.70
tokenize · Function · 0.50
any · Method · 0.45
all · Method · 0.45

Tested by

no test coverage detected
