Function _toposort

dask/core.py:362–452 · view source on GitHub ↗

(dsk, keys=None, returncycle=False, dependencies=None)

Source from the content-addressed store, hash-verified

360
361
362	def _toposort(dsk, keys=None, returncycle=False, dependencies=None):
363
364	# Stack-based depth-first search traversal. This is based on Tarjan's
365	# method for topological sorting (see wikipedia for pseudocode)
366	if keys is None:
367	keys = dsk
368	elif not isinstance(keys, list):
369	keys = [keys]
370	if not returncycle:
371	ordered = []
372
373	# Nodes whose descendents have been completely explored.
374	# These nodes are guaranteed to not be part of a cycle.
375	completed = set()
376
377	# All nodes that have been visited in the current traversal. Because
378	# we are doing depth-first search, going "deeper" should never result
379	# in visiting a node that has already been seen. The `seen` and
380	# `completed` sets are mutually exclusive; it is okay to visit a node
381	# that has already been added to `completed`.
382	seen = set()
383
384	if dependencies is None:
385
386	dependencies = DependenciesMapping(dsk)
387
388	for key in keys:
389	if key in completed:
390	continue
391	nodes = [key]
392	while nodes:
393	# Keep current node on the stack until all descendants are visited
394	cur = nodes[-1]
395	if cur in completed:
396	# Already fully traversed descendants of cur
397	nodes.pop()
398	continue
399	seen.add(cur)
400
401	# Add direct descendants of cur to nodes stack
402	next_nodes = []
403	for nxt in dependencies[cur]:
404	if nxt not in completed:
405	if nxt in seen:
406	# Cycle detected!
407	# Let's report only the nodes that directly participate in the cycle.
408	# We use `priorities` below to greedily construct a short cycle.
409	# Shorter cycles may exist.
410	priorities = {}
411	prev = nodes[-1]
412	# Give priority to nodes that were seen earlier.
413	while nodes[-1] != nxt:
414	priorities[nodes.pop()] = -len(priorities)
415	priorities[nxt] = -len(priorities)
416	# We're going to get the cycle by walking backwards along dependents,
417	# so calculate dependents only for the nodes in play.
418	inplay = set(priorities)
419	dependents = reverse_dict(

Callers 2

toposortFunction · 0.85

getcycleFunction · 0.85

Calls 8

DependenciesMappingClass · 0.90

setClass · 0.85

reverse_dictFunction · 0.85

minFunction · 0.85

popMethod · 0.80

removeMethod · 0.80

addMethod · 0.45

joinMethod · 0.45

Tested by

no test coverage detected

Used in the wild real call sites across dependent graphs

searching dependent graphs…