Function fuse

dask/optimization.py:458–888

Fuse tasks that form reductions; more advanced than ``fuse_linear``. This trades parallelism opportunities for faster scheduling by making tasks less granular. It can replace ``fuse_linear`` in optimization passes. This optimization applies to all reductions, that is, tasks that have at most one dependent.
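To make the notion of a reduction concrete, the following is a minimal sketch, not dask's implementation, of how one might count dependents in a plain dict graph and pick out the tasks with at most one dependent, as the docstring describes. The toy graph, the `find_dependencies` and `count_dependents` helpers, and the rule that any string matching a graph key is a reference are all illustrative assumptions.

```python
from collections import Counter


def inc(x):
    return x + 1


def add(x, y):
    return x + y


# Toy graph in the dict-of-tuples style: (callable, *args), where a string
# argument that matches a key is treated as a reference to that task.
dsk = {
    "a": 1,
    "b": (inc, "a"),
    "c": (inc, "a"),
    "d": (add, "b", "c"),
}


def find_dependencies(dsk, task):
    """Return a list (with repeats) of graph keys referenced by ``task``."""
    if isinstance(task, tuple):
        deps = []
        for arg in task[1:]:  # skip the callable in position 0
            deps.extend(find_dependencies(dsk, arg))
        return deps
    if isinstance(task, str) and task in dsk:
        return [task]
    return []


def count_dependents(dependencies):
    """Count how many distinct tasks depend on each key."""
    counts = Counter({key: 0 for key in dependencies})
    for deps in dependencies.values():
        for dep in set(deps):
            counts[dep] += 1
    return counts


dependencies = {key: find_dependencies(dsk, task) for key, task in dsk.items()}
dependents = count_dependents(dependencies)

# Tasks with at most one dependent are the candidates described above.
single_output = sorted(key for key, n in dependents.items() if n <= 1)
print(single_output)
```

Here `a` feeds both `b` and `c`, so it has two dependents and falls outside the "at most one dependent" rule, while `b`, `c`, and `d` qualify.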

(
    dsk,
    keys=None,
    dependencies=None,
    ave_width=_default,
    max_width=_default,
    max_height=_default,
    max_depth_new_edges=_default,
    rename_keys=_default,
)
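As an illustration of what fusing means for a graph, here is a toy sketch that assumes a dict graph of `(callable, *args)` tuples. The `evaluate` helper and the hand-written `fused` graph are hypothetical. They only mimic the effect the docstring describes (fewer, coarser tasks with the same result) and do not call `fuse` itself.

```python
def inc(x):
    return x + 1


def add(x, y):
    return x + y


def evaluate(dsk, task):
    """Recursively evaluate a task against the graph (no caching, toy only)."""
    if isinstance(task, tuple) and task and callable(task[0]):
        func, *args = task
        return func(*(evaluate(dsk, arg) for arg in args))
    if isinstance(task, str) and task in dsk:
        return evaluate(dsk, dsk[task])
    return task  # a literal value


# Original graph: four separate tasks that feed a single output, "d".
dsk = {
    "a": 1,
    "b": (inc, "a"),
    "c": (inc, "b"),
    "d": (add, "c", 10),
}

# Hand-fused equivalent: the chain a -> b -> c -> d inlined into one task.
fused = {"d": (add, (inc, (inc, 1)), 10)}

before = evaluate(dsk, "d")
after = evaluate(fused, "d")
print(before, after, len(dsk), len(fused))
```

The fused graph has one task instead of four, so a scheduler has less to track, but `b` and `c` can no longer be scheduled as separate units. That is the trade between parallelism and granularity that the summary mentions.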

Source from the content-addressed store, hash-verified

def fuse(
    dsk,
    keys=None,
    dependencies=None,
    ave_width=_default,
    max_width=_default,
    max_height=_default,
    max_depth_new_edges=_default,
    rename_keys=_default,
):
    """Fuse tasks that form reductions; more advanced than ``fuse_linear``

    This trades parallelism opportunities for faster scheduling by making tasks
    less granular. It can replace ``fuse_linear`` in optimization passes.

    This optimization applies to all reductions--tasks that have at most one
    dependent--so it may be viewed as fusing "multiple input, single output"
    groups of tasks into a single task. There are many parameters to fine
    tune the behavior, which are described below. ``ave_width`` is the
    natural parameter with which to compare parallelism to granularity, so
    it should always be specified. Reasonable values for other parameters
    will be determined using ``ave_width`` if necessary.

    Parameters
    ----------
    dsk: dict
        dask graph
    keys: list or set, optional
        Keys that must remain in the returned dask graph
    dependencies: dict, optional
        {key: [list-of-keys]}. Must be a list to provide count of each key
        This optional input often comes from ``cull``
    ave_width: float (default 1)
        Upper limit for ``width = num_nodes / height``, a good measure of
        parallelizability.
        dask.config key: ``optimization.fuse.ave-width``
    max_width: int (default infinite)
        Don't fuse if total width is greater than this.
        dask.config key: ``optimization.fuse.max-width``
    max_height: int or None (default None)
        Don't fuse more than this many levels. Set to None to dynamically
        adjust to ``1.5 + ave_width * log(ave_width + 1)``.
        dask.config key: ``optimization.fuse.max-height``
    max_depth_new_edges: int or None (default None)
        Don't fuse if new dependencies are added after this many levels.
        Set to None to dynamically adjust to ave_width * 1.5.
        dask.config key: ``optimization.fuse.max-depth-new-edges``
    rename_keys: bool or func, optional (default True)
        Whether to rename the fused keys with ``default_fused_keys_renamer``
        or not. Renaming fused keys can keep the graph more understandable
        and comprehensive, but it comes at the cost of additional processing.
        If False, then the top-most key will be used. For advanced usage, a
        function to create the new name is also accepted.
        dask.config key: ``optimization.fuse.rename-keys``

    Returns
    -------
    dsk
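The dynamic defaults quoted in the parameter descriptions can be worked through numerically. The sketch below simply evaluates the two formulas from the docstring and the `width = num_nodes / height` measure. The helper names are hypothetical, the natural logarithm is an assumption (the docstring only says `log`), and any rounding or clamping that `fuse` may apply internally is not modeled.

```python
import math


def default_max_height(ave_width):
    """Docstring formula: 1.5 + ave_width * log(ave_width + 1)."""
    # Assumes natural log; the docstring does not name the base.
    return 1.5 + ave_width * math.log(ave_width + 1)


def default_max_depth_new_edges(ave_width):
    """Docstring formula: ave_width * 1.5."""
    return ave_width * 1.5


def width(num_nodes, height):
    """Docstring measure of parallelizability: num_nodes / height."""
    return num_nodes / height


# Show how both dynamic defaults grow with ave_width.
for ave_width in (1, 2, 4):
    print(
        ave_width,
        round(default_max_height(ave_width), 3),
        default_max_depth_new_edges(ave_width),
    )

# A reduction with 7 nodes arranged in 3 levels (4 -> 2 -> 1).
print(round(width(7, 3), 3))
```

Under these assumptions, raising `ave_width` loosens both derived limits at once, which is consistent with the docstring's advice to treat `ave_width` as the main tuning parameter.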

Callers 15

fuse_roots · Function · 0.90
get · Function · 0.90
fuse2 · Function · 0.90
test_fuse · Function · 0.90
test_fuse_keys · Function · 0.90
test_donot_substitute_same_key_multiple_times · Function · 0.90
test_fuse_reductions_single_input · Function · 0.90
test_fuse_stressed · Function · 0.90
test_fuse_reductions_multiple_input · Function · 0.90
test_dont_fuse_numpy_arrays · Function · 0.90
test_fuse_config · Function · 0.90
test_fused_keys_max_length · Function · 0.90

Calls 14

flatten · Function · 0.90
get_dependencies · Function · 0.90
subs · Function · 0.90
set · Class · 0.85
any · Function · 0.85
all · Function · 0.85
min · Function · 0.85
max · Function · 0.85
remove · Method · 0.80
get · Method · 0.45
items · Method · 0.45
add · Method · 0.45

Tested by 13

fuse2 · Function · 0.72
test_fuse · Function · 0.72
test_fuse_keys · Function · 0.72
test_donot_substitute_same_key_multiple_times · Function · 0.72
test_fuse_reductions_single_input · Function · 0.72
test_fuse_stressed · Function · 0.72
test_fuse_reductions_multiple_input · Function · 0.72
test_dont_fuse_numpy_arrays · Function · 0.72
test_fuse_config · Function · 0.72
test_fused_keys_max_length · Function · 0.72
test_fusion_legacy_hybrid · Function · 0.72
test_fusion_wide_legacy_hybrid · Function · 0.72
