hub / github.com/dask/dask / test_from_delayed

Function test_from_delayed

dask/dataframe/io/tests/test_io.py:647–675
()

Source from the content-addressed store, hash-verified

def test_from_delayed():
    df = pd.DataFrame(data=np.random.normal(size=(10, 4)), columns=list("abcd"))
    parts = [df.iloc[:1], df.iloc[1:3], df.iloc[3:6], df.iloc[6:10]]
    dfs = [delayed(parts.__getitem__)(i) for i in range(4)]
    meta = dfs[0].compute()

    my_len = lambda x: pd.Series([len(x)])

    for divisions in [None, [0, 1, 3, 6, 10]]:
        ddf = dd.from_delayed(dfs, meta=meta, divisions=divisions)
        assert_eq(ddf, df)
        assert list(ddf.map_partitions(my_len).compute()) == [1, 2, 3, 4]
        assert ddf.known_divisions == (divisions is not None)

        s = dd.from_delayed([d.a for d in dfs], meta=meta.a, divisions=divisions)
        assert_eq(s, df.a)
        assert list(s.map_partitions(my_len).compute()) == [1, 2, 3, 4]
        assert ddf.known_divisions == (divisions is not None)

    meta2 = [(c, "f8") for c in df.columns]
    assert_eq(dd.from_delayed(dfs, meta=meta2), df)
    assert_eq(dd.from_delayed([d.a for d in dfs], meta=("a", "f8")), df.a)

    with pytest.raises(ValueError):
        dd.from_delayed(dfs, meta=meta, divisions=[0, 1, 3, 6])

    with pytest.raises(ValueError) as e:
        dd.from_delayed(dfs, meta=meta.a).compute()
    assert str(e.value).startswith("Metadata mismatch found in `from_delayed`")
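
Illustration (not part of the source)

The test accepts `divisions=[0, 1, 3, 6, 10]` for four partitions and expects a `ValueError` for `divisions=[0, 1, 3, 6]`. The sketch below shows one way to read that, using only the standard library: four partitions need five boundaries, and the boundaries used in the test happen to coincide with the cumulative row offsets of the four slices. The helpers `row_offsets` and `check_division_count` are hypothetical names introduced here for illustration; they are not dask functions and do not reproduce dask's actual validation logic or error message.

```python
from itertools import accumulate


def row_offsets(lengths, start=0):
    """Cumulative row offsets for consecutive partitions (hypothetical helper)."""
    # For lengths [1, 2, 3, 4] this yields [0, 1, 3, 6, 10].
    return list(accumulate(lengths, initial=start))


def check_division_count(n_partitions, divisions):
    """Assumed rule: n partitions need n + 1 boundaries, or None if unknown."""
    if divisions is None:
        return False  # divisions unknown
    if len(divisions) != n_partitions + 1:
        raise ValueError(
            f"expected {n_partitions + 1} boundaries, got {len(divisions)}"
        )
    return True  # divisions known


# Partition lengths of the slices df.iloc[:1], [1:3], [3:6], [6:10].
lengths = [1, 2, 3, 4]
offsets = row_offsets(lengths)

known = check_division_count(len(lengths), offsets)   # five boundaries: accepted
unknown = check_division_count(len(lengths), None)    # no divisions supplied

try:
    check_division_count(len(lengths), [0, 1, 3, 6])  # only four boundaries
    rejected = False
except ValueError:
    rejected = True
```

The boolean returned by `check_division_count` mirrors the test's `known_divisions == (divisions is not None)` assertion, under the assumption stated above.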

Callers

nothing calls this directly

Calls 6

delayed · Function · 0.90
assert_eq · Function · 0.90
from_delayed · Method · 0.80
normal · Method · 0.45
compute · Method · 0.45
map_partitions · Method · 0.45

Tested by

no test coverage detected
