hub / github.com/dask/dask / test_split_every

Function test_split_every

dask/dataframe/tests/test_hyperloglog.py:78–84
(split_every, npartitions)

Source from the content-addressed store, hash-verified

@pytest.mark.parametrize("split_every", [None, 2, 10])
@pytest.mark.parametrize("npartitions", [2, 20])
def test_split_every(split_every, npartitions):
    df = pd.Series([1, 2, 3] * 1000)
    ddf = dd.from_pandas(df, npartitions=npartitions)

    approx = ddf.nunique_approx(split_every=split_every).compute(scheduler="sync")
    exact = len(df.drop_duplicates())
    assert abs(approx - exact) <= 2 or abs(approx - exact) / exact < 0.05
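
The final assertion in test_split_every accepts the estimate when it is off by at most 2 in absolute terms, or by less than 5% relative to the exact count. A minimal sketch of that acceptance rule as a standalone predicate follows. The name `within_tolerance` and its `abs_tol` and `rel_tol` parameters are hypothetical, introduced here only for illustration; they are not part of dask or of the test file. The guard against a zero exact count is also an addition that the original assertion does not have.

```python
def within_tolerance(approx, exact, abs_tol=2, rel_tol=0.05):
    """Return True if approx is acceptably close to exact.

    Mirrors the shape of the test's assertion: pass on a small
    absolute error OR on a small relative error. The function name
    and the default tolerances are illustrative assumptions.
    """
    error = abs(approx - exact)
    if error <= abs_tol:
        return True
    # Assumption: treat exact == 0 as a failure here rather than
    # dividing by zero; the original test never hits this case.
    return exact != 0 and error / exact < rel_tol


print(within_tolerance(3.2, 3))      # True: absolute error is 0.2
print(within_tolerance(1040, 1000))  # True: relative error is 4%
print(within_tolerance(1100, 1000))  # False: relative error is 10%
```

With the data used in the test there are only 3 distinct values, so 5% of the exact count is 0.15 and the absolute bound of 2 is the branch that effectively decides the outcome.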

Callers

nothing calls this directly

Calls 3

drop_duplicates · Method · 0.95
compute · Method · 0.45
nunique_approx · Method · 0.45

Tested by

no test coverage detected
