hub / github.com/dask/dask / test_read_text

Function test_read_text

dask/bag/tests/test_text.py:47–86
(fmt, bs, encoding, include_path)

Source from the content-addressed store, hash-verified

@pytest.mark.parametrize("fmt,bs,encoding,include_path", fmt_bs_enc_path)
def test_read_text(fmt, bs, encoding, include_path):
    if fmt not in utils.compress:
        pytest.skip(f"compress function not provided for {fmt}")
    compress = utils.compress[fmt]
    files2 = {k: compress(v.encode(encoding)) for k, v in files.items()}
    with filetexts(files2, mode="b"):
        b = read_text(
            ".test.accounts.*.json", compression=fmt, blocksize=bs, encoding=encoding
        )
        (L,) = compute(b)
        assert "".join(L) == expected

        o = read_text(
            sorted(files),
            compression=fmt,
            blocksize=bs,
            encoding=encoding,
            include_path=include_path,
        )
        b = o.pluck(0) if include_path else o
        (L,) = compute(b)
        assert "".join(L) == expected
        if include_path:
            (paths,) = compute(o.pluck(1))
            expected_paths = list(
                concat([[k] * v.count("\n") for k, v in files.items()])
            )
            assert len(paths) == len(expected_paths)
            for path, expected_path in zip(paths, expected_paths):
                assert path.endswith(expected_path)

        blocks = read_text(
            ".test.accounts.*.json",
            compression=fmt,
            blocksize=bs,
            encoding=encoding,
            collection=False,
        )
        L = compute(*blocks)
        assert "".join(line for block in L for line in block) == expected
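
The invariant this test asserts three times is that text which has been encoded, compressed, and read back line by line joins up to the original string. A minimal standard-library sketch of that round trip follows. It uses `gzip` as a stand-in for `utils.compress[fmt]`, and both the `sample_files` data and the `read_lines` helper are hypothetical names introduced for illustration; none of this is dask's own implementation.

```python
import gzip
import io

# Hypothetical sample data standing in for the test module's `files` fixture.
sample_files = {
    ".test.accounts.1.json": '{"id": 1, "name": "Alice"}\n{"id": 2, "name": "Bob"}\n',
    ".test.accounts.2.json": '{"id": 3, "name": "Chloé"}\n',
}
encoding = "utf-8"

# Mirror `files2` in the test: encode each text, then compress the bytes.
compressed = {k: gzip.compress(v.encode(encoding)) for k, v in sample_files.items()}


def read_lines(payload, encoding):
    """Decompress one payload and return its lines with newlines preserved."""
    buffer = io.BytesIO(payload)
    with gzip.open(buffer, mode="rt", encoding=encoding, newline="") as fh:
        return list(fh)


# Read files in sorted order, as the test does with `sorted(files)`.
lines = [
    line
    for name in sorted(compressed)
    for line in read_lines(compressed[name], encoding)
]
expected = "".join(sample_files[name] for name in sorted(sample_files))

# Joining the lines must reproduce the original text exactly.
assert "".join(lines) == expected
```

Keeping the trailing newline on every line is what makes a plain `"".join(...)` sufficient for the comparison.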

Callers

nothing calls this directly

Calls 10

filetexts · Function · 0.90
read_text · Function · 0.90
compute · Function · 0.90
compress · Function · 0.85
skip · Method · 0.80
pluck · Method · 0.80
concat · Function · 0.50
items · Method · 0.45
join · Method · 0.45
count · Method · 0.45
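
Of the calls listed above, `concat`, `items`, and `count` combine to build the expected path list in the `include_path` branch: each file name is repeated once per newline in its text, then the per-file lists are flattened. The sketch below reproduces that construction, assuming `concat` flattens one level like `itertools.chain.from_iterable` (the excerpt does not show where `concat` is imported from). The sample data and the `/tmp/work/` prefix are hypothetical.

```python
from itertools import chain

# Hypothetical sample data: two lines in the first file, one in the second.
sample_files = {
    ".test.accounts.1.json": "line a\nline b\n",
    ".test.accounts.2.json": "line c\n",
}

# One path entry per line of text, flattened across files.
expected_paths = list(
    chain.from_iterable([k] * v.count("\n") for k, v in sample_files.items())
)
print(expected_paths)
# ['.test.accounts.1.json', '.test.accounts.1.json', '.test.accounts.2.json']

# The test compares with `endswith`, which tolerates a directory prefix on
# the returned paths. The prefix here is made up for illustration.
paths = ["/tmp/work/" + p for p in expected_paths]
assert len(paths) == len(expected_paths)
assert all(p.endswith(e) for p, e in zip(paths, expected_paths))
```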

Tested by

no test coverage detected
