Function test_lance_read_basic

python/ray/data/tests/datasource/test_lance.py:47–97

(fs, data_path, batch_size)

Source from the content-addressed store, hash-verified

    [None, 100],
)
def test_lance_read_basic(fs, data_path, batch_size):
    df1 = pa.table({"one": [2, 1, 3, 4, 6, 5], "two": ["b", "a", "c", "e", "g", "f"]})
    setup_data_path = _unwrap_protocol(data_path)
    path = os.path.join(setup_data_path, "test.lance")
    lance.write_dataset(df1, path)

    ds_lance = lance.dataset(path)
    assert ds_lance is not None
    df2 = pa.table(
        {
            "one": [1, 2, 3, 4, 5, 6],
            "three": [4, 5, 8, 9, 12, 13],
            "four": ["u", "v", "w", "x", "y", "z"],
        }
    )
    ds_lance.merge(df2, "one")

    if batch_size is None:
        ds = ray.data.read_lance(path)
    else:
        ds = ray.data.read_lance(path, scanner_options={"batch_size": batch_size})

    # Test metadata-only ops.
    assert ds.count() == 6
    assert ds.schema() == Schema(
        pa.schema(
            {
                "one": pa.int64(),
                "two": pa.string(),
                "three": pa.int64(),
                "four": pa.string(),
            }
        )
    )

    # Test read.
    values = [[s["one"], s["two"]] for s in ds.take_all()]
    assert sorted(values) == [
        [1, "a"],
        [2, "b"],
        [3, "c"],
        [4, "e"],
        [5, "f"],
        [6, "g"],
    ]

    # Test column projection.
    ds = ray.data.read_lance(path, columns=["one"])
    values = [s["one"] for s in ds.take_all()]
    assert sorted(values) == [1, 2, 3, 4, 5, 6]
    assert ds.schema().names == ["one"]


@pytest.mark.parametrize("data_path", [lazy_fixture("local_path")])
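The four-column schema asserted in the test depends on `ds_lance.merge(df2, "one")` attaching `three` and `four` to the stored table, matched on the `one` column. The following is a minimal pure-Python sketch of that step, assuming left-join semantics on the key. `left_join_on` is a hypothetical helper written for illustration only; it is not part of Lance or Ray and does not reflect how either library implements the merge.

```python
# Illustration only: a left join on a key column over column-oriented tables.
# `left_join_on` is a hypothetical helper, not a Lance or Ray API.

def left_join_on(left, right, key):
    """Join two tables (dict of column name -> list of values) on `key`.

    Every row of `left` is kept. Columns from `right` other than `key` are
    appended, with None where `right` has no matching key value.
    """
    # Map each key value on the right to its row position.
    right_index = {k: i for i, k in enumerate(right[key])}
    extra_cols = [name for name in right if name != key]

    # Copy the left columns so the inputs are not mutated.
    merged = {name: list(values) for name, values in left.items()}
    for name in extra_cols:
        merged[name] = [
            right[name][right_index[k]] if k in right_index else None
            for k in left[key]
        ]
    return merged


# Same values as the two tables built in the test above.
df1 = {"one": [2, 1, 3, 4, 6, 5], "two": ["b", "a", "c", "e", "g", "f"]}
df2 = {
    "one": [1, 2, 3, 4, 5, 6],
    "three": [4, 5, 8, 9, 12, 13],
    "four": ["u", "v", "w", "x", "y", "z"],
}

merged = left_join_on(df1, df2, "one")

# Row-wise view, sorted by key, mirroring the sorted() comparison in the test.
rows = sorted(zip(merged["one"], merged["two"], merged["three"], merged["four"]))
```

In this sketch the merged columns come out in the order `one`, `two`, `three`, `four`, which is the column order the test's schema assertion lists.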

nothing calls this directly

_unwrap_protocol · Function · 0.90

Schema · Class · 0.90

table · Method · 0.80

take_all · Method · 0.80

join · Method · 0.45

merge · Method · 0.45

count · Method · 0.45

schema · Method · 0.45

no test coverage detected
