hub / github.com/huggingface/datasets / test_write_batch

Function test_write_batch

tests/test_arrow_writer.py:154–166
(fields, writer_batch_size)

Source from the content-addressed store, hash-verified

152      "fields", [None, {"col_1": pa.string(), "col_2": pa.int64()}, {"col_1": pa.string(), "col_2": pa.int32()}]
153  )
154  def test_write_batch(fields, writer_batch_size):
155      output = pa.BufferOutputStream()
156      schema = pa.schema(fields) if fields else None
157      with ArrowWriter(stream=output, schema=schema, writer_batch_size=writer_batch_size) as writer:
158          writer.write_batch({"col_1": ["foo", "bar"], "col_2": [1, 2]})
159          writer.write_batch({"col_1": [], "col_2": []})
160          num_examples, num_bytes = writer.finalize()
161      assert num_examples == 2
162      assert num_bytes > 0
163      if not fields:
164          fields = {"col_1": pa.string(), "col_2": pa.int64()}
165      assert writer._schema == pa.schema(fields, metadata=writer._schema.metadata)
166      _check_output(output.getvalue(), expected_num_chunks=num_examples if writer_batch_size == 1 else 1)
167
168
169  @pytest.mark.parametrize("writer_batch_size", [None, 1, 10])
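
The final assertion on line 166 passes `expected_num_chunks=num_examples if writer_batch_size == 1 else 1` to `_check_output`. As a rough illustration of what that expression evaluates to, the sketch below wraps it in a helper and tabulates the result for a few batch sizes. The helper name `expected_num_chunks` is hypothetical (in the source it is only a keyword argument). The batch sizes `None`, `1`, and `10` come from the `parametrize` line at 169, which decorates the next test; that this test is parametrized with the same values is an assumption. The sketch does not touch `ArrowWriter` or `pyarrow`.

```python
def expected_num_chunks(num_examples, writer_batch_size):
    """Hypothetical helper mirroring the expression passed to _check_output."""
    # Only a batch size of 1 is expected to give one chunk per example;
    # for None or any other size the test expects a single chunk.
    return num_examples if writer_batch_size == 1 else 1


# The test writes two examples, ("foo", 1) and ("bar", 2), then an empty batch.
NUM_EXAMPLES = 2

# Assumed batch sizes (taken from the parametrize line visible at 169).
chunks_by_batch_size = {
    size: expected_num_chunks(NUM_EXAMPLES, size) for size in (None, 1, 10)
}
print(chunks_by_batch_size)  # {None: 1, 1: 2, 10: 1}
```

Read this way, the test expects the two written rows to land in two chunks only when `writer_batch_size` is 1, and in a single chunk otherwise.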

Callers

nothing calls this directly

Calls 5

ArrowWriter · Class · 0.90
_check_output · Function · 0.85
write_batch · Method · 0.80
finalize · Method · 0.80
schema · Method · 0.45

Tested by

no test coverage detected