hub / github.com/karpathy/nanochat / repackage_data_reference.py

File repackage_data_reference.py

dev/repackage_data_reference.py


"""
Repackage a given dataset into simple parquet shards:

- each shard is ~100MB in size (after zstd compression)
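
The excerpt above stops partway through the docstring, so the script's actual logic is not shown here. As a rough illustration of the idea it describes, the sketch below groups text documents into shards by character count and writes each shard as a zstd-compressed parquet file. The names `group_into_shards` and `write_shards`, the `text` column, the `shard_00000.parquet` file naming, and the `max_chars` default are all hypothetical and not taken from the source. The character budget is only a stand-in for the ~100MB target and would need tuning against real compressed sizes.

```python
from typing import Iterable, Iterator, List


def group_into_shards(docs: Iterable[str], max_chars: int) -> Iterator[List[str]]:
    """Yield lists of documents; a shard closes once it holds at least max_chars characters."""
    shard: List[str] = []
    shard_chars = 0
    for doc in docs:
        shard.append(doc)
        shard_chars += len(doc)
        if shard_chars >= max_chars:
            yield shard
            shard, shard_chars = [], 0
    if shard:
        # Emit the final, possibly smaller, shard.
        yield shard


def write_shards(docs: Iterable[str], out_dir: str, max_chars: int = 250_000_000) -> List[str]:
    """Write each shard to out_dir as a zstd-compressed parquet file and return the paths.

    max_chars is an assumed placeholder, not a value confirmed by the source.
    """
    # Imported lazily so the grouping logic above works without pyarrow installed.
    import os
    import pyarrow as pa
    import pyarrow.parquet as pq

    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, shard in enumerate(group_into_shards(docs, max_chars)):
        path = os.path.join(out_dir, f"shard_{index:05d}.parquet")
        table = pa.Table.from_pydict({"text": shard})  # single hypothetical "text" column
        pq.write_table(table, path, compression="zstd")
        paths.append(path)
    return paths
```

Keeping the grouping step separate from the file writing makes the shard boundaries easy to check on small inputs before running against a full dataset.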

Callers

nothing calls this directly

Calls 1

decodeMethod · 0.45

Tested by

no test coverage detected