hub / github.com/dmlc/dgl / chunk_graph

Function chunk_graph

tests/tools/pytest_utils.py:279–339

Split the graph into multiple chunks. A directory will be created at :attr:`output_path` with the metadata and chunked edge list as well as the node/edge data.

(
    g,
    name,
    ndata_paths,
    edata_paths,
    num_chunks,
    output_path,
    data_fmt="numpy",
    edges_fmt="csv",
    vector_rows=False,
    **kwargs,
)
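The docstring describes `ndata_paths` and `edata_paths` as dictionaries of paths, and the function body iterates them as nested `{type: {key: path}}` mappings. A minimal sketch of building arguments in that shape follows. The helper `nested_paths`, the `data` directory, the type names, and the file-naming scheme are all hypothetical and chosen only for illustration; the arrays would have to exist at those paths before a real call.

```python
import os


def nested_paths(root, keys_by_type):
    # Hypothetical helper: build {type: {key: path}} for node or edge data.
    # The "<type>_<key>.npy" naming is an assumption, not a DGL convention.
    out = {}
    for typ, keys in keys_by_type.items():
        # Canonical edge types are tuples; join them into one file-name stem.
        stem = "_".join(typ) if isinstance(typ, tuple) else typ
        out[typ] = {key: os.path.join(root, f"{stem}_{key}.npy") for key in keys}
    return out


ndata_paths = nested_paths("data", {"paper": ["feat", "label"], "author": ["feat"]})
edata_paths = nested_paths("data", {("author", "writes", "paper"): ["year"]})

# With a DGLGraph `g` whose node and edge types match the keys above, the call
# would look roughly like this. It is left commented out because it needs DGL
# and the test utilities on the import path.
# chunk_graph(g, "toy", ndata_paths, edata_paths, num_chunks=2,
#             output_path="chunked", data_fmt="numpy", edges_fmt="csv")
```

Relative paths such as these are acceptable as input, since the function converts every entry to an absolute path itself.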

Source from the content-addressed store, hash-verified

def chunk_graph(
    g,
    name,
    ndata_paths,
    edata_paths,
    num_chunks,
    output_path,
    data_fmt="numpy",
    edges_fmt="csv",
    vector_rows=False,
    **kwargs,
):
    """
    Split the graph into multiple chunks.

    A directory will be created at :attr:`output_path` with the metadata and
    chunked edge list as well as the node/edge data.

    Parameters
    ----------
    g : DGLGraph
        The graph.
    name : str
        The name of the graph, to be used later in DistDGL training.
    ndata_paths : dict[str, pathlike] or dict[ntype, dict[str, pathlike]]
        The dictionary of paths pointing to the corresponding numpy array file
        for each node data key.
    edata_paths : dict[etype, pathlike] or dict[etype, dict[str, pathlike]]
        The dictionary of paths pointing to the corresponding numpy array file
        for each edge data key. ``etype`` could be canonical or non-canonical.
    num_chunks : int
        The number of chunks.
    output_path : pathlike
        The output directory saving the chunked graph.
    data_fmt : str
        Format of node/edge data: 'numpy' or 'parquet'.
    edges_fmt : str
        Format of edges files: 'csv' or 'parquet'.
    vector_rows : bool
        When true, write parquet files as single-column vector row files.
    kwargs : dict
        Keyword arguments to control chunk details.
    """
    # Resolve every node/edge data path to an absolute path (in place).
    for ntype, ndata in ndata_paths.items():
        for key in ndata.keys():
            ndata[key] = os.path.abspath(ndata[key])
    for etype, edata in edata_paths.items():
        for key in edata.keys():
            edata[key] = os.path.abspath(edata[key])
    with setdir(output_path):
        _chunk_graph(
            g,
            name,
            ndata_paths,
            edata_paths,
            num_chunks,
            data_fmt,
            edges_fmt,
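The listing resolves all data paths with `os.path.abspath` before entering `setdir(output_path)`. A plausible reason, assuming `setdir` creates the directory and temporarily makes it the working directory (its source is not shown here), is that relative paths resolved afterwards would point under the output directory instead of at the original files. The sketch below illustrates that ordering effect with a hypothetical stand-in `setdir_sketch` and a non-mutating helper `absolutize`; neither is the DGL implementation.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def setdir_sketch(path):
    # Hypothetical stand-in for setdir (assumed behaviour, not the DGL source):
    # create the directory, make it the working directory, restore on exit.
    os.makedirs(path, exist_ok=True)
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def absolutize(paths):
    # Non-mutating variant of the two loops in the listing:
    # map {type: {key: path}} to the same shape with absolute paths.
    return {
        typ: {key: os.path.abspath(p) for key, p in per_type.items()}
        for typ, per_type in paths.items()
    }


def resolve_inside(paths, output_path):
    # Resolve first, then enter the output directory, mirroring the listing.
    early = absolutize(paths)
    with setdir_sketch(output_path):
        # Resolving here instead anchors relative paths under output_path.
        late = absolutize(paths)
    return early, late
```

Calling `resolve_inside({"paper": {"feat": "feat.npy"}}, "chunked")` returns two dictionaries whose `feat` entries differ: the first is anchored at the original working directory, the second under `chunked`. Unlike the listing, `absolutize` returns a new dictionary and leaves the caller's input untouched.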

Callers 1

create_chunked_dataset · Function · 0.70

Calls 4

setdir · Function · 0.90
_chunk_graph · Function · 0.70
items · Method · 0.45
keys · Method · 0.45

Tested by

no test coverage detected