Function edge_split

python/dgl/distributed/dist_graph.py:1964–2059

Split edges and return a subset for the local rank. This function splits the input edges based on the partition book and returns a subset of edges for the local rank. This method is used for dividing workloads for distributed training. The input edges can be stored as a vector of masks.

(
    edges,
    partition_book=None,
    etype="_E",
    rank=None,
    force_even=True,
    edge_trainer_ids=None,
)
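To make the idea of splitting by partition concrete, here is a minimal pure-Python sketch. It is not DGL's implementation: `split_local` and `part_ranges` are hypothetical stand-ins, and the sketch assumes each partition owns one contiguous range of edge IDs and runs a single trainer.

```python
def split_local(mask, part_ranges, rank):
    """Return the IDs of selected edges that fall in the range owned by `rank`.

    mask        : list of 0/1 flags, one per edge in the whole graph
    part_ranges : hypothetical list of half-open (start, end) ID ranges,
                  one per partition; a stand-in for the partition book
    """
    start, end = part_ranges[rank]
    return [eid for eid in range(start, end) if mask[eid]]


mask = [1, 0, 1, 1, 0, 1, 1, 0]   # edges 0, 2, 3, 5, 6 are selected
part_ranges = [(0, 4), (4, 8)]    # partition 0 owns IDs 0-3, partition 1 owns IDs 4-7

local_0 = split_local(mask, part_ranges, rank=0)
local_1 = split_local(mask, part_ranges, rank=1)
print(local_0, local_1)
```

Each rank receives only the selected edges stored in its own partition, so the subsets can differ in size.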


def edge_split(
    edges,
    partition_book=None,
    etype="_E",
    rank=None,
    force_even=True,
    edge_trainer_ids=None,
):
    """Split edges and return a subset for the local rank.

    This function splits the input edges based on the partition book and
    returns a subset of edges for the local rank. This method is used for
    dividing workloads for distributed training.

    The input edges can be stored as a vector of masks. The length of the vector is
    the same as the number of edges in a graph; 1 indicates that the edge in
    the corresponding location exists.

    There are two strategies to split the edges. If ``force_even`` is False, the
    edges are split to maximize data locality: all edges that belong to a process
    are returned. If ``force_even`` is True (the default), the edges are split
    evenly so that each process gets almost the same number of edges.

    When ``force_even`` is True, the data locality is still preserved if a graph is partitioned
    with Metis and the node/edge IDs are shuffled.
    In this case, the majority of the edges returned for a process are the ones that
    belong to the process. If node/edge IDs are not shuffled, data locality is not guaranteed.

    Parameters
    ----------
    edges : 1D tensor or DistTensor
        A boolean mask vector that indicates input edges.
    partition_book : GraphPartitionBook, optional
        The graph partition book. Required when ``edges`` is a regular tensor.
    etype : str or (str, str, str), optional
        The edge type of the input edges.
    rank : int, optional
        The rank of a process. If not given, the rank of the current process is used.
    force_even : bool, optional
        Force the edges to be split evenly.
    edge_trainer_ids : 1D tensor or DistTensor, optional
        If not None, split the edges to the trainers on the same machine according to
        trainer IDs assigned to each edge. Otherwise, split randomly.

    Returns
    -------
    1D-tensor
        The vector of edge IDs that belong to the rank.
    """
    # A plain tensor carries no partition information, so the caller must
    # supply the partition book explicitly.
    if not isinstance(edges, DistTensor):
        assert (
            partition_book is not None
        ), "Regular tensor requires a partition book."
    elif partition_book is None:
        # A DistTensor already knows its partition policy; reuse its book.
        partition_book = edges.part_policy.partition_book
    # The mask must have exactly one entry per edge of the given type.
    assert len(edges) == partition_book._num_edges(
        etype
    ), "The length of boolean mask vector should be the number of edges in the graph."
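The docstring's `force_even` and `edge_trainer_ids` options can be illustrated with a short pure-Python sketch. This is an illustration under stated assumptions, not a reproduction of DGL's internal helpers: `split_even` and `split_by_trainer` are hypothetical names, the even split hands out contiguous chunks of the selected IDs, and trainer IDs are given as one integer per edge.

```python
def split_even(mask, num_ranks, rank):
    """Give each rank a contiguous, near-equal slice of the selected edge IDs."""
    selected = [eid for eid, flag in enumerate(mask) if flag]
    base, extra = divmod(len(selected), num_ranks)
    # The first `extra` ranks receive one additional edge.
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return selected[start:end]


def split_by_trainer(eids, trainer_ids, trainer):
    """Keep the edges whose assigned trainer ID equals `trainer`."""
    return [eid for eid in eids if trainer_ids[eid] == trainer]


mask = [1, 1, 1, 1, 0, 1, 0, 0]          # edges 0, 1, 2, 3, 5 are selected
even_0 = split_even(mask, num_ranks=2, rank=0)
even_1 = split_even(mask, num_ranks=2, rank=1)

trainer_ids = [0, 1, 0, 1, 0, 1, 0, 1]   # hypothetical per-edge trainer assignment
by_trainer_0 = split_by_trainer(even_0, trainer_ids, trainer=0)
print(even_0, even_1, by_trainer_0)
```

If partition 0 owned IDs 0-3 and partition 1 owned IDs 4-7, only edge 3 would land on a rank that does not own it. This matches the docstring's remark that locality is largely preserved when IDs are contiguous within a partition.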

Callers 3

run_client_hierarchy · Function · 0.90

test_split · Function · 0.90

test_split_even · Function · 0.90

Calls 7

_split_even_to_part · Function · 0.85

_split_random_within_part · Function · 0.85

_split_by_trainer_id · Function · 0.85

_split_local · Function · 0.85

_num_edges · Method · 0.80

num_partitions · Method · 0.45

partid2eids · Method · 0.45

Tested by 3

run_client_hierarchy · Function · 0.72

test_split · Function · 0.72

test_split_even · Function · 0.72