Function node_split

python/dgl/distributed/dist_graph.py:1865–1961 · view source on GitHub ↗

Split nodes and return a subset for the local rank. This function splits the input nodes based on the partition book and returns a subset of nodes for the local rank. This method is used for dividing workloads for distributed training. The input nodes are stored as a vector of mask

(
    nodes,
    partition_book=None,
    ntype="_N",
    rank=None,
    force_even=True,
    node_trainer_ids=None,
)

Source from the content-addressed store, hash-verified

1863
1864
1865	def node_split(
1866	nodes,
1867	partition_book=None,
1868	ntype="_N",
1869	rank=None,
1870	force_even=True,
1871	node_trainer_ids=None,
1872	):
1873	"""Split nodes and return a subset for the local rank.
1874
1875	This function splits the input nodes based on the partition book and
1876	returns a subset of nodes for the local rank. This method is used for
1877	dividing workloads for distributed training.
1878
1879	The input nodes are stored as a vector of masks. The length of the vector is
1880	the same as the number of nodes in a graph; 1 indicates that the vertex in
1881	the corresponding location exists.
1882
1883	There are two strategies to split the nodes. By default, it splits the nodes
1884	in a way to maximize data locality. That is, all nodes that belong to a process
1885	are returned. If ``force_even`` is set to true, the nodes are split evenly so
1886	that each process gets almost the same number of nodes.
1887
1888	When ``force_even`` is True, the data locality is still preserved if a graph is partitioned
1889	with Metis and the node/edge IDs are shuffled.
1890	In this case, majority of the nodes returned for a process are the ones that
1891	belong to the process. If node/edge IDs are not shuffled, data locality is not guaranteed.
1892
1893	Parameters
1894	----------
1895	nodes : 1D tensor or DistTensor
1896	A boolean mask vector that indicates input nodes.
1897	partition_book : GraphPartitionBook, optional
1898	The graph partition book
1899	ntype : str, optional
1900	The node type of the input nodes.
1901	rank : int, optional
1902	The rank of a process. If not given, the rank of the current process is used.
1903	force_even : bool, optional
1904	Force the nodes are split evenly.
1905	node_trainer_ids : 1D tensor or DistTensor, optional
1906	If not None, split the nodes to the trainers on the same machine according to
1907	trainer IDs assigned to each node. Otherwise, split randomly.
1908
1909	Returns
1910	-------
1911	1D-tensor
1912	The vector of node IDs that belong to the rank.
1913	"""
1914	if not isinstance(nodes, DistTensor):
1915	assert (
1916	partition_book is not None
1917	), "Regular tensor requires a partition book."
1918	elif partition_book is None:
1919	partition_book = nodes.part_policy.partition_book
1920
1921	assert len(nodes) == partition_book._num_nodes(
1922	ntype

Callers 5

run_client_hierarchyFunction · 0.90

check_dist_graphFunction · 0.90

check_dist_graph_heteroFunction · 0.90

test_splitFunction · 0.90

test_split_evenFunction · 0.90

Calls 7

_split_even_to_partFunction · 0.85

_split_random_within_partFunction · 0.85

_split_by_trainer_idFunction · 0.85

_split_localFunction · 0.85

_num_nodesMethod · 0.80

num_partitionsMethod · 0.45

partid2nidsMethod · 0.45

Tested by 5

run_client_hierarchyFunction · 0.72

check_dist_graphFunction · 0.72

check_dist_graph_heteroFunction · 0.72

test_splitFunction · 0.72

test_split_evenFunction · 0.72