hub / github.com/dmlc/dgl / node_split

Function node_split

python/dgl/distributed/dist_graph.py:1865–1961

Split nodes and return a subset for the local rank. This function splits the input nodes based on the partition book and returns the subset assigned to the local rank. It is used to divide workloads for distributed training. The input nodes are given as a boolean mask vector with one entry per node in the graph.

(
    nodes,
    partition_book=None,
    ntype="_N",
    rank=None,
    force_even=True,
    node_trainer_ids=None,
)
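To make the mask-based input concrete, here is a minimal pure-Python sketch of the locality-preserving strategy (the behavior described for `force_even=False`). It is not DGL's implementation: `split_local` and `part_ranges` are hypothetical names, and the sketch assumes each partition owns one contiguous range of node IDs, standing in for the partition book.

```python
def split_local(mask, part_ranges, rank):
    """Return the IDs of masked nodes owned by partition `rank`.

    mask        -- list of 0/1 flags, one per node in the graph
    part_ranges -- hypothetical list of half-open (start, end) ID ranges,
                   one per partition; a stand-in for the partition book
    """
    # Mirror the documented precondition: the mask covers every node.
    if len(mask) != part_ranges[-1][1]:
        raise ValueError("mask length must equal the number of nodes")
    start, end = part_ranges[rank]
    # Keep only the selected nodes that live in this rank's partition.
    return [nid for nid in range(start, end) if mask[nid]]


# Ten nodes in two partitions; nodes 1, 2, 5, 7, 8 and 9 are selected.
mask = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
part_ranges = [(0, 4), (4, 10)]

local_rank0 = split_local(mask, part_ranges, rank=0)  # [1, 2]
local_rank1 = split_local(mask, part_ranges, rank=1)  # [5, 7, 8, 9]
```

Every returned node is local to its rank, but the workload is uneven: rank 0 receives two nodes and rank 1 receives four.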

Source from the content-addressed store, hash-verified

def node_split(
    nodes,
    partition_book=None,
    ntype="_N",
    rank=None,
    force_even=True,
    node_trainer_ids=None,
):
    """Split nodes and return a subset for the local rank.

    This function splits the input nodes based on the partition book and
    returns a subset of nodes for the local rank. This method is used for
    dividing workloads for distributed training.

    The input nodes are stored as a vector of masks. The length of the vector is
    the same as the number of nodes in a graph; 1 indicates that the vertex in
    the corresponding location exists.

    There are two strategies to split the nodes. If ``force_even`` is False, the
    nodes are split in a way that maximizes data locality. That is, all nodes that
    belong to a process are returned. If ``force_even`` is True (the default), the
    nodes are split evenly so that each process gets almost the same number of nodes.

    When ``force_even`` is True, the data locality is still preserved if a graph is
    partitioned with Metis and the node/edge IDs are shuffled.
    In this case, the majority of the nodes returned for a process are the ones that
    belong to the process. If node/edge IDs are not shuffled, data locality is not
    guaranteed.

    Parameters
    ----------
    nodes : 1D tensor or DistTensor
        A boolean mask vector that indicates input nodes.
    partition_book : GraphPartitionBook, optional
        The graph partition book.
    ntype : str, optional
        The node type of the input nodes.
    rank : int, optional
        The rank of a process. If not given, the rank of the current process is used.
    force_even : bool, optional
        Force the nodes to be split evenly.
    node_trainer_ids : 1D tensor or DistTensor, optional
        If not None, split the nodes to the trainers on the same machine according to
        trainer IDs assigned to each node. Otherwise, split randomly.

    Returns
    -------
    1D-tensor
        The vector of node IDs that belong to the rank.
    """
    if not isinstance(nodes, DistTensor):
        assert (
            partition_book is not None
        ), "Regular tensor requires a partition book."
    elif partition_book is None:
        partition_book = nodes.part_policy.partition_book

    assert len(nodes) == partition_book._num_nodes(
        ntype
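
The even-split strategy described in the docstring can be illustrated with a simplified pure-Python sketch. This is an assumption-laden illustration rather than the logic of DGL's `_split_even_to_part`: the helper name `split_even` is hypothetical, and it simply cuts the selected node IDs into near-equal contiguous chunks, one per rank.

```python
def split_even(mask, num_ranks, rank):
    """Return a near-equal contiguous share of the masked node IDs.

    Simplified illustration only; not the DGL implementation.
    """
    # Turn the 0/1 mask into the sorted list of selected node IDs.
    selected = [nid for nid, flag in enumerate(mask) if flag]
    base, extra = divmod(len(selected), num_ranks)
    # The first `extra` ranks each take one additional node.
    start = rank * base + min(rank, extra)
    size = base + (1 if rank < extra else 0)
    return selected[start:start + size]


# Ten nodes; nodes 1, 2, 5, 7, 8 and 9 are selected.
mask = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]

even_rank0 = split_even(mask, num_ranks=2, rank=0)  # [1, 2, 5]
even_rank1 = split_even(mask, num_ranks=2, rank=1)  # [7, 8, 9]
```

If partition 0 owns IDs 0 to 3 and partition 1 owns IDs 4 to 9, rank 0 receives one remote node (5) while the rest of each share stays local. This is the sense in which contiguous, shuffled IDs keep most of an even split local.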

Callers 5

run_client_hierarchy · Function · 0.90
check_dist_graph · Function · 0.90
check_dist_graph_hetero · Function · 0.90
test_split · Function · 0.90
test_split_even · Function · 0.90

Calls 7

_split_even_to_part · Function · 0.85
_split_by_trainer_id · Function · 0.85
_split_local · Function · 0.85
_num_nodes · Method · 0.80
num_partitions · Method · 0.45
partid2nids · Method · 0.45

Tested by 5

run_client_hierarchy · Function · 0.72
check_dist_graph · Function · 0.72
check_dist_graph_hetero · Function · 0.72
test_split · Function · 0.72
test_split_even · Function · 0.72