hub / github.com/dmlc/dgl / edge_label_informativeness

Function edge_label_informativeness

python/dgl/label_informativeness.py:22–110  ·  view source on GitHub ↗


(graph, y, eps=1e-8)


def edge_label_informativeness(graph, y, eps=1e-8):
    r"""Label informativeness (:math:`\mathrm{LI}`) is a characteristic of
    labeled graphs proposed in the `Characterizing Graph Datasets for Node
    Classification: Homophily-Heterophily Dichotomy and Beyond
    <https://arxiv.org/abs/2209.06177>`__

    Label informativeness shows how much information about a node's label we
    get from knowing its neighbor's label. Formally, assume that we sample an
    edge :math:`(\xi,\eta) \in E`. The class labels of nodes :math:`\xi` and
    :math:`\eta` are then random variables :math:`y_\xi` and :math:`y_\eta`.
    We want to measure the amount of knowledge the label :math:`y_\eta` gives
    for predicting :math:`y_\xi`. The entropy :math:`H(y_\xi)` measures the
    `hardness' of predicting the label of :math:`\xi` without knowing
    :math:`y_\eta`. Given :math:`y_\eta`, this value is reduced to the
    conditional entropy :math:`H(y_\xi|y_\eta)`. In other words, :math:`y_\eta`
    reveals :math:`I(y_\xi,y_\eta) = H(y_\xi) - H(y_\xi|y_\eta)` information
    about the label. To make the obtained quantity comparable across different
    datasets, label informativeness is defined as the normalized mutual
    information of :math:`y_{\xi}` and :math:`y_{\eta}`:

    .. math::
      \mathrm{LI} = \frac{I(y_\xi,y_\eta)}{H(y_\xi)}

    Depending on the distribution used for sampling an edge
    :math:`(\xi, \eta)`, several variants of label informativeness can be
    obtained. Two of them are particularly intuitive: in edge label
    informativeness (:math:`\mathrm{LI}_{edge}`), edges are sampled uniformly
    at random, and in node label informativeness (:math:`\mathrm{LI}_{node}`),
    first a node is sampled uniformly at random and then an edge incident to it
    is sampled uniformly at random. These two versions of label informativeness
    differ in how they weight high/low-degree nodes. In edge label
    informativeness, averaging is over the edges, thus high-degree nodes are
    given more weight. In node label informativeness, averaging is over the
    nodes, so all nodes are weighted equally.

    This function computes edge label informativeness.

    Parameters
    ----------
    graph : DGLGraph
        The graph.
    y : torch.Tensor
        The node labels, which is a tensor of shape (|V|).
    eps : float, optional
        A small constant for numerical stability. (default: 1e-8)

    Returns
    -------
    float
        The edge label informativeness value.

    Examples
    --------
    >>> import dgl
    >>> import torch

    >>> graph = dgl.graph(([0, 1, 2, 2, 3, 4], [1, 2, 0, 3, 4, 5]))
    >>> y = torch.tensor([0, 0, 0, 0, 1, 1])
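
The definition in the docstring can be checked by hand on the example graph. The sketch below is a pure-Python illustration, not the DGL implementation: `edge_label_informativeness_sketch` is a hypothetical helper, it omits the `eps` term, and it assumes each undirected edge is listed once with no self-loops. It relies on the identity I(y_ξ, y_η) = H(y_ξ) + H(y_η) − H(y_ξ, y_η); when every edge is counted in both directions the two marginals coincide, so LI = 2 − H(y_ξ, y_η) / H(y_ξ).

```python
import math
from collections import Counter


def edge_label_informativeness_sketch(src, dst, labels):
    """Edge label informativeness from an edge list, in pure Python.

    Hypothetical helper for illustration only; not the DGL implementation.
    """
    # Count label pairs over edges, taking each edge in both directions so
    # that the joint distribution of (y_xi, y_eta) is symmetric.
    pairs = Counter()
    for u, v in zip(src, dst):
        pairs[(labels[u], labels[v])] += 1
        pairs[(labels[v], labels[u])] += 1
    total = sum(pairs.values())

    # Marginal distribution of the label at one endpoint of a random edge.
    # Summing the joint counts over the second label makes this
    # degree-weighted: high-degree nodes contribute more.
    marginal = Counter()
    for (label_a, _), count in pairs.items():
        marginal[label_a] += count

    # Entropies in nats; zero-probability pairs never appear in the counters.
    h_joint = -sum((c / total) * math.log(c / total) for c in pairs.values())
    h_marginal = -sum((c / total) * math.log(c / total) for c in marginal.values())
    if h_marginal == 0:
        raise ValueError("LI is undefined when only one class has edges")

    # LI = I / H = (2 * H_marginal - H_joint) / H_marginal
    return 2.0 - h_joint / h_marginal


# The graph and labels from the docstring example above.
src = [0, 1, 2, 2, 3, 4]
dst = [1, 2, 0, 3, 4, 5]
labels = [0, 0, 0, 0, 1, 1]
li_edge = edge_label_informativeness_sketch(src, dst, labels)
print(round(li_edge, 4))  # 0.2518
```

Here the joint label distribution puts 8/12 of the mass on (0, 0), 2/12 on (1, 1), and 1/12 on each mixed pair. A neighbor's label therefore removes only about a quarter of the uncertainty about a node's label.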

Callers

nothing calls this directly

Calls 10

check_pytorch · Function · 0.70
to_bidirected · Function · 0.50
to · Method · 0.45
cpu · Method · 0.45
float · Method · 0.45
in_degrees · Method · 0.45
long · Method · 0.45
edges · Method · 0.45
num_edges · Method · 0.45
log · Method · 0.45

Tested by

no test coverage detected