hub / github.com/dmlc/dgl / edge_label_informativeness

Function edge_label_informativeness

python/dgl/label_informativeness.py:22–110

(graph, y, eps=1e-8)

def edge_label_informativeness(graph, y, eps=1e-8):
    r"""Label informativeness (:math:`\mathrm{LI}`) is a characteristic of
    labeled graphs proposed in `Characterizing Graph Datasets for Node
    Classification: Homophily-Heterophily Dichotomy and Beyond
    <https://arxiv.org/abs/2209.06177>`__.

    Label informativeness shows how much information about a node's label we
    get from knowing its neighbor's label. Formally, assume that we sample an
    edge :math:`(\xi,\eta) \in E`. The class labels of nodes :math:`\xi` and
    :math:`\eta` are then random variables :math:`y_\xi` and :math:`y_\eta`.
    We want to measure the amount of knowledge the label :math:`y_\eta` gives
    for predicting :math:`y_\xi`. The entropy :math:`H(y_\xi)` measures the
    'hardness' of predicting the label of :math:`\xi` without knowing
    :math:`y_\eta`. Given :math:`y_\eta`, this value is reduced to the
    conditional entropy :math:`H(y_\xi|y_\eta)`. In other words, :math:`y_\eta`
    reveals :math:`I(y_\xi,y_\eta) = H(y_\xi) - H(y_\xi|y_\eta)` information
    about the label. To make the obtained quantity comparable across different
    datasets, label informativeness is defined as the normalized mutual
    information of :math:`y_{\xi}` and :math:`y_{\eta}`:

    .. math::
      \mathrm{LI} = \frac{I(y_\xi,y_\eta)}{H(y_\xi)}

    Depending on the distribution used for sampling an edge
    :math:`(\xi, \eta)`, several variants of label informativeness can be
    obtained. Two of them are particularly intuitive: in edge label
    informativeness (:math:`\mathrm{LI}_{edge}`), edges are sampled uniformly
    at random, and in node label informativeness (:math:`\mathrm{LI}_{node}`),
    first a node is sampled uniformly at random and then an edge incident to it
    is sampled uniformly at random. These two versions of label informativeness
    differ in how they weight high/low-degree nodes. In edge label
    informativeness, averaging is over the edges, thus high-degree nodes are
    given more weight. In node label informativeness, averaging is over the
    nodes, so all nodes are weighted equally.

    This function computes edge label informativeness.

    Parameters
    ----------
    graph : DGLGraph
        The graph.
    y : torch.Tensor
        The node labels, a tensor of shape (|V|).
    eps : float, optional
        A small constant for numerical stability. (default: 1e-8)

    Returns
    -------
    float
        The edge label informativeness value.

    Examples
    --------
    >>> import dgl
    >>> import torch

    >>> graph = dgl.graph(([0, 1, 2, 2, 3, 4], [1, 2, 0, 3, 4, 5]))
    >>> y = torch.tensor([0, 0, 0, 0, 1, 1])
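
The definition in the docstring can be checked by hand on the example graph. The following is a minimal pure-Python sketch, not DGL's implementation. The helper names `entropy` and `edge_li_reference` are hypothetical. It assumes the graph is read as undirected, so each edge is counted in both directions, and it skips zero-probability terms instead of using `eps`.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy (natural log) of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def edge_li_reference(src, dst, labels):
    """Hypothetical reference for LI_edge = I(y_xi, y_eta) / H(y_xi).

    Assumes an undirected reading of the graph: every edge (u, v) is
    counted in both directions, so the joint label distribution is symmetric.
    """
    joint = Counter()
    for u, v in zip(src, dst):
        joint[(labels[u], labels[v])] += 1
        joint[(labels[v], labels[u])] += 1
    total = sum(joint.values())

    # Marginal distribution of the label at the first endpoint.
    marginal = Counter()
    for (a, _), count in joint.items():
        marginal[a] += count

    h_marginal = entropy(c / total for c in marginal.values())
    h_joint = entropy(c / total for c in joint.values())
    # I(X, Y) = H(X) + H(Y) - H(X, Y); here H(X) == H(Y) by symmetry.
    mutual_info = 2 * h_marginal - h_joint
    return mutual_info / h_marginal


# Example graph and labels taken from the docstring.
src, dst = [0, 1, 2, 2, 3, 4], [1, 2, 0, 3, 4, 5]
labels = [0, 0, 0, 0, 1, 1]
li = edge_li_reference(src, dst, labels)
print(round(li, 4))  # 0.2518
```

In this example the label marginal is (0.75, 0.25) and the joint over ordered label pairs is (8/12, 1/12, 1/12, 2/12), which gives a value of roughly 0.25. The sketch does not guard against a single-class labeling, where H(y_ξ) is zero and the ratio is undefined.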

Callers

nothing calls this directly

Calls 10

check_pytorch · Function · 0.70

to_bidirected · Function · 0.50

to · Method · 0.45

cpu · Method · 0.45

float · Method · 0.45

in_degrees · Method · 0.45

long · Method · 0.45

edges · Method · 0.45

num_edges · Method · 0.45

log · Method · 0.45

Tested by

no test coverage detected
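
Two boundary cases follow directly from the definition and could serve as a starting point for tests. The sketch below checks a hypothetical pure-Python reference, `li_edge`, not the DGL function itself. It assumes every edge is counted in both directions, in which case the joint is symmetric and LI can be written as 2 − H(y_ξ, y_η) / H(y_ξ).

```python
import math
from collections import Counter


def li_edge(edges, labels):
    """Hypothetical reference: LI_edge = 2 - H(joint) / H(marginal).

    Valid when every edge is counted in both directions (symmetric joint).
    """
    joint = Counter()
    for u, v in edges:
        joint[(labels[u], labels[v])] += 1
        joint[(labels[v], labels[u])] += 1
    total = sum(joint.values())

    marginal = Counter()
    for (a, _), count in joint.items():
        marginal[a] += count

    def entropy(counts):
        # Shannon entropy of the distribution given by counts / total.
        return -sum(c / total * math.log(c / total) for c in counts)

    return 2 - entropy(joint.values()) / entropy(marginal.values())


# Neighbor label fully determines the node label:
# every edge joins a class-0 node to a class-1 node.
determined = li_edge([(0, 1), (2, 3)], [0, 1, 0, 1])

# Neighbor label carries no information:
# all four ordered label pairs are equally likely.
independent = li_edge([(0, 2), (0, 3), (1, 2), (1, 3)], [0, 1, 0, 1])

assert math.isclose(determined, 1.0, abs_tol=1e-9)
assert math.isclose(independent, 0.0, abs_tol=1e-9)
```

The first case gives LI = 1 because knowing one endpoint's label removes all uncertainty about the other. The second gives LI = 0 because the joint label distribution factorizes into its marginals.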