hub / github.com/dmlc/dgl / adjusted_homophily

Function adjusted_homophily

python/dgl/homophily.py:197–269


(graph, y)

Source from the content-addressed store, hash-verified

def adjusted_homophily(graph, y):
    r"""Homophily measure recommended in `Characterizing Graph Datasets for
    Node Classification: Homophily-Heterophily Dichotomy and Beyond
    <https://arxiv.org/abs/2209.06177>`__

    Adjusted homophily is edge homophily adjusted for the expected number of
    edges connecting nodes with the same class label (taking into account the
    number of classes, their sizes, and the distribution of node degrees among
    them).

    Mathematically it is defined as follows:

    .. math::
        \frac{h_{edge} - \sum_{k=1}^C \bar{p}(k)^2}
        {1 - \sum_{k=1}^C \bar{p}(k)^2},

    where :math:`h_{edge}` denotes edge homophily, :math:`C` denotes the
    number of classes, and :math:`\bar{p}(\cdot)` is the empirical
    degree-weighted distribution of classes:
    :math:`\bar{p}(k) = \frac{\sum_{v\,:\,y_v = k} d(v)}{2|E|}`,
    where :math:`d(v)` is the degree of node :math:`v`.

    It has been shown that adjusted homophily satisfies more desirable
    properties than other homophily measures, which makes it appropriate for
    comparing the levels of homophily across datasets with different numbers
    of classes, different class sizes, and different degree distributions
    among classes.

    Adjusted homophily can be negative. If adjusted homophily is zero, then
    the edge pattern in the graph is independent of node class labels. If it
    is positive, then the nodes in the graph tend to connect to nodes of the
    same class more often, and if it is negative, then the nodes in the graph
    tend to connect to nodes of different classes more often (compared to the
    null model where edges are independent of node class labels).

    Parameters
    ----------
    graph : DGLGraph
        The graph.
    y : torch.Tensor
        The node labels, a tensor of shape (|V|).

    Returns
    -------
    float
        The adjusted homophily value.

    Examples
    --------
    >>> import dgl
    >>> import torch

    >>> graph = dgl.graph(([1, 2, 0, 4], [0, 1, 2, 3]))
    >>> y = torch.tensor([0, 0, 0, 0, 1])
    >>> dgl.adjusted_homophily(graph, y)
    -0.1428571492433548
    """
    check_pytorch()
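
The formula in the docstring can be checked by hand on the docstring's own example. Below is a minimal pure-Python sketch, not DGL's implementation: the helper name `adjusted_homophily_sketch` is hypothetical, and it assumes the graph is given as a list of undirected edges with each edge listed once. The callee list on this page includes `to_bidirected`, which suggests the library function symmetrizes the input graph, so the example's directed edges are read here as undirected.

```python
from collections import Counter


def adjusted_homophily_sketch(edges, labels):
    """Adjusted homophily of an undirected graph (illustrative helper, not DGL's code).

    edges  -- list of (u, v) pairs, each undirected edge listed once (assumption)
    labels -- sequence where labels[v] is the class of node v
    """
    num_edges = len(edges)

    # h_edge: fraction of edges whose endpoints share a class label.
    same_class = sum(1 for u, v in edges if labels[u] == labels[v])
    h_edge = same_class / num_edges

    # Degree mass per class: every edge adds one to the degree of each endpoint.
    degree_mass = Counter()
    for u, v in edges:
        degree_mass[labels[u]] += 1
        degree_mass[labels[v]] += 1

    # sum_k p_bar(k)^2 with p_bar(k) = (degree mass of class k) / (2|E|).
    expected = sum((mass / (2 * num_edges)) ** 2 for mass in degree_mass.values())

    # Undefined (division by zero) when every edge endpoint has the same class.
    return (h_edge - expected) / (1 - expected)


# The graph and labels from the docstring example, read as undirected edges.
edges = [(1, 0), (2, 1), (0, 2), (4, 3)]
labels = [0, 0, 0, 0, 1]
value = adjusted_homophily_sketch(edges, labels)
print(round(value, 6))  # -0.142857
```

Here h_edge = 3/4 and the degree-weighted class distribution is (7/8, 1/8), so the result is (0.75 - 0.78125) / (1 - 0.78125) = -1/7, which agrees with the docstring output up to floating-point rounding.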

Callers

nothing calls this directly

Calls 8

edge_homophily · Function · 0.85
check_pytorch · Function · 0.70
to_bidirected · Function · 0.50
to · Method · 0.45
cpu · Method · 0.45
float · Method · 0.45
in_degrees · Method · 0.45
num_edges · Method · 0.45

Tested by

no test coverage detected
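
With no tests detected, the sign behaviour described in the docstring can at least be sanity-checked on tiny graphs. The following is a sketch under assumptions, not part of DGL's test suite: `reference_adjusted_homophily` is a hypothetical pure-Python helper that takes undirected edges listed once, and the three toy graphs are made up for illustration.

```python
from collections import Counter


def reference_adjusted_homophily(edges, labels):
    """Hypothetical pure-Python reference, not DGL's implementation."""
    h_edge = sum(labels[u] == labels[v] for u, v in edges) / len(edges)
    # Count each edge endpoint once per class to get the degree mass per class.
    mass = Counter(labels[node] for edge in edges for node in edge)
    expected = sum((m / (2 * len(edges))) ** 2 for m in mass.values())
    return (h_edge - expected) / (1 - expected)


# Two disjoint triangles, one per class: every edge is intra-class.
homophilous = reference_adjusted_homophily(
    [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)], [0, 0, 0, 1, 1, 1]
)

# A 4-cycle with alternating labels: every edge is inter-class.
heterophilous = reference_adjusted_homophily(
    [(0, 1), (1, 2), (2, 3), (3, 0)], [0, 1, 0, 1]
)

# A 4-cycle where half of the edges are intra-class, as the null model expects.
neutral = reference_adjusted_homophily(
    [(0, 1), (2, 3), (0, 2), (1, 3)], [0, 0, 1, 1]
)

assert homophilous > 0 and heterophilous < 0 and neutral == 0
print(homophilous, heterophilous, neutral)  # 1.0 -1.0 0.0
```

A real test would compare `dgl.adjusted_homophily` against such a reference on the same graphs, allowing a small tolerance for floating-point differences.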