![](https://csdnimg.cn/release/download_crawler_static/2060185/bgc.jpg)
12 The structure and function of complex networks
where a path of length two refers to a directed path start-
ing from a specified vertex. This definition shows that C
is also the mean probability that the friend of your friend
is also your friend.
The definition of C given here has been widely used
in the sociology literature, where it is referred to as the
“fraction of transitive triples.”
11
In the mathematical
and physical literature it seems to have been first dis-
cussed by Barrat and Weigt [40].
An alternative definition of the clustering coefficient,
also widely used, has been given by Watts and Stro-
gatz [416], who proposed defining a local value
C
i
=
number of triangles connected to vertex i
number of triples centered on vertex i
. (5)
For vertices with degree 0 or 1, for which both numerator
and denominator are zero, we put C
i
= 0. Then the
clustering coefficient for the whole network is the average
C =
1
n
X
i
C
i
. (6)
This definition effectively reverses the order of the oper-
ations of taking the ratio of triangles to triples and of
averaging over vertices—one here calculates the mean of
the ratio, rather than the ratio of the means. It tends
to weight the contributions of low-degree vertices more
heavily, because such vertices have a small denominator
in Eq. (5) and hence can give quite different results from
Eq. (3). In Table II we give both measures for a number
of networks (denoted C
(1)
and C
(2)
in the table). Nor-
mally our first definition (3) is easier to calculate analyt-
ically, but (6) is easily calculated on a computer and has
found wide use in numerical studies and data analysis. It
is important when reading (or writing) literature in this
area to be clear about which definition of the clustering
coefficient is in use. The difference between the two is
illustrated in Fig. 5.
The local clustering C
i
above has been used quite
widely in its own right in the sociological literature,
where it is referred to as the “network density” [363].
Its dependence on the degree k
i
of the central ver-
tex i has been studied by Dorogovtsev et al. [113] and
Szab´o et al. [389]; both groups found that C
i
falls
off with k
i
approximately as k
−1
i
for certain models
of scale-free networks (Sec. III.C.1). Similar behavior
has also been observed empirically in real-world net-
works [349, 350, 397].
In general, regardless of which definition of the clus-
tering coefficient is used, the values tend to be consid-
erably higher than for a random graph with a similar
number of vertices and edges. Indeed, it is suspected
that for many types of networks the probability that the
11
For example, the standard network analysis program UCInet in-
cludes a function to calculate this quantity for any network.
friend of your friend is also your friend should tend to
a non-zero limit as the network becomes large, so that
C = O(1) as n → ∞.
12
On the random graph, by con-
trast, C = O(n
−1
) for large n (either definition of C)
and hence the real-world and random graph values can
be expected to differ by a factor of order n. This point
is discussed further in Sec. IV.A.
The clustering coefficient measures the density of tri-
angles in a network. An obvious generalization is to ask
about the density of longer loops also: loops of length
four and above. A number of authors have looked at such
higher order clustering coefficients [54, 79, 165, 172, 317],
although there is so far no clean theory, similar to a cu-
mulant expansion, that separates the independent contri-
butions of the various orders from one another. If more
than one edge is permitted between a pair of vertices,
then there is also a lower order clustering coefficient that
describes the density of loops of length two. This coeffi-
cient is particularly important in directed graphs where
the two edges in question can point in opposite directions.
The probability that two vertices in a directed network
point to each other is called the reciprocity and is often
measured in directed social networks [363, 409]. It has
been examined occasionally in other contexts too, such as
the World Wide Web [3, 137] and email networks [321].
C. Degree distributions
Recall that the degree of a vertex in a network is the
number of edges incident on (i.e., connected to) that
vertex. We define p
k
to be the fraction of vertices in
the network that have degree k. Equivalently, p
k
is the
probability that a vertex chosen uniformly at random
has degree k. A plot of p
k
for any given network can
be formed by making a histogram of the degrees of ver-
tices. This histogram is the degree distribution for the
network. In a random graph of the type studied by Erd˝os
and R´enyi [141–143], each edge is present or absent with
equal probability, and hence the degree distribution is,
as mentioned earlier, binomial, or Poisson in the limit of
large graph size. Real-world networks are mostly found
to be very unlike the random graph in their degree dis-
tributions. Far from having a Poisson distribution, the
degrees of the vertices in most networks are highly right-
skewed, meaning that their distribution has a long right
tail of values that are far above the mean.
Measuring this tail is somewhat tricky. Although in
theory one just has to construct a histogram of the de-
grees, in practice one rarely has enough measurements to
get good statistics in the tail, and direct histograms are
thus usually rather noisy (see the histograms in Refs. 74,
12
An exception is scale-free networks with C
i
∼ k
−1
i
, as described
above. For such networks Eq. (3) tends to zero as n → ∞,
although Eq. (6) is still finite.