J. Zhou et al. / Physica A 437 (2015) 263–277 265
ships among nodes than other three kinds of networks. Thus this paper attempts to propose a method to establish directed
weighted accident causation network (DWACN). To better explore the features of DWACN, the rest of this section will intro-
duce basic statistical properties of directed weighted complex network. These properties are used by us to analyze DWACN.
A directed weighted network G with N nodes can be represented mathematically by an N × N adjacency matrix A with
elements [20].
A
ij
=
a
ij
• w
ij
if node i points to node j,
0 otherwise
(1)
where a
ij
takes the value 1 if node i points to node j and 0 otherwise. And w
ij
specifies the weight on the edge if nodes i
points to node j (w
ij
= 0 otherwise).
(1) Degree and degree distribution.
The degree of k
i
of a node i is the number of edges incident with the node. In directed networks, the degree of the node has
two components: the number of outgoing links: k
out
i
=
j
a
ij
(referred to as the out-degree of the node), and the number of
incoming links: k
in
i
=
j
a
ji
(referred to as the in-degree of the node). The total degree is then defined as k
i
= k
out
i
+ k
in
i
[21].
The degree distribution completely determines the statistical properties of the network, and can be characterized by
P(k) [21]. Here, P(k) means the probability of a randomly selected node whose degree is k.
(2) Node strength and node strength distribution.
Degree has generally been extended to the sum of weights when analyzing weighted networks [22], and labeled node
strength. A very significative measure of the network properties in terms of the actual weights is obtained by looking at the
vertex strength as follows [23]: in a weighted network, the number of outgoing weight of node i : s
out
i
=
j
w
ij
, and the
number of incoming weight of node i : s
in
i
=
j
w
ji
. The total weight is then defined as s
i
= s
out
i
+ s
in
i
.
The node strength distribution is similarities with the degree distribution, here denoted as P(s). P(s) means the
probability of a randomly selected node whose degree is s.
(3) Shortest path length, average path length and diameter.
The shortest path length is the least number of edges that connects two nodes together [22]. In our network, the weights
of edges are very similar with ties strength [23]. The weights need to be reversed before directly applying Dijkstra’s algorithm
to identify the shortest paths in these networks. Newman’s [24] and Brandes’ [25] implementation of Dijkstra’s algorithm
can be formally defined as:
d
w
ij
= min
1
w
ih
+ · · · +
1
w
hj
. (2)
A measure of the typical separation between two nodes in the graph is given by the average shortest path length, defined
as the mean of geodesic length over all nodes [21]:
L =
1
N(N − 1)
i,j∈N, i=j
d
w
ij
(3)
where d
w
ij
is the length of the geodesic from node i to node j.
Diameter is defined as the longest of all the calculated shortest paths in a network [12].
(4) Weighted clustering coefficient.
In weighted networks, the weighted clustering coefficient is represented as c
w
i
, and the direction of edges should not be
considered. Thus for node i connecting to node j, the weight of edge is w
′
ij
(w
′
ij
= w
ij
+ w
ji
); and a
′
ij
= 1 (if a
ij
= 1 or a
ij
= 1).
In this way, the directed network turns into an undirected one. The weighted clustering coefficient is defined as [23]:
c
w
i
=
1
s
i
(k
i
− 1)
j,h
(w
′
ij
+ w
′
ih
)
2
a
′
ij
a
′
ih
a
′
jh
. (4)
In this way we are considering not just the number of closed triplets in the neighborhood of a vertex but also their total
relative weight with respect to the strength of the vertex.
(5) Betweenness centrality.
The communication of two non-adjacent nodes, say j and h, depends on the nodes belonging to the paths connecting j
and h. Consequently, a measure of the relevance of a given node can be obtained by counting the number of geodesics going
through it, and defining the so-called node betweenness centrality. The betweenness centrality b
i
of a node i is defined
as [21]:
b
i
=
j,h∈N,j=h
n
jh(i)
n
jh
(5)
where n
jh
is the number of shortest paths connecting node j and node h, while n
jh(i)
is the number of shortest paths connecting
node j and node h and passing through node i.