The problem of core decomposition is: given a graph G,
find the k-core of G for k =0, 1,...,k
max
,wherek
max
is the
maximum core number of any vertex in G. Equivalently, the
problem is to find the k-class of G for k =0, 1,...,k
max
.In
this paper, we propose an external-memory algorithm that finds
the k-classes of G when main memory is not large enough to
hold the whole graph G.Fromthek-classes, we can easily
obtain any k-core as the induced subgraph of G by the vertex
set V
k
=
k≤i≤k
max
Ψ
i
.
The following example illustrates the concept of core de-
composition.
Example 1: Figure 1 shows a graph G that contains 14
vertices, a,...,n. The 0-class, Ψ
0
,ofG is an empty set
since there is no isolated vertex in G. The 1-class Ψ
1
=
{a, b, c, j}, the 2-class Ψ
2
= {d, e, f, g, m}, and the 3-class
Ψ
3
= {h, i, k, l, n}. In this example, we have k
max
=3.The
0-core is the induced subgraph of G by vertex set
0≤i≤3
Ψ
i
.
Since Ψ
0
= ∅ in this example, the 0-core is the same as the
1-core, which is simply the entire graph. The 2-core is the
induced subgraph by (Ψ
2
∪ Ψ
3
) and the 3-core is the induced
subgraph by Ψ
3
. One can easily verify that in the k-core,
every vertex is connected to at least k other vertices, where
0 ≤ k ≤ 3. 2
Fig. 1. A Graph and Its k-classes
III. IN-MEMORY CORE DECOMPOSITION
In this section, we describe the existing in-memory al-
gorithm for core decomposition and explain why it is not
suitable to be extended to an external-memory algorithm. We
also discuss the advantage of the top-down approach over the
bottom-up approach for core decomposition in large graphs.
Algorithm 1 describes the skeleton of the existing in-
memory algorithm for core decom position [20]. It is a bottom-
up approach as the algorithm starts the computation of the k-
class from smaller values of k and moves to larger values of
k.
The algorithm first sorts the vertices in ascending order of
their degree ( among the vertices with the same d egree, the
ordering can be arbitrary). Then, it starts to compute the d-
class Ψ
d
,whered is the minimum vertex degree in G.It
removes from G all vertices with degree of d, togeth er with
all the edges incident to them, and puts these vertices into Ψ
d
.
After rem oving these vertices and their edges, the degree of
some vertices that are previously connected with the removed
vertices decreases. If any vertices remaining in G now have
Algorithm 1 BottomUp
Input: G =(V, E).
Output:Thek-class, Ψ
k
,ofG for all k.
1. order the vertices in G in ascending order of their degree;
2. while (G is not empty)
3. let d be the minimum vertex degree in G;
4. Ψ
d
←∅;
5. while (there exists a vertex v with degree of at most d)
6. Ψ
d
← Ψ
d
∪{v};
7. remove v and all edges incident to v from G;
8. re-order the remaining vertices in G
in ascending order of their degree;
9. output all Ψ
k
;
degree of d or less, they cannot be in a k-class where k>d,
and thus must be in Ψ
d
. Therefore, they are added to Ψ
d
and removed from G. This process continues until all vertices
remaining in G have degree greater than d. Then, the algorithm
moves on to the next iteration to compute the next k-class
where k>d. The algorithm terminates when all vertices, and
their edges, are removed from G.
The following example further explains the bottom-up com-
putation.
Example 2: Suppose that we compute the k-classes of the
graph in Figure 1 by Algorithm 1. Line 1 sorts the vertices
as follows: {a(1), c(1), j(1), f (2), m(2), b(3), d(3), g(3),
e(4), k(4), n(4), h(5), l(5), i(6)}, where the number in the
parentheses indicates the degree of the vertex in the current
G. Then, in the first iteration (Lines 3-8) that computes the 1-
class, the vertices a, c and j are first removed from G together
with their incident edges since their degree is 1. After that, the
vertices are re-ordered as {b(1), f (2), m(2), d(3), g(3), e(4),
k
(4), n(4), h(5), l(5), i(5)}.Thenvertexb is also removed
because after removing a and c, the degree of b becomes 1. At
this point, the ordered vertex list becomes {f (2), m(2), d(2),
g(3), e(4), k(4), n(4), h(5), l(5), i(5)}. The first iteration
completes and obtains Ψ
1
= {a, c, j, b}. In the second iteration
that computes the 2-class, the vertices removed are (in the
order): f , m, d, g,ande. In the third iteration that computes
the 3-class, the vertices removed are (in the order): n, k, h, l,
and i. 2
When main memory is sufficient to hold the entire input
graph, the most costly step of the in-memory algorithm is
sorting the vertices according to their degree at each iteration
(Line 8). Batagelj and Zaversnik [20] use bin-sort to order the
vertices to achieve O(m + n) time complexity.
When the graph is too large to be placed in main memory,
the bottom-up approach fails because it requires memory space
of Ω(m + n). The bottom-up approach is not suitable for
designing an efficient external-memory algorithm since it starts
core deco mposition over the en tire g raph to compute the k-
class with the smallest k. Although sorting can be done in
O(
n
B
log M
B
n
B
) I/Os (assuming that we store separately the
53