Polynomial-Time Algorithm for Finding Densest
Subgraphs in Uncertain Graphs
Zhaonian Zou
School of Computer Science and Technology
Harbin Institute of Technology, China
znzou@hit.edu.cn
ABSTRACT
This paper studies the problem of finding the densest sub-
graph in an uncertain graph. Due to uncertainty in graphs,
the traditional definitions of dense subgraphs are not ap-
plicable to uncertain graphs. In this paper, we introduce
the expected density of an uncertain graph. Based on the
expected density, we formalize the problem that, given an
uncertain graph G = (V, E, P ) and a set of vertices R ⊆ V ,
finds an induced subgraph G
′
= (V
′
, E
′
, P
′
) of G of the max-
imum expected density such that R ⊆ V
′
. We show that the
optimal solution can be found in O(nm log(n
2
/m)) time us-
ing max imum flow techniques, where n = |V | and m = |E|.
Moreover, unlike the existing models of uncertain graphs,
the model used in this p aper is very general, which doesn’t
assume the existence of edges is mutually independent.
Categories and Subject Descriptors
G.2.2 [Discrete Mathematics]: Graph Theory — Graph
algorithms; H.2.8 [Database Management]: Database Ap-
plications — Data mining
General Terms
Measurement, algorithms, performance
Keywords
Uncertain graph, marginal constraint, expected density, para-
metric maximum flow
1. INTRODUCTION
Dense subgraph discovery is a fundamental problem in
the research on graph databases. In literature, a number of
algorithms have been proposed for finding dense subgraph-
s in a given graph, where a variety of definitions of dense
subgraphs have been used, e.g., cliques [23], quasi-cliques
[1], k-cores [10], k-truss [24], and so on. In this paper, we
consider the density measure that assesses th e ratio of the
Permission to make digital or hard copies of all or part of this work for per-
sonal or classroom use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear
this notice and the full citation on the first page. Copyrights for components
of this work owned by others than the author(s) must be honored. Abstract-
ing with credit is permitted. To copy otherwise, or republish, to post on
servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from Permissions@acm.org.
MLG’13, August 11, 2013, Chicago, Illinois, USA.
Copyright is held by the owner/author(s). Publication rights licensed to
ACM. ACM 978-1-4503-2322-2 ...$15.00..
v
4
v
5
v
1
v
3
v
2
0.7
0.8
0.8
0.2
0.91.0
0.1
0.7
Figure 1: Uncertain graph G.
number of edges to the number of vertices [12]. More precise-
ly, given a graph G = (V, E), the density of G is defined by
ρ(G) = |E|/|V |. This definition of density of graph G equiv-
alently measures the average d egree of G because 2|E|/|V |
is equal to the average degree of G. Based on this density
measure, many studies have been carried out on the prob-
lem of finding a subgraph or an induced subgraph of the
maximum density in a given graph [4, 8, 9, 12].
Recently, uncertainty has been recognized to be intrin-
sic in large graph databases due to errors of measurements,
delayed updates of data, and data integration. Managing
and mining uncertain graph data have attracted a lot of re-
search attentions [14, 15, 16, 17, 18, 20, 21, 25, 26, 27]. In
our prior work [27], we define an uncertain graph by a triple
G = (V, E, P ), where each edge e ∈ E has a probability
of P (e) to exist in practice. Due to uncertainty, the tradi-
tional defi nition of density ρ(G) of a graph G doesn’t make
sense on an uncertain graph. Consider the uncertain graph
G = (V, E, P ) in Figure 1, where the real number on each
edge e is P (e). If we t hink of G as an exact graph, the densi-
ty of G is 8/5. However, since edges (v
2
, v
4
) and (v
3
, v
5
) exist
with very low probability, the density of G should actually
be much lower than 8/5, and be close to 6/5.
In t his paper, we first formalize the problem of finding the
densest subgraph in an uncertain graph. A ccording to our
uncertain graph model, an uncertain graph G = (V, E, P )
exists as an exact graph G
′
= (V, E
′
) in practice, where
each edge e ∈ E exists in E
′
with probability P (e). More
formally, we say that G implicates G. Let Ω(G) be the set
of exact graphs implicated by G. The uncertain graph G es-
sentially represents a probability mass function p over Ω(G),
where p(G) is equal to the probability of G implicating G
for all G ∈ Ω(G). For each exact graph G = (V, E) ∈ Ω(G),
the density of G is ρ(G) = |E|/|V |. Therefore, we evaluate
the density of G by the expected value of density of an exact
graph G chosen at random from Ω(G) according to proba-
bility m ass function p. Namely, this measure is called the