IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 28, NO. 11, NOV. 2006 4
domain knowledge for the problem. The main problems with
level set methods are difficulty of implementation (often re-
quiring specification of several free parameters) and difficulty
in fixing an incorrect solution, especially if the desired contour
does not correspond to a local energy minimum. Although the
early paper by Kass, Witkin and Terzopoulos [17] incorporated
user interaction, the active contours/level sets community ap-
pears to have trended away from this aspect. From a theoretical
standpoint, these methods are defined in the continuum and
achieve a local energy minimum, leading to difficulties in
trying to theoretically predict or understand the properties of
a practical solution.
The graph cuts [18], [19] technique has been developed
as a method for interactive, seeded, segmentation. As with
intelligent scissors, graph cuts views the image as a graph,
weighted to reflect intensity changes. A user marks some
nodes as foreground and others as background and the al-
gorithm performs a max-flow/min-cut analysis to find the
minimum-weight cut between the source and the sink. A
feature of this algorithm is that an arbitrary segmentation may
be obtained with enough user interaction and it generalizes
easily to 3D and beyond. However, although performing well
in many situations, there are a few concerns associated with
this technique. For example, since the algorithm returns the
smallest cut separating the seeds, the algorithm will often
return the cut that minimally separates the seeds from the
rest of the image, if a small number of seeds are used.
Therefore, a user may need to continue placing seeds in
order to overcome this “small cut” problem. Additionally,
the K-way graph cuts problem is NP-Hard, requiring use of
a heuristic to obtain a solution. Although one may find a
solution within a bound of the optimal multiway cut [20], the
problem becomes more difficult and one cannot be sure that
the optimal cut is achieved. Finally, multiple “smallest cuts”
may exist in the image that are quite different from each other.
Therefore, a small amount of noise (adjusting even a single
pixel) could cause the contour returned by the algorithm to
change drastically. Mathematically, we note that the present
algorithm may be considered as a relaxation of the binary
values of the potential function in graph cuts. Although this
may appear to constitute a minor modification of graph cuts,
in fact the motivation, theoretical properties, practical behavior
and method of solution are all quite different. The graph cuts
approach of [18] differs from the present work by including a
priors term on the intensity of the foreground and background
(with a consequent additional parameter). Although we will
not further discuss it here, such a modification to the random
walker algorithm may also be achieved [21].
The graph cuts segmentation algorithm has been extended
in two different directions in order to address issues of
speed, color images and the user interaction. The first type
of extension to the graph cuts algorithm has focused on speed
increases by coarsening the graph before applying the graph
cuts algorithm. This coarsening has been accomplished in two
manners: 1) By applying a standard multilevel approach and
solving subsequent, smaller graph cuts problems in a fixed
band to produce the final, full-resolution segmentation [22], 2)
By applying a watershed algorithm to the image and treating
each watershed basin as a “supernode” in a coarse graph
to which graph cuts in applied [23]. We note that the Lazy
Snapping approach of [23] additionally proposes interactive
tools for dividing watershed basins that may have incorrectly
merged the foreground and background regions. The primary
goal of these two approaches is to increase the computational
speed of graph cuts by intelligently reducing the number of
nodes in the graph. As stated in [22], the objective is to
produce the same segmentation result as regular graph cuts
by introducing a heuristic that greatly speeds the compu-
tation. Therefore, the benefits and difficulties of the graph
cuts algorithm listed above also apply to these approaches,
with an added uncertainty about the role of the coarsening
operator in the final result (i.e., the final segmentation is no
longer guaranteed to be the minimum cut). Additionally, both
approaches to increasing the computational speed of graph cuts
could equally be applied to the present algorithm with similar
computational gains.
The second direction of extension to the graph cuts algo-
rithm followed from the iterative estimation of a color model
with the graph cuts algorithm [24]. This iterative color model
was later coupled with an alteration of the user interface to
create the GrabCuts algorithm [25]. The GrabCuts approach
asks the user to draw a box around the object to be segmented
and employs the color model as priors (“t-links”) to obviate
the need for explicit specification of foreground seeds. The
added color model is of clear value in the application of
color image segmentation and the “box-interface” requires
less user interaction. Although the approach does perform
well in the domain of color image segmentation, the iterative
nature of the algorithm does increase the computational burden
of the algorithm (requiring a solution to the max-flow/min-
cut problem on each iteration) and there is no longer a
guarantee of optimality (the algorithm is terminated when
the iterations stagnate). For grayscale images, the GrabCuts
system essentially becomes standard graph cuts with a changed
user interface. However, it appears that the “box-interface” is
not always sufficient to capture the desired object, since further
editing of the results with standard graph cuts is often required.
As with the multilevel extensions described above, it would
be possible to merge the novel aspects of the GrabCuts system
(the iterative color image model and “box-interface”) with the
random walker algorithm described here. Since the graph cuts
algorithm of [18] forms the heart of the GrabCuts system, and
fulfils the same role as the present approach, we will focus on
the relative strengths and weaknesses of these two algorithms.
B. Graph-based methods of image segmentation
Early papers of Zahn [26] and Wu and Leahy [27] are
among the first approaches to apply graph theory to problems
in image analysis. However, recent interest largely appears
to have been spurred by Shi and Malik’s introduction of the
normalized cuts algorithm [14]. Most subsequent algorithms
have focused on the spectral properties of the graph (e.g.,
[28], [29]), although the isoperimetric algorithm [30] and the
Swendsen-Wang algorithm [31] are notable exceptions.