Enhanced Use of Mattes for
Easy Image Composition
Wencheng Wang, Member, IEEE, Panpan Xu, Xiaohui Bie, and Miao Hua
Abstract—Existing matting methods focus on improving matte
quality to produce high-quality composites. This generally
requires significant manual interaction, a tedious task for the
user. Despite these efforts, the composites may still exhibit evident
artifacts, especially for transparent and complicated objects, whose
pixels always contain a percentage of the background.
In this paper, we focus on the enhanced use
of mattes to produce satisfactory composites by suppressing
the discrepancies around objects of interest. This approach is
motivated by cloning methods but overcomes their shortcoming of
ineffective treatment of the over-included regions around objects
of interest. To this end, we present an enhanced matting function
that includes a term to smooth local contrasts for seamless
composition; meanwhile, we develop a novel algorithm to
generate mattes with reduced user interaction and improved
usability. As a result, we reduce the composite’s dependence
on the user’s input and only require the user to drag a box
to enclose the objects of interest. As shown in the user studies
and the experimental results, our method requires far less
user interaction than existing matting methods and
cloning methods. Our method is more effective in producing
good composites in a simple interactive manner, especially when
treating transparent and complicated objects, thereby providing
a superior approach for image composition.
Index Terms— Image generation, matting, cloning.
I. INTRODUCTION
MATTING methods are popular in image composition.
They involve pasting an object/region from a source
image, called a matte, into a target image and generating the
composite using the following matting function:
$$f_i = (1 - \alpha_i) \cdot t_i + \alpha_i \cdot g_i \qquad (1)$$
where $f_i$, $t_i$, and $g_i$ are the colors of the composited image, the target
image, and the source image, respectively, and $\alpha_i$ is the alpha value
at pixel $i$, representing the probability that the pixel belongs
to the object of interest in the source image.
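Concretely, Eq. (1) is a per-pixel linear blend of the two images weighted by the matte. A minimal NumPy sketch (the function and array names are illustrative, not from the paper):

```python
import numpy as np

def composite(target, source, alpha):
    """Per-pixel alpha blend of Eq. (1).

    target, source: float arrays of shape (H, W, 3) with values in [0, 1].
    alpha: float array of shape (H, W) in [0, 1]; alpha[i] is the
           probability that pixel i belongs to the object of interest.
    """
    a = alpha[..., None]  # broadcast the matte over the color channels
    return (1.0 - a) * target + a * source
```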
To achieve high-quality composition, existing methods apply
significant effort to produce suitable mattes with precise
segmentation of the objects of interest. However, segmentation is
not easy. Thus, existing methods require substantial user
interaction, such as drawing many strokes on the objects. This is a
tedious task, especially when treating complicated or transparent
objects. Furthermore, the resulting matte may still produce
artifacts in the composite owing to limitations of the matte in
representing the objects of interest [1]. All of these factors reduce
the efficiency of image composition and limit its applications.
A. Overview
In this paper, rather than focusing on the matte quality
directly, we enhance the use of mattes to produce satisfactory
composites. This is largely motivated by cloning
methods [2]–[7], which achieve superior composition by seamlessly
merging the objects of interest into the target image, even when
the objects are represented approximately with imprecise
segmentation. Their foundation is the psychological observation
that the human visual system is much more sensitive to local
contrast than to absolute luminance or gradual changes in
luminance [8], [9]. However, for cloning methods, it remains a
major challenge to treat the over-included regions (the regions
outside the objects that are nevertheless included in the source
image patch), especially the holes inside the objects. This is
because cloning methods work by smoothing discrepancies along the
boundary of the selected image patch, not along the borders of the
objects (see Section V). Thus, we enhance the use of
mattes by adding a melioration term to the matting function,
aiming to smooth the discrepancies along the borders of
the objects. As a result, the objects can be individually merged
into the target image seamlessly, avoiding the problem of
over-included regions. This also reduces the user interaction
required to generate mattes.
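To convey the intuition (the paper's precise melioration term is developed in later sections), one can picture Eq. (1) with the pasted source colors offset by a smooth field that cancels the target/source discrepancy at the object borders. The membrane-style scheme below is only an illustrative stand-in under that assumption, not the paper's formulation:

```python
import numpy as np

def meliorated_composite(target, source, alpha, iters=500, thresh=0.5):
    """Illustrative variant of Eq. (1) with a smoothing offset h:
        f_i = (1 - alpha_i) * t_i + alpha_i * (g_i + h_i)
    h equals the target/source discrepancy outside the object and is
    harmonically interpolated inside it, so the pasted colors meet the
    target seamlessly along the object borders (hypothetical scheme).
    """
    obj = alpha > thresh                      # rough object region
    h = np.where(obj[..., None], 0.0, target - source)
    for _ in range(iters):                    # Jacobi relaxation
        # wrap-around at the image edges is harmless here, since the
        # box boundary lies outside the object of interest
        avg = 0.25 * (np.roll(h, 1, 0) + np.roll(h, -1, 0) +
                      np.roll(h, 1, 1) + np.roll(h, -1, 1))
        h = np.where(obj[..., None], avg, h)  # update only inside object
    a = alpha[..., None]
    return (1.0 - a) * target + a * (source + h)
```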
In our current implementation, we only require the user
to drag a box to enclose the objects of interest. We then
apply an algorithm that reliably extracts the objects within
the box. In this way, when the objects are represented
without much loss, our enhanced matting function is able to
produce satisfactory composites. The algorithm entails three
steps: finding the objects using saliency detection, improving
the object extraction using dense conditional random field (CRF)
classification, and enhancing the object representation using edit
propagation; a high-level sketch is given below.
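A rough, runnable sketch of how these three stages could fit together follows; it uses spectral-residual saliency (OpenCV contrib), pydensecrf, and guided filtering as stand-ins for the paper's own saliency detector, dense CRF formulation, and edit propagation, so every component choice here is an assumption:

```python
import cv2                                   # needs opencv-contrib-python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def extract_matte(patch_bgr):
    """Hypothetical three-stage matte extraction for a user-dragged box.

    patch_bgr: uint8 array (H, W, 3); returns an alpha map in [0, 1].
    """
    h, w = patch_bgr.shape[:2]

    # Stage 1: coarse object likelihood via spectral-residual saliency
    # (a stand-in for the paper's saliency detection step).
    sal_model = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal = sal_model.computeSaliency(patch_bgr)
    sal = np.clip(sal.astype(np.float32), 1e-6, 1.0 - 1e-6)

    # Stage 2: a dense CRF sharpens the coarse map into a cleaner
    # foreground/background labeling.
    crf = dcrf.DenseCRF2D(w, h, 2)
    crf.setUnaryEnergy(unary_from_softmax(np.stack([1.0 - sal, sal])))
    crf.addPairwiseGaussian(sxy=3, compat=3)
    crf.addPairwiseBilateral(sxy=60, srgb=20,
                             rgbim=np.ascontiguousarray(patch_bgr), compat=5)
    q = np.array(crf.inference(5)).reshape(2, h, w)
    fg = (q[1] > q[0]).astype(np.float32)    # hard foreground labels

    # Stage 3: soften the hard labels into an alpha map; guided filtering
    # stands in for the paper's edit propagation (eps set for a uint8 guide).
    alpha = cv2.ximgproc.guidedFilter(patch_bgr, fg, 8, 100.0)
    return np.clip(alpha, 0.0, 1.0)
```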
This pipeline can approximate the objects of interest very well
because, when the user outlines a source image patch, the boundary
of the box is always outside the objects of interest and the
objects are generally