Extracting salient region for pornographic image detection
Chenggang Clarence Yan a, Yizhi Liu b,⇑, Hongtao Xie c, Zhuhua Liao b, Jian Yin d
a Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
b School of Computer Science and Engineering, Hunan University of Science and Technology, China
c Institute of Information Engineering, Chinese Academy of Sciences, National Engineering Laboratory for Information Security Technologies, Beijing, China
d Department of Computer, Shandong University, Weihai, China
article info
Article history:
Received 5 February 2014
Accepted 15 March 2014
Available online 3 April 2014
Keywords:
Salient region detection
Pornographic image detection
Visual attention analysis
Region-of-interest (ROI)
Skin-color model
Bag-of-visual-words (BoVW)
Codebook algorithm
Speeded up robust features (SURF)
abstract
Content-based pornographic image detection, in which the region-of-interest (ROI) plays an important role,
is an effective way to filter pornography. Traditionally, skin-color regions are extracted as the ROI. However,
skin-color regions are usually larger than the subareas containing pornographic parts, and the approach has
difficulty differentiating human skin from other skin-colored objects. In this paper, a novel approach to
extracting salient regions is presented for pornographic image detection. First, a novel saliency map model
is constructed. It is then integrated with a skin-color model and a face detection model to capture the ROI
in pornographic images. Next, an ROI-based codebook algorithm is proposed to enhance the representative
power of the visual words. Taking both speed and accuracy into account, we fuse speeded up robust features
(SURF) with color moments (CM). Experimental results show that our ROI extraction method achieves an
average precision of 91.33%, higher than that of the skin-color model alone. In addition, a comparison with
state-of-the-art pornographic image detection methods shows that our approach remarkably improves
performance.
© 2014 Elsevier Inc. All rights reserved.
1. Introduction
1.1. Background and motivation
With the rapid penetration of the Internet into every part of our
daily life, it is crucial to protect people, especially children, from
exposure to objectionable information. Content-based pornographic
image detection is one of the most powerful approaches to filtering
pornography. Existing methods can be classified into three kinds:
global-feature based methods, the bag-of-visual-words (BoVW)
based approach, and the region-of-interest (ROI) based approach.
Global features are good at representing the overall characteristics
of an image, but using them normally leads to inaccurate content
descriptions [1]. Recently, the BoVW based approach has proved
promising [2,3]. Nevertheless, visual words created from whole
images are not representative enough, because whole images
contain a great deal of background noise.
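As an illustrative sketch of the BoVW representation discussed above, the following Python snippet assigns each local descriptor to its nearest visual word and builds a normalized histogram, once over the whole image and once restricted to a rectangular region. The function name `bovw_histogram`, the rectangular `roi` parameter, and the toy two-word codebook are assumptions made for illustration only; this is not the codebook algorithm proposed in this paper.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def bovw_histogram(features, codebook, roi=None):
    """Return an L1-normalized visual-word histogram.

    features: list of ((x, y), descriptor) pairs, one per keypoint.
    codebook: list of visual words (vectors of the descriptor's length).
    roi:      hypothetical rectangle (x0, y0, x1, y1); when given, only
              keypoints inside it are counted.
    """
    counts = [0] * len(codebook)
    for (x, y), descriptor in features:
        if roi is not None:
            x0, y0, x1, y1 = roi
            if not (x0 <= x <= x1 and y0 <= y <= y1):
                continue  # skip keypoints that fall outside the region
        # Hard assignment: index of the closest visual word.
        nearest = min(range(len(codebook)),
                      key=lambda k: dist(descriptor, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Toy data (assumed): two visual words, four keypoints.
codebook = [(0.0, 0.0), (10.0, 10.0)]
features = [
    ((5, 5), (1.0, 0.0)),     # inside the region, close to word 0
    ((6, 6), (0.0, 1.0)),     # inside the region, close to word 0
    ((50, 50), (9.0, 10.0)),  # background, close to word 1
    ((60, 60), (10.0, 9.0)),  # background, close to word 1
]

whole = bovw_histogram(features, codebook)
roi_only = bovw_histogram(features, codebook, roi=(0, 0, 10, 10))
print(whole)     # [0.5, 0.5]
print(roi_only)  # [1.0, 0.0]
```

In this toy case, half of the whole-image histogram mass comes from background keypoints, whereas the region-restricted histogram reflects only the descriptors inside the region.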
Moreover, extensive experimentation over the last few years
has shown that the ROI based approach describes an image's
content more accurately than global features do [1,4]. In the field
of pornographic image detection, the ROI is usually extracted by
skin-color models. However, skin regions are usually larger than
the subareas containing pornographic parts, and skin-color models
have difficulty differentiating human skin from other
skin-colored objects.
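To illustrate why color alone cannot separate skin from skin-colored objects, the sketch below applies a simple explicit RGB threshold rule of a kind widely used for skin detection under daylight illumination. The rule is a stand-in chosen for brevity and is not the skin-color model used in this paper; the function names and the toy pixel values are hypothetical.

```python
def is_skin_rgb(r, g, b):
    """Explicit RGB threshold rule for skin (assumed stand-in, daylight case)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15
            and r > g and r > b)


def skin_mask(image):
    """image: rows of (r, g, b) tuples; returns a same-shaped boolean mask."""
    return [[is_skin_rgb(*pixel) for pixel in row] for row in image]


def skin_ratio(mask):
    """Fraction of pixels flagged as skin."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    return sum(flag for row in mask for flag in row) / total


# Toy 2x2 image (assumed values).
image = [
    [(200, 150, 120), (194, 154, 108)],  # skin tone, sand-colored object
    [(80, 120, 200), (30, 30, 30)],      # blue sky, dark background
]

mask = skin_mask(image)
print(mask)              # [[True, True], [False, False]]
print(skin_ratio(mask))  # 0.5
```

The sand-colored pixel passes the rule even though it is not skin, which is the kind of false positive that motivates combining color with other cues.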
Visual attention is a mechanism that filters out redundant
visual information and detects the most relevant parts of our visual
field [5]. Attention is a general concept covering all factors that
influence selection mechanisms, whether they are scene-driven
(bottom-up) or expectation-driven (top-down) [6]. Therefore, we are
motivated to integrate visual attention models with skin-color
models, and further to devise a hybrid method that combines
the advantages of the preceding three kinds of approach. The main
technical difficulties lie in three aspects:
(1) How to precisely capture the ROI of pornographic images?
Herein, ROI means the subareas containing pornographic parts.
Skin-color regions are usually larger than the regions
outlining erotic parts. Visual attention analysis
provides an alternative methodology for detecting the ROI. Once the
areas containing erotic parts are detected as salient, an
image will be considered pornographic beyond all doubt.
(2) How to integrate ROI detection with BoVW representation?
In earlier work, low-level features were extracted
from the ROI to detect pornographic images, such as ROIs’
http://dx.doi.org/10.1016/j.jvcir.2014.03.005
1047-3203/© 2014 Elsevier Inc. All rights reserved.
⇑ Corresponding author.
E-mail address: liuyizhi928@gmail.com (Y. Liu).
J. Vis. Commun. Image R. 25 (2014) 1130–1135