Convolutional Neural Networks based Pornographic Image Classification
KaiLong Zhou, Li Zhuo, Zhen Geng, Jing Zhang, Xiaoguang Li
Signal & Information Processing Laboratory
Beijing University of Technology
Beijing, China
zklkey@emails.bjut.edu.cn
Abstract—Because pornographic images are flooding the web, we
propose a pornographic image recognition method based on a
convolutional neural network (CNN). The method consists of two
stages: coarse detection and fine detection. Because the majority
of images are normal, coarse detection quickly identifies facial
images and normal images with few or no skin-color regions.
Images that contain larger skin-color regions are passed to fine
detection for further identification. We first train the CNN
using a strategy that combines pre-training with non-fixed
fine-tuning of the mid-level features, and then use the trained
model to classify whether an image is pornographic. In our
comparison with existing methods, the proposed method
outperforms the state of the art.
Keywords: image classification; convolutional neural networks;
pornographic image recognition
I. INTRODUCTION
With the rapid development of the internet, obscene and
pornographic images and videos spread more easily and cause
greater harm to social stability and teenagers’ mental health.
Automatically identifying pornographic images and videos has
therefore become an important research topic for cleaning up
the internet environment and promoting the healthy development
of the network.
For the recognition of pornographic images on the internet,
accuracy and speed are the two key requirements. Many methods
for recognizing online pornographic images have been proposed.
The mainstream methods are based on Content Based Image
Retrieval (CBIR) [1]. These methods no longer require human
intervention; instead, they describe the content of an image by
extracting visual features (such as color, texture, and outline)
and obtain a classification model by training on these
features. Content-based pornographic image recognition
can be subdivided into three categories.
The first category is rule-based: an image is judged to be
pornographic or not according to a rule or model. Because
pornographic images are complex and body poses are not fixed,
it is very difficult to obtain precise recognition results.
In [2], the author introduced a skin color model that filters
out non-skin-color areas; if the remaining skin-color area is
greater than a threshold, the image is judged to be pornographic.
Although this is the simplest method, it produces
many misclassifications.
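The rule-based idea can be illustrated with a minimal sketch. The skin color model of [2] is not specified here, so the code below substitutes a simple, commonly used RGB skin rule, and the function names and the 0.3 area threshold are assumptions chosen only for illustration.

```python
def is_skin(r, g, b):
    """Simple RGB skin rule (an assumed stand-in, not the model used in [2])."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_ratio(pixels):
    """Fraction of pixels judged to be skin; pixels is a list of (R, G, B) tuples."""
    if not pixels:
        return 0.0
    return sum(is_skin(r, g, b) for r, g, b in pixels) / len(pixels)


def flag_by_skin_area(pixels, threshold=0.3):
    """Flag the image when its skin-color area exceeds the (hypothetical) threshold."""
    return skin_ratio(pixels) > threshold


# A close-up portrait is mostly skin-colored, so the rule flags it as well.
portrait = [(200, 120, 90)] * 4 + [(30, 60, 200)]
sky = [(30, 60, 200)] * 5
portrait_flagged = flag_by_skin_area(portrait)
sky_flagged = flag_by_skin_area(sky)
```

The portrait example hints at why a pure area threshold misjudges many normal images: any picture dominated by skin tones passes the rule, whatever its content.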
The second category is based on image retrieval
technology [3, 4]. This method first constructs an image database
containing a large number of pornographic and normal
images. The image to be recognized is used as the query
image and compared with the images in the database; it
is assigned the category to which most of the
retrieved results belong. However, because pornographic images
vary so widely, it is difficult to build such a database.
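The retrieval-based decision amounts to a majority vote over the most similar database images. The sketch below assumes that each image has already been reduced to a feature vector and that similarity is measured by Euclidean distance; the two-dimensional toy vectors, the function name, and k = 3 are hypothetical and do not come from [3, 4].

```python
import math
from collections import Counter


def classify_by_retrieval(query, database, k=3):
    """Assign the query the majority label among its k nearest database entries."""
    nearest = sorted(database, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy database of (feature_vector, label) pairs; values are illustrative only.
database = [
    ((0.90, 0.80), "pornographic"),
    ((0.80, 0.90), "pornographic"),
    ((0.10, 0.20), "normal"),
    ((0.20, 0.10), "normal"),
    ((0.15, 0.15), "normal"),
]

label_a = classify_by_retrieval((0.85, 0.85), database)
label_b = classify_by_retrieval((0.10, 0.10), database)
```

The result depends entirely on how well the database covers the variety of real images, which is the difficulty noted above.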
The third category treats pornographic image
recognition as a binary classification problem (non-pornographic
or pornographic) [5-7]. It describes the content of an
image by extracting low-level visual features (such as
color, texture, and outline) and then applies a machine
learning method to obtain a classification model from the
resulting feature vectors. Finally, the trained model is used to
classify images. Although this approach has achieved better
results, choosing the features is difficult and requires
professional knowledge.
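As a rough illustration of this feature-plus-classifier pipeline, the sketch below uses a coarse RGB color histogram as the hand-crafted feature and a nearest-centroid rule as the learner. Both choices are assumptions made for brevity; the features and classifiers actually used in [5-7] are not described here, and the toy pixel data are invented.

```python
import math


def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram with `bins` levels per channel."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = len(pixels) or 1
    return [h / total for h in hist]


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns label -> mean vector."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Toy training set: one warm-toned and one cool-toned "image".
warm = [(200, 120, 90)] * 10
cool = [(30, 60, 200)] * 10
centroids = train_centroids([
    (color_histogram(warm), "pornographic"),
    (color_histogram(cool), "normal"),
])
prediction = predict(centroids, color_histogram([(205, 125, 95)] * 5))
```

Every design decision in `color_histogram` (color space, bin count, normalization) is made by hand, which is the feature-selection burden that learned features aim to remove.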
A CNN is a type of artificial neural network. Because CNNs
perform very well on some computer vision tasks, they have
become a research hotspot in speech and image recognition.
A CNN adopts a deep network structure in which every level
represents a feature, and each higher level is an abstraction
of the lower-level features. These hierarchical features can be
more effective, and they also avoid the difficulty of choosing
features by hand. However, a CNN has many parameters to be
learned, so a large number of images is required, and the
network structure, which is directly related to the abstraction
level and the feature dimension, needs to be designed.
This paper proposes a method to recognize pornographic images
based on a CNN. The method is divided into coarse detection and
fine detection. With the help of prior knowledge, coarse
detection quickly recognizes human-face images and normal images
with few or no skin-color regions; the remaining images are
identified through the fine detection process. In fine detection,
we train the CNN using a strategy that combines pre-training with
non-fixed fine-tuning of the mid-level features, and then use the
trained model to identify whether an image is pornographic.
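The coarse-to-fine control flow described above can be sketched as a two-stage cascade. This is only an outline under stated assumptions: `skin_ratio`, `is_face_image`, and `cnn_score` are hypothetical callables standing in for the skin detector, the face check, and the trained CNN, and both thresholds are illustrative values rather than settings from this paper.

```python
def cascade_classify(image, skin_ratio, is_face_image, cnn_score,
                     skin_threshold=0.1, cnn_threshold=0.5):
    """Return (label, stage) for one image using coarse then fine detection."""
    # Coarse stage: cheap checks that let most normal images exit early.
    if skin_ratio(image) < skin_threshold:
        return "normal", "coarse"
    if is_face_image(image):
        return "normal", "coarse"
    # Fine stage: only the remaining images reach the (expensive) CNN.
    label = "pornographic" if cnn_score(image) >= cnn_threshold else "normal"
    return label, "fine"


# Stub detectors operating on toy records, to show which images reach the CNN.
cnn_calls = []


def fake_cnn(image):
    cnn_calls.append(image["name"])
    return image["score"]


images = [
    {"name": "landscape", "skin": 0.02, "face": False, "score": 0.9},
    {"name": "portrait", "skin": 0.40, "face": True, "score": 0.9},
    {"name": "candidate", "skin": 0.55, "face": False, "score": 0.8},
]
results = {
    img["name"]: cascade_classify(
        img, lambda im: im["skin"], lambda im: im["face"], fake_cnn)
    for img in images
}
```

In this toy run only the third record is passed to the stub CNN, which reflects the intended speed benefit: the fine stage is evaluated only on images that the coarse stage cannot dismiss.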
The work in this paper is supported by the National Natural Science
Foundation of China (No. 61372149, No. 61370189, No. 61471013), the
Importation and Development of High-Caliber Talents Project of Beijing
Municipal Institutions (No. CIT&TCD20150311, No. CIT&TCD201304036,
No. CIT&TCD201404043), the Program for New Century Excellent
Talents in University (No. NCET-11-0892), the Specialized Research Fund for
the Doctoral Program of Higher Education (No. 20121103110017), the Natural
Science Foundation of Beijing (No. 4142009), the Science and Technology
Development Program of Beijing Education Committee
(No. KM201410005002), and the Funding Project for Academic Human Resources
Development in Institutions of Higher Learning Under the Jurisdiction of
Beijing Municipality.
2016 IEEE Second International Conference on Multimedia Big Data
978-1-5090-2179-6/16 $31.00 © 2016 IEEE
DOI 10.1109/BigMM.2016.29
206