BLIND PROPOSAL QUALITY ASSESSMENT VIA DEEP OBJECTNESS REPRESENTATION
AND LOCAL LINEAR REGRESSION
Qingbo Wu¹, Hongliang Li¹, Fanman Meng¹, King N. Ngan¹,², and Linfeng Xu¹
¹ University of Electronic Science and Technology of China
² The Chinese University of Hong Kong
{qbwu,hlli,fmmeng,knngan,lfxu}@uestc.edu.cn
ABSTRACT
The quality of object proposals plays an important role in
boosting the performance of many computer vision tasks,
such as object detection and recognition. Because manually
annotated bounding boxes are unavailable in practice, a
quality metric for blind assessment of object proposals is
highly desirable for singling out the optimal proposals. In
this paper, we propose a blind proposal quality assessment
algorithm based on the Deep Objectness Representation and
Local Linear Regression (DORLLR). Inspired by the hierarchical
model of the human visual system, a deep convolutional
neural network is developed to extract objectness-aware
image features. Then, local linear regression is used
to map the image feature to a quality score, so that each
individual test window is evaluated based on its k nearest
neighbors. Experimental results on a large-scale IoU-labeled
dataset verify that the proposed method significantly outper-
forms the state-of-the-art blind proposal evaluation metrics.
Index Terms— Blind proposal quality assessment, deep
objectness representation, local linear regression
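The regression step summarized in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance for the neighbor search, an affine model fitted by ridge-regularized least squares, and toy 2-D feature vectors in place of deep features. The function names, the value of k, and the ridge term are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_linear_predict(train_x, train_y, query, k=5, ridge=1e-6):
    """Predict a quality score for `query` from its k nearest training samples.

    Assumptions (not from the paper): Euclidean distance, an affine model,
    and a small ridge term that keeps the normal equations well conditioned.
    """
    # 1. Select the k training samples closest to the query feature.
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    # 2. Build the local design matrix with a trailing bias column.
    rows = [list(train_x[i]) + [1.0] for i in nearest]
    targets = [train_y[i] for i in nearest]
    d = len(rows[0])
    # 3. Solve the normal equations (X^T X + ridge * I) w = X^T y.
    xtx = [[sum(r[p] * r[q] for r in rows) + (ridge if p == q else 0.0)
            for q in range(d)] for p in range(d)]
    xty = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(d)]
    w = solve_linear_system(xtx, xty)
    # 4. Apply the locally fitted model to the query.
    return sum(wi * xi for wi, xi in zip(w, list(query) + [1.0]))


# Toy data: scores are an exact linear function of the 2-D features.
features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (5.0, 5.0)]
scores = [0.1 + 0.2 * a + 0.3 * b for a, b in features]
print(local_linear_predict(features, scores, (0.5, 0.5), k=4))
```

Because the toy scores are exactly linear in the features, the printed prediction is approximately 0.35. With real features, k and the regularization strength would need to be tuned on held-out data.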
1. INTRODUCTION
With the increasing demand for fast object detection
systems [1, 2, 3], proposal algorithms have become an active
research field in recent years. They aim to generate a
small number of candidate windows and thus avoid an exhaustive
search over massive numbers of sliding windows [4, 5]. The generated
proposals should cover the objects as tightly as possible,
and the popular Intersection-over-Union (IoU) index
[6, 7, 8] is typically used to quantitatively measure the quality
of each window. As a full-reference metric, the IoU performs
well in evaluating the proposal quality when manual
annotations are given. However, in many automatic recognition
systems, such interactive information is unavailable, which
creates an urgent demand for blind proposal quality assessment
(BPQA) that approximates the IoU index.
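For reference, the full-reference IoU index between a proposal and an annotated box can be sketched as follows. The corner-based box format (x_min, y_min, x_max, y_max) and the function name are assumptions made for illustration. The text itself does not fix a coordinate convention.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes.

    Boxes are assumed to be (x_min, y_min, x_max, y_max) tuples.
    """
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


proposal = (0, 0, 10, 10)
annotation = (5, 5, 15, 15)
print(iou(proposal, annotation))  # 25 / 175, about 0.143
```

A blind metric has no access to `annotation` at test time, so it must predict this value from the proposal window alone.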
This work was supported in part by National Natural Science Foundation
of China, under grant numbers 61601102, 61525102 and 61502084.
Existing proposal algorithms have made great efforts to
blindly estimate the proposal quality, and they can be roughly
classified into two categories. The first class of methods
models BPQA as a foreground/background segmentation
problem, where the foreground regions are considered to
possess higher proposal quality than the background. In [9],
Sande et al. utilized iterative superpixel merging to obtain
foreground regions, where the size and texture similarities are
two crucial cues that trigger the merging operation. A similar
idea is also employed in [10], where Manén et al. proposed a
random sampling based maximum spanning tree algorithm to
accelerate the merging process. To improve the segmentation
accuracy of these merging schemes based on low-level features,
Chang et al. [11] integrated two visual attention cues, i.e.,
saliency and objectness, into a graph model. The segmentation
was then formulated as an energy function minimization
problem, which was solved by alternating optimization.
It is worth noting that these segmentation based methods
can only roughly identify the proposal quality in a binary
mode (i.e., foreground/background), which considerably limits
their capability to quantitatively evaluate a proposal. To cope
with this issue, the second class of BPQA methods pays more
attention to developing various window ranking or scoring
functions based on specific image cues. In [12], Rahtu et
al. proposed three objectness related features, i.e., the
superpixel boundary, boundary edge and window symmetry, to
feed a cascaded ranking model [13]. A similar idea can
be found in [14], where a non-maximal suppression strategy was
applied to remove candidate windows that significantly
overlap with others. Endres et al. explored rich appearance
features in [15] and developed a diversity rewarding function
to rank the proposals. In addition to the ranking schemes,
many scoring models have also been discussed in the recent literature.
In [16], Zitnick et al. scored a bounding box by subtracting
the edge strength of the contours straddling the box's boundary
from the sum of edge strength within the box. In [4], Alexe et
al. computed the proposal quality by combining four objectness
measures, i.e., multi-scale saliency, color contrast, edge
density and superpixels straddling, within a Bayesian model.
Cheng et al. proposed a computationally efficient feature,
Proceedings of the IEEE International Conference on Multimedia and Expo (ICME) 2017 10-14 July 2017
978-1-5090-6067-2/17/$31.00 ©2017 IEEE ICME 2017