Learning with Augmented Class by Exploiting Unlabeled Data*
Qing Da, Yang Yu, Zhi-Hua Zhou
National Key Laboratory for Novel Software Technology
Nanjing University, Nanjing 210023, China
{daq, yuy, zhouzh}@lamda.nju.edu.cn
Abstract
In many real-world applications of learning, the envi-
ronment is open and changes gradually, which requires
the learning system to have the ability of detecting and
adapting to the changes. Class-incremental learning (C-IL)
is an important and practical problem in which data
from unseen augmented classes arrive, but it has not been
studied well in the past. In C-IL, the system should avoid
predicting instances from augmented classes as
a seen class, and thus faces the challenge that no such
instances were observed during the training stage. In this
paper, we tackle the challenge by using unlabeled data,
which can be cheaply collected in many real-world ap-
plications. We propose the LACU framework as well
as the LACU-SVM approach to learn the concept of
seen classes while incorporating the structure presented
in the unlabeled data, so that the misclassification risks
among the seen classes as well as between the aug-
mented and the seen classes are minimized simultane-
ously. Experiments on diverse datasets show the effec-
tiveness of the proposed approach.
Introduction
Traditional machine learning approaches face many challenges
in real-world applications, where open
and dynamic environments break the stationarity assumptions
implicit in those approaches. One branch of methods dealing
with changing environments is incremental learning,
which mainly includes example-incremental
learning (E-IL) (Ruping 2001; Polikar et al.
2001; Fern and Givan 2003), attribute-incremental learning
(A-IL) (Vapnik, Vashist, and Pavlovitch 2009), and class-incremental
learning (C-IL) (Fink et al. 2006; Muhlbaier,
Topalis, and Polikar 2009; Kuzborskij, Orabona, and Caputo
2013), as summarized in (Zhou and Chen 2002). Among them,
C-IL is an important problem which is often encountered
in practice. For example, in building an image classification
system for pictures on the Internet, the user may label only a
few classes, say dog, fish and bird. However, the system
has to predict images from a much wider range of classes in the future. When
an image of a tiger comes, a traditional classification algorithm
will assign it to one of the seen classes, such as dog, which could
make the system unusable.
Figure 1: An illustration that unlabeled data helps the learning
with augmented class problem.
* This research was supported by the 973 Program
(2014CB340501), NSFC (61333014, 61375061) and JiangsuSF
(BK2012303).
Copyright © 2014, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
This paper investigates one of the core problems in C-IL,
i.e., how to recognize instances from unseen augmented
classes. An augmented class is a class which is unknown
during the training stage, but appears in the test stage. Once
the system can tell the augmented classes from the seen
ones, later processing of the augmented classes can be handled.
Therefore, we would like the system to report, with high accuracy,
an extra option denoting that an instance is from an augmented
class.
Specifically, in the learning with augmented class (LAC)
problem, we are given a training dataset $D = \{(x_i, y_i)\}_{i=1}^{L}$,
where $x_i \in \mathbb{R}^d$ is a training instance and
$y_i \in Y = \{1, 2, \ldots, K\}$ is the associated class label. Unlike canonical
classification, during test we need to predict the class of
instances from an open dataset $D_o = \{(x_i, y_i)\}_{i=1}^{\infty}$, where
$y_i \in Y_o = \{1, 2, \ldots, K, K+1, \ldots, M\}$ with $M > K$. As
there are classes unobservable during the training time, the
goal of learning with augmented class is to learn a model
$f(x): X \to Y' = \{1, 2, \ldots, K, \mathrm{novel}\}$, where the option
novel indicates that x belongs to the augmented class, in
order to minimize the following expected risk
$$f^{*} = \operatorname*{argmin}_{f \in H} \; \mathbb{E}_{(x,y) \sim D_o} \, \mathrm{err}(y, f(x)), \tag{1}$$
where H is a hypothesis space and err is the LAC error
$$\mathrm{err}(y, f(x)) = \begin{cases} I(f(x) \neq y), & y \in Y \\ I(f(x) \neq \mathrm{novel}), & y \notin Y \end{cases} \tag{2}$$
Here I(expression) is an indicator function which equals 1
when the expression is true and 0 otherwise.
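The LAC error can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions, not the authors' implementation: the names `NOVEL`, `lac_error` and `empirical_lac_risk` are hypothetical, the option novel is represented by the string "novel", and the expectation in the risk is replaced by an average over a finite labeled test sample.

```python
from typing import Hashable, Sequence, Set

# Assumed sentinel for the extra "novel" option (a hypothetical choice).
NOVEL = "novel"


def lac_error(y: Hashable, pred: Hashable, seen: Set[Hashable]) -> int:
    """LAC error for one instance.

    If the true label y is a seen class, the prediction must equal y.
    Otherwise y is an augmented class and the prediction must be NOVEL.
    """
    if y in seen:
        return int(pred != y)
    return int(pred != NOVEL)


def empirical_lac_risk(
    ys: Sequence[Hashable], preds: Sequence[Hashable], seen: Set[Hashable]
) -> float:
    """Average LAC error over a finite sample (an estimate of the expected risk)."""
    if len(ys) != len(preds):
        raise ValueError("ys and preds must have the same length")
    if len(ys) == 0:
        raise ValueError("need at least one instance")
    return sum(lac_error(y, p, seen) for y, p in zip(ys, preds)) / len(ys)


# Toy usage: K = 3 seen classes; classes 4 and 5 are augmented.
seen_classes = {1, 2, 3}
true_labels = [1, 2, 3, 4, 5]
predictions = [1, 3, 3, NOVEL, 2]
risk = empirical_lac_risk(true_labels, predictions, seen_classes)
```

In the toy usage, the second instance is a mistake among seen classes and the fifth is an augmented-class instance predicted as a seen class, so two of the five predictions count as errors. Note that predicting novel for an instance of a seen class is also counted as an error by the first case.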
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence