Unsupervised dictionary learning with Fisher discriminant
for clustering
Mai Xu
a,
n
, Haoyu Dong
a
, Chen Chen
a
, Ling Li
b
a
Beihang Univeristy, Beijing 100191, China
b
University of Kent, UK
article info
Article history:
Received 10 September 2015
Received in revised form
8 December 2015
Accepted 28 January 2016
Communicated by Deng Cai
Available online 23 February 2016
Keywords:
Fisher discriminant
Dictionary learning
Sparse representation
Unsupervised learning
abstract
In this paper, we propose a novel Fisher discriminant unsupervised dictionary learning (FD-UDL)
approach, for improving the clustering performance of state-of-the-art dictionary learning approaches in
unsupervised scenarios. This is achieved by employing a novel Fisher discriminant criterion on dictionary
elements to encourage the diversity between different sub-dictionaries, and also the coherence within
each sub-dictionary. Such a discriminant is incorporated to formulate the optimization problem of
unsupervised dictionary learning. Furthermore, we provide an analytical solution to the proposed
optimization problem, obtaining the learned dictionary for clustering tasks. Unlike previous approaches
for unsupervised clustering, the proposed FD-UDL approach takes into account both within-class and
between-class scatters of sub-dictionaries, rather than only considering diversity between different sub-
dictionaries. Finally, experiments on synthetic data, face and handwritten digit clustering tasks show the
improved clustering accuracy over other state-of-the-art dictionary learning and clustering approaches.
& 2016 Elsevier B.V. All rights reserved.
1. Introduction
Dictionary learning [1] aims at utilizing machine learning
approaches to generalize the over-complete dictionary from a set of
training signals (e.g. image patches). The elements (also called
atoms) of such a learned over-complete dictionary can be seen as
basic texture patterns of generic signals. In the late 1990s, Olshausen
and Field [2] have shown that the sparse representation of texture
elements, from a learned dictionary, is very similar to simple-cell
receptive field in mammalian primary visual cortex. Thereby, an
image patch can be efficiently approximated using the linear com-
bination of a few elements (so called sparse representation) from an
over-complete dictionary, which is learned from training image
patches either offline [3] or online [4]. Together with sparse repre-
sentation, dictionary learning has been extensively applied in both
image processing and computer vision communities [1,5],suchasin
the tasks of super-resolution [6] and visual tracking [7].
1.1. Related work
At the beginning, reconstructive dictionaries, aiming at faith-
fully reconstructing signals, are learned for sparse representation
based image reconstruction tasks [3,8–11]. A representative
algorithm for effectively learning the reconstructive dictionary is
K-SVD [3]. Specifically, K-SVD iteratively alternates between sparse
representation and dictionary update steps. In the first step, K-SVD
utilizes the orthogonal matching pursuit (OMP) [12] to obtain the
sparse coefficients of all training signals. For the next step, each
dictionary element is updated sequentially by using singular value
decomposition (SVD) to minimize the error of reconstructing
relevant training signals. Such an update is similar to k-means
algorithm [13], hence named K-SVD.
Most recently, discriminative dictionary learning [14–19] has
emerged as a promising way to deal with the tasks of classification
and clustering in computer vision area. The approaches on dis-
criminative dictionary learning can be categorized into two clas-
ses: supervised and unsupervised forms. For supervised classifi-
cation, the discriminants need to be developed to enhance the
discriminative ability of the learned dictionary. For example,
Mairal et al. [15] proposed a softmax discriminative cost term for
multiple sub-dictionaries, each of which is relevant to one-class
training signals. In this way, the classification of an image patch
can be achieved by seeking the minimum reconstruction error of
these sub-dictionaries over the test image patch. Besides, Yang
et al. [16] proposed a Fisher discriminant on sparse coefficients to
learn a structured dictionary in supervised scenario, so that the
sparse coefficients have small within-class scatters but large
between-class scatters. Then, the test patches can be classified
using sparse representation with discriminative information in
both minimum reconstruction error and sparse coef
ficients. Gao
Contents lists available at ScienceDirect
journal homepage: www.elsevier.com/locate/neucom
Neurocomputing
http://dx.doi.org/10.1016/j.neucom.2016.01.076
0925-2312/& 2016 Elsevier B.V. All rights reserved.
n
Corresponding author.
E-mail address: MaiXu@buaa.edu.cn (M. Xu).
Neurocomputing 194 (2016) 65–73