Unsupervised Dimension Reduction Using Supervised Orthogonal Discriminant
Projection for Clustering
Leilei Yan
School of Computer Science and Technology
Soochow University
Suzhou, China
20184227032@stu.suda.edu.cn
Li Zhang
School of Computer Science and Technology
Soochow University
Suzhou, China
zhangliml@suda.edu.cn
Abstract—This paper proposes a novel unsupervised dimensionality reduction method for clustering, called SODP-KSCE, which combines supervised orthogonal discriminant projection (SODP) with a K-means selective clustering ensemble. The algorithm operates in an iterative manner, adaptively optimizing the clustering results and learning a subspace with optimal separability. To enhance the stability of K-means, SODP-KSCE adopts ensemble learning. Moreover, a negentropy increment (NI) index is introduced to measure clustering performance. The K-means clustering ensemble algorithm is performed in the low-dimensional subspace to generate pseudo class labels for the unlabeled data, which are then used to guide the dimension reduction of SODP in the original space. Experimental results on multiple data sets demonstrate the effectiveness of SODP-KSCE.
Keywords-dimensionality reduction; unsupervised learning; clustering; supervised orthogonal discriminant projection; ensemble learning
I. INTRODUCTION
In many practical application domains, such as visual category recognition, gene expression array analysis, and image processing, large amounts of high-dimensional data are produced. In gene expression array analysis in particular, the dimensionality of the data reaches thousands or even tens of thousands. Owing to the curse of dimensionality [1], developing an effective clustering method for high-dimensional data is a very challenging problem.
For unlabeled high-dimensional data, the most common approach is to first apply an unsupervised dimensionality reduction technique and then cluster the data in the resulting low-dimensional subspace. Typical unsupervised dimensionality reduction methods include principal component analysis (PCA) [2], [3], isometric feature mapping (ISOMAP) [4], locally linear embedding (LLE) [5], Laplacian eigenmap (LE) [6], locality preserving projection (LPP) [7], [8], unsupervised discriminant projection (UDP) [9], and others. However, the choice of subspace has a great influence on the clustering results, so it is of great importance to find the subspace with the best separability, in which data points belonging to the same cluster are close to each other and points belonging to different clusters are far apart. A minimal sketch of this two-stage pipeline is given below.
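The following sketch (not from this paper) illustrates the conventional reduce-then-cluster pipeline with PCA followed by K-means; the synthetic data and all parameter settings are assumptions made purely for illustration.

```python
# A minimal sketch of the two-stage pipeline: unsupervised dimensionality
# reduction (here PCA) followed by clustering (K-means) in the learned
# subspace. The data and parameters are placeholders, not from the paper.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1000))           # placeholder high-dimensional data

Z = PCA(n_components=10).fit_transform(X)  # project to a low-dimensional subspace
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(Z)
```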
An effective approach for finding the most discriminative subspace is to couple dimension reduction directly with clustering [10]–[12]. Iterative feature and data clustering (IFD) clusters data and features simultaneously by mutually reinforcing the relationships between their coefficients [10]. Ding et al. proposed adaptive dimension reduction for expectation maximization (ADR-EM) and for K-means (ADR-Km), which cluster in a low-dimensional subspace [11]. Ding and Li integrated linear discriminant analysis (LDA) and K-means clustering into a joint framework, yielding the LDA-Km method [12]. LDA-Km first uses K-means to generate pseudo class labels and then applies LDA to perform dimension reduction, alternating between the two steps until convergence. In doing so, the separability of the clusters is maximized in the low-dimensional subspace; a sketch of this alternation is given below.
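The following sketch illustrates the LDA-Km alternation using scikit-learn; the PCA initialization, the stopping test on label stability, and the iteration cap are simplifying assumptions for illustration rather than the exact procedure of [12].

```python
# A hedged sketch of the LDA-Km loop: K-means supplies pseudo labels,
# which supervise LDA in the original space; the learned projection is
# then re-clustered. Assumed details: PCA initialization, an iteration
# cap, and stopping when the pseudo labels no longer change.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def lda_km(X, k, dim, max_iter=20, seed=0):
    Z = PCA(n_components=dim).fit_transform(X)            # initial subspace
    labels = KMeans(n_clusters=k, random_state=seed).fit_predict(Z)
    for _ in range(max_iter):
        # Pseudo class labels supervise LDA in the original space.
        lda = LinearDiscriminantAnalysis(n_components=min(dim, k - 1))
        Z = lda.fit(X, labels).transform(X)
        new_labels = KMeans(n_clusters=k, random_state=seed).fit_predict(Z)
        if np.array_equal(new_labels, labels):            # labels stable: stop
            break
        labels = new_labels
    return labels, Z
```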
However, LDA-Km has two drawbacks. First, K-means is sensitive to its initial cluster centers, so the final clustering result depends heavily on how those centers are chosen and is therefore unstable; LDA-Km inherits this drawback. Second, LDA-Km has a convergence issue: although the whole algorithm forms a closed loop, there is no direct relationship between the projection matrices of the current and the previous iterations, so without a limit on the number of iterations the algorithm may loop indefinitely. The initialization sensitivity can be seen in the small experiment below.
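The following small experiment (an assumed setup, not from the paper) illustrates the first drawback: running K-means with a single initialization under different random seeds may partition the same data differently, which an adjusted Rand index below 1 between two runs makes visible.

```python
# Illustration of K-means' sensitivity to its initial centers: different
# seeds with a single initialization each may yield different partitions.
# The overlapping Gaussian blobs below are synthetic, assumed data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, size=(100, 50)) for c in (0.0, 0.3, 0.6)])

runs = [KMeans(n_clusters=3, n_init=1, random_state=s).fit_predict(X)
        for s in range(5)]
for a, b in zip(runs, runs[1:]):
    # ARI equals 1.0 only when two runs produce identical partitions.
    print(adjusted_rand_score(a, b))
```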
The goal of this paper is to improve the performance of LDA-Km and thereby obtain a novel unsupervised dimensionality reduction method for clustering. Three points about LDA-Km are worth noting. First, LDA could be replaced by another supervised dimensionality reduction method, since LDA requires the data to follow a Gaussian distribution, an assumption that real data do not always satisfy. Thus, similar methods could be considered, such as marginal Fisher analysis (MFA) [13] and supervised orthogonal discriminant projection (SODP) [14]–[16]. SODP has three versions: orthogonal discriminant projection [14], supervised orthogonal discriminant projection [15], and supervised