Multi-View Clustering via Joint Nonnegative Matrix Factorization
Jialu Liu¹, Chi Wang¹, Jing Gao², and Jiawei Han¹
¹University of Illinois at Urbana-Champaign
²University at Buffalo
Abstract
Many real-world datasets consist of different rep-
resentations or views which often provide information
complementary to each other. To integrate information
from multiple views in the unsupervised setting, multi-
view clustering algorithms have been developed to clus-
ter multiple views simultaneously to derive a solution
which uncovers the common latent structure shared by
multiple views. In this paper, we propose a novel NMF-
based multi-view clustering algorithm by searching for a
factorization that gives compatible clustering solutions
across multiple views. The key idea is to formulate a
joint matrix factorization process with the constraint
that pushes the clustering solution of each view towards
a common consensus instead of fixing it directly. The
main challenge is how to keep clustering solutions across
different views meaningful and comparable. To tackle
this challenge, we design a novel and effective normaliza-
tion strategy inspired by the connection between NMF
and PLSA. Experimental results on synthetic and sev-
eral real datasets demonstrate the effectiveness of our
approach.
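As a rough sketch of the idea described above (the notation here is introduced only for illustration; the precise objective, constraints, and normalization strategy are developed in the body of the paper), the joint factorization can be viewed as minimizing, over nonnegative factors, an objective of the form

\[
\min_{U^{(v)},\, V^{(v)},\, V^{*} \ge 0} \;\; \sum_{v=1}^{n_v} \left\| X^{(v)} - U^{(v)} \big(V^{(v)}\big)^{\mathsf{T}} \right\|_F^2 \;+\; \sum_{v=1}^{n_v} \lambda_v \left\| V^{(v)} - V^{*} \right\|_F^2,
\]

where X^{(v)} denotes the data matrix of the v-th view, U^{(v)} and V^{(v)} its basis and coefficient (clustering) matrices, V^{*} the consensus clustering matrix shared by all views, and \lambda_v a parameter trading off reconstruction quality against agreement with the consensus.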
1 Introduction
Many real-world datasets naturally consist of
different representations or views [5]. For example, the
same story can be told in articles from different news
sources, one document may be translated into multiple
different languages, research communities are formed
based on research topics as well as co-authorship links,
web pages can be classified based on both content and
the anchor text of hyperlinks pointing to them, and so on. In these
applications, each dataset is represented by attributes
that can naturally be split into different subsets, any
of which suffices for mining knowledge. Observing that
these multiple representations often provide compatible
and complementary information, it becomes natural
for one to integrate them to obtain better
performance rather than relying on a single view. The
key to learning from multiple views (multi-view learning) is to
leverage the knowledge contained in each individual view
so as to outperform simply concatenating the views.
As unlabeled data are plentiful in real life and in-
creasing quantities of them come in multiple views from
diverse sources, the problem of unsupervised learning
from multiple views, referred to as multi-view clustering,
has attracted attention [3, 17].
The goal of multi-view clustering is to partition objects
into clusters based on multiple representations of the
objects. Existing multi-view clustering algorithms can
be roughly classified into three categories. Algorithms
in the first category [3, 17] incorporate multi-view inte-
gration into the clustering process directly through op-
timizing certain loss functions. In contrast, algorithms
in the second category such as the ones based on Canon-
ical Correlation Analysis [8, 4] first project multi-view
data into a common lower-dimensional subspace and
then apply any clustering algorithm such as k-means to
learn the partition. The third category is called late in-
tegration or late fusion, in which a clustering solution
is derived from each individual view and then all the
solutions are fused based on consensus [7, 13].
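To make the second category concrete, below is a minimal sketch (not the method proposed in this paper) of projecting two views into a shared subspace with CCA and then clustering with k-means; the function name, the random stand-in data, and the choice to concatenate the per-view projections before clustering are illustrative assumptions.

# Category 2 sketch: CCA projection to a shared subspace, then k-means.
# This illustrates the general recipe only, not this paper's algorithm.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.cluster import KMeans

def cca_then_kmeans(X1, X2, n_components=5, n_clusters=3):
    # Project both views into a common low-dimensional subspace.
    cca = CCA(n_components=n_components)
    Z1, Z2 = cca.fit_transform(X1, X2)
    # One simple way to combine the views: concatenate their projections.
    Z = np.hstack([Z1, Z2])
    # Any clustering algorithm can then be applied in the shared subspace.
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(Z)

# Toy usage with random matrices standing in for two views of 200 objects.
rng = np.random.default_rng(0)
labels = cca_then_kmeans(rng.random((200, 40)), rng.random((200, 60)))
print(labels[:10])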
In this paper, we propose a new multi-view cluster-
ing approach based on a highly effective technique in
single-view clustering, i.e., non-negative matrix factor-
ization (NMF) [18]. NMF, which was originally intro-
duced as a dimensionality reduction technique [18], has
been shown to be useful in many research areas such
as information retrieval [20] and pattern recognition
[18]. NMF has received much attention because of its
straightforward interpretability for applications, i.e., we
can explain each observation as an additive linear com-
bination of nonnegative basis vectors. Recently, NMF
has become a popular technique for data clustering, and
it is reported to achieve performance competitive
with state-of-the-art unsupervised al-
gorithms. For example, Xu et al. [20] applied NMF to
text clustering and gained superior performance, and
Brunet et al. [6] achieved similar success on biological
data clustering. Recent studies [9, 11] show that NMF
is closely related to Probabilistic Latent Semantic Anal-