Multi-Task Multi-View Clustering for Non-Negative Data
Xianchao Zhang and Xiaotong Zhang and Han Liu
School of Software
Dalian University of Technology
Dalian 116620, China
xczhang@dlut.edu.cn, zxt.dut@hotmail.com, liu.han.dut@gmail.com
Abstract
Multi-task clustering and multi-view clustering
have each found wide application and received
much attention in recent years. Nevertheless,
there are many clustering problems that involve
both multi-task clustering and multi-view
clustering, i.e., the tasks are closely related and
each task can be analyzed from multiple views. In
this paper, for non-negative data (e.g., documents),
we introduce a multi-task multi-view clustering
(MTMVC) framework which integrates within-
view-task clustering, multi-view relationship learn-
ing and multi-task relationship learning. We then
propose a specific algorithm to optimize the MT-
MVC framework. Experimental results show the
superiority of the proposed algorithm over either
multi-task clustering algorithms or multi-view clus-
tering algorithms for multi-task clustering of multi-
view data.
1 Introduction
Multi-task clustering improves individual clustering perfor-
mance by learning the relationship among related tasks.
Multi-view clustering makes use of the consistency among
different views to achieve better performance. Both multi-task
clustering and multi-view clustering have each found wide
application and received much attention in recent years.
Nevertheless, there are many practical problems
that involve both multi-task clustering and multi-view clus-
tering, i.e., the tasks are closely related and each task can
be analyzed from multiple views. For example, clustering the
web pages of four universities gives four related tasks. All four
tasks have word features in the main text, and they also have
other features, such as the words in the hyperlinks pointing to
the web pages and the words in the page titles. As another
example, clustering web images collected from Chinese web sites
and from English web sites gives two related tasks. Both tasks
have visual features in the images, and they also have word
features in the surrounding text, in Chinese and English
respectively. To tackle the clustering problem on such data sets,
existing algorithms can use only limited information: multi-view
clustering algorithms use only the information of the views in a
single task, while multi-task clustering algorithms exploit only
the mutual information shared by all the related tasks from a
single view. However, better performance can be expected if both
the multi-task and the multi-view information are utilized.
Recently, multi-task multi-view learning algorithms, which
learn multiple related tasks with multi-view data, have been
proposed. The graph-based framework in [He and Lawrence, 2011]
takes full advantage of both feature heterogeneity and task
heterogeneity. Within each task, the consistency
among different views is obtained by requiring them to pro-
duce the same classification function, and across different
tasks, the relationship is established by utilizing the similar-
ity constraint on the common views. The inductive learning
framework in [Zhang and Huan, 2012] uses co-regularization
and task relationship learning, which increases the practi-
cality of multi-task multi-view learning. These methods
have demonstrated their superiority over either multi-task
or multi-view learning algorithms. However, they all tackle
classification. To the best of our knowledge, there is no exist-
ing approach to the multi-task multi-view clustering problem.
In this paper, we aim to deal with the multi-task multi-view
clustering of non-negative data, which arises in many appli-
cations, such as various types of documents. Based on the ob-
servation that the related tasks have both common views and
task specific views, we propose a bipartite graph based multi-
task multi-view clustering (MTMVC) framework, which con-
sists of three parts. (1) Within-view-task clustering: this part
clusters the data of each view in each task. It is the basis
of the framework and is mutually reinforcing with the other two
parts. (2) Multi-view relationship learning: this part uses
the consistency among different views to improve the clus-
tering performance. (3) Multi-task relationship learning: this
part learns the relationship among related tasks to improve
the clustering performance. We integrate the three parts into
one objective function and optimize it with a gradient as-
cent method. Because of the unitary constraints, we further
solve the optimization problem by mapping the variables to
the Stiefel manifold [Manton, 2002]. Experimental results on
several real data sets show the superiority of the proposed al-
gorithm over either multi-task clustering algorithms or multi-
view clustering algorithms for multi-task clustering of multi-
view data.
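To make the idea of gradient ascent under orthonormality constraints concrete, the following is a minimal sketch and not the MTMVC algorithm itself. It assumes a toy objective tr(X^T M X) with a symmetric matrix M, and it keeps the variable on the Stiefel manifold (X^T X = I) by projecting each update onto the nearest matrix with orthonormal columns, computed from a thin SVD. The function names, the step size and the objective are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def project_to_stiefel(A):
    """Nearest matrix (in Frobenius norm) with orthonormal columns to A.

    Computed as the polar factor U @ Vt of the thin SVD A = U S Vt.
    """
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ Vt


def ascent_step(X, grad, step):
    """One gradient ascent step followed by projection back onto the manifold."""
    return project_to_stiefel(X + step * grad)


def maximize_trace(M, k, step=0.1, iters=200, seed=0):
    """Toy problem (an assumption for illustration):
    maximize tr(X^T M X) subject to X^T X = I, with M symmetric.
    """
    rng = np.random.default_rng(seed)
    # Random starting point on the Stiefel manifold.
    X = project_to_stiefel(rng.standard_normal((M.shape[0], k)))
    for _ in range(iters):
        grad = 2.0 * M @ X  # Euclidean gradient of tr(X^T M X) for symmetric M
        X = ascent_step(X, grad, step)
    return X


if __name__ == "__main__":
    M = np.diag([5.0, 4.0, 1.0, 0.5])
    X = maximize_trace(M, k=2)
    print("orthonormality error:", np.linalg.norm(X.T @ X - np.eye(2)))
    print("objective value:", np.trace(X.T @ M @ X))
```

For this toy objective the iterates approach the subspace spanned by the leading eigenvectors of M, so the objective approaches the sum of the k largest eigenvalues. The projection step is what keeps every iterate feasible; other ways of staying on the manifold exist, and this sketch does not claim to match the update used in the paper.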
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015)