Learning with Feature Network and Label Network Simultaneously
Yingming Li,† Ming Yang,† Zenglin Xu,‡ Zhongfei (Mark) Zhang†
†College of Information Science & Electronic Engineering, Zhejiang University, China
‡School of Computer Science and Engineering, Big Data Research Center,
University of Electronic Science and Technology of China
yingming@zju.edu.cn, cauchym@zju.edu.cn, zenglin@gmail.com, zhongfei@zju.edu.cn
Abstract
For many supervised learning problems, limited training samples and incomplete labels are two difficult challenges, which usually lead to degraded performance on label prediction. To improve generalization performance, in this paper we propose Doubly Regularized Multi-Label learning (DRML), which exploits feature network and label network regularization simultaneously. More specifically, the proposed algorithm first constructs a feature network and a label network with marginalized linear denoising autoencoders on the data feature set and label set, respectively, and then learns a robust predictor regularized by both the feature network and the label network. While DRML is a general method for multi-label learning, our evaluations focus on the specific application of multi-label text tagging. Extensive evaluations on three benchmark data sets demonstrate that DRML achieves superior performance in comparison with existing multi-label learning methods.
Introduction
Building on decades of research on tag learning (Nigam et al. 1998; Elisseeff and Weston 2001; Yu, Yu, and Tresp 2005; Hsu et al. 2009; Liu and Tsang 2015), recent years have witnessed increasing applications of tag learning in many fields, ranging from social media search to the classification of medical reports, owing to its capability of improving data organization and management. Consequently, many tagging methods (Liu, Jin, and Yang 2006; Zhang and Zhou 2007; 2014; Li, Yang, and Zhang 2016) have been developed to meet the requirements of different areas. However, most existing tagging methods assume that the amount of given training data is sufficient and that the given training labels are complete. In practice, many supervised learning problems face two challenges: limited training samples and incomplete training labels, which usually lead to degraded performance on label prediction.
Given a limited amount of labeled training data and a
very high-dimensional feature space, a common solution is
to regularize a model by penalizing a specific norm of its
parameters. The most commonly used norms in supervised
Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
learning are L1 and L2, which assume that model parameters are independent. However, dependencies between parameters usually exist in real-world applications. For example, in the biomedical domain, gene features have structured
input since genes are organized as pathways; the learned
model parameters (feature weights for a linear classifier)
should be more effective by keeping the structural relation-
ship between features. Further, dependencies can also be in-
ferred from data, e.g., manifold-based feature graph can be
used to regularize the model parameters and show its effec-
tivity (Li and Li 2008). However, the feature network based
on feature manifold only considers the positive correlation
between features and ignores negative correlations between
features. It is inappropriate since negative correlations also
help to reduce the search space of the model parameters.
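To make the manifold-style feature-graph regularizer concrete, the following sketch (our own illustration, not code from the paper; `graph_penalty` and its arguments are hypothetical names) computes the classical penalty Σᵢⱼ Aᵢⱼ (wᵢ − wⱼ)² = 2 wᵀLw over a feature affinity matrix A. Note that the penalty is only meaningful for non-negative edge weights, which is exactly the limitation discussed above: negative feature correlations cannot be encoded.

```python
import numpy as np

def graph_penalty(w, A):
    """Manifold-based feature-graph penalty on model weights w:
        sum_ij A_ij * (w_i - w_j)^2  ==  2 * w^T L w,
    where L = D - A is the graph Laplacian of the (symmetric,
    non-negative) feature affinity matrix A.  Similar features
    are pushed toward similar weights; dissimilar-but-negatively-
    correlated features cannot be expressed in this scheme."""
    D = np.diag(A.sum(axis=1))  # degree matrix
    L = D - A                   # graph Laplacian
    return 2.0 * w @ L @ w
```

A penalty of this form would be added to the supervised loss in place of (or alongside) a plain L2 term, coupling the parameters along the graph edges instead of treating them as independent.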
On the other hand, recent work, e.g., (Chen, Zheng, and Weinberger 2013), considers regularized learning with a label network to mitigate the influence of an incomplete training label set. It assumes that the given label set is incomplete and builds a label network based on a marginalized linear denoising autoencoder to exploit the relationships among tags. A label-network-regularized learning method is then presented to cope with the incomplete
tagging problem. This method significantly improves over the prior state of the art; however, it still suffers from learning with limited training samples, which degrades its generalization performance.
To improve the generalization performance of tagging, it is necessary to consider both the feature network and the label network. To this end, we propose to train robust predictors with feature network and label network regularization simultaneously. In particular, we first learn a feature network and a label network with marginalized linear denoising autoencoders on the
feature set and label set, respectively. Taking the feature network as an example, we learn it with a marginalized linear denoising autoencoder, which is a
one-layer linear denoising neural network: we train a network weight matrix B_x such that B_x x̃ approximates x, where x̃ is a corrupted version of a sample x ∈ R^d obtained by random dropout corruption on each feature dimension. The learned network weight (B_x)_ij indicates the relationship between feature i and feature j.
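As an illustration of this construction, the marginalized linear denoising autoencoder (Chen et al.'s formulation) admits a closed-form solution that averages over all possible dropout corruptions instead of sampling them. The sketch below is our own minimal NumPy rendering under that assumption, not the authors' code; function and variable names are ours.

```python
import numpy as np

def marginalized_da(X, p=0.5, eps=1e-6):
    """Closed-form marginalized linear denoising autoencoder.
    X:   (d, n) data matrix (one sample per column).
    p:   dropout (corruption) probability per feature.
    Returns the (d, d) reconstruction weights B minimizing the
    expected loss E || X - B X_tilde ||^2 over random dropout,
    so that B @ x_tilde approximates x in expectation."""
    d = X.shape[0]
    q = np.full(d, 1.0 - p)          # per-feature survival probability
    S = X @ X.T                      # scatter matrix, (d, d)
    # Q = E[x_tilde x_tilde^T]: off-diagonals scale by q_i q_j,
    # the diagonal only by q_i (a feature co-occurs with itself).
    Q = S * np.outer(q, q)
    np.fill_diagonal(Q, q * np.diag(S))
    # P = E[x x_tilde^T]: columns scale by q_j.
    P = S * q[np.newaxis, :]
    # B solves B Q = P; eps * I is a small ridge for stability.
    return P @ np.linalg.inv(Q + eps * np.eye(d))
```

The learned B plays the role of B_x above; applying the same routine to the label matrix would yield the label-network weights in the same way. With p = 0 (no corruption) the solution degenerates to the identity map, which is a quick sanity check.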
Further, we present the Doubly Regularized Multi-Label learning (referred to as DRML) model, which learns a robust
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)