Intelligent Systems Conference 2017
7-8 September 2017 | London, UK

Generalized Relevance Vector Machine

Yuheng Jia∗, Sam Kwong†, Wenhui Wu∗, Wei Gao∗ and Ran Wang‡
∗Department of Computer Science, City University of Hong Kong, Hong Kong, 3442–9704
Email: yhjia3-c@my.cityu.edu.hk; wenhuiwu3-c@my.cityu.edu.hk; weigao5-c@my.cityu.edu.hk
†Department of Computer Science, City University of Hong Kong, Hong Kong, 3442–2907
Email: cssamk@cityu.edu.hk
‡College of Mathematics and Statistics, Shenzhen University, Shenzhen 518060, China
Email: wangran@szu.edu.cn
Abstract—This paper considers a generalized version of the relevance vector machine (RVM), a sparse Bayesian kernel machine for classification and ordinary regression. The generalized RVM (GRVM) follows the work on generalized linear models (GLM), a natural generalization of the ordinary linear regression model in which all models share a common approach to parameter estimation. GRVM inherits the advantages of GLM, i.e., a unified model structure, a single training algorithm, and convenient task-specific model design. It also inherits the advantages of RVM, i.e., probabilistic output, an extremely sparse solution, and automatic hyperparameter estimation. Moreover, GRVM extends RVM to a wider range of learning tasks beyond classification and ordinary regression by assuming that the conditional output follows an exponential family distribution (EFD). Since the EFD makes Bayesian inference intractable, this paper adopts the Laplace approximation, a common approach in Bayesian inference, to solve this problem. Further, several task-specific models are designed based on GRVM, including models for ordinary regression, count data regression, classification, and ordinal regression. In addition, the relationship between GRVM and traditional RVM models is discussed. Finally, experimental results show the efficiency of the proposed GRVM model.
Keywords—Relevance vector machine; Generalized linear models; Laplace approximation; Bayesian analysis; Exponential family distribution.
I. INTRODUCTION
Generalized linear models (GLM) [1] are a class of models that naturally generalize the ordinary linear regression (OLR) model. GLMs include OLR, logistic regression, linear count data regression, linear ordinal regression, etc. The term "generalized" in the title of this paper has the same meaning as in GLM.
From a statistical perspective, the conditional output distribution of OLR is Gaussian, i.e.,
\[
p(y_* \mid x_*) = \mathcal{N}(\beta^{T}x_*,\, \sigma^{2})
= \frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\!\left(-\frac{1}{2\sigma^{2}}\left(y_* - \beta^{T}x_*\right)^{2}\right) \tag{1}
\]
where $x \in \mathbb{R}^{M}$ is the input vector, $y_*$ is the predictive output for the input vector $x_*$, $\beta \in \mathbb{R}^{M}$ is the OLR model parameter, $\sigma^{2}$ is the noise variance, and $\mathcal{N}$ denotes the Gaussian distribution.
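As a quick numerical illustration of Eq. (1) (not from the paper; the values of $\beta$, $x_*$, and $\sigma^{2}$ below are made up), the predictive density can be evaluated directly:

```python
import numpy as np

def olr_predictive_density(y_star, x_star, beta, sigma2):
    """Gaussian predictive density of Eq. (1) for a single test point."""
    mean = beta @ x_star  # linear predictor beta^T x_*
    return np.exp(-(y_star - mean) ** 2 / (2.0 * sigma2)) \
        / np.sqrt(2.0 * np.pi * sigma2)

# Made-up example values, purely for illustration.
beta = np.array([0.5, -1.2])
x_star = np.array([1.0, 2.0])
print(olr_predictive_density(y_star=-1.5, x_star=x_star, beta=beta, sigma2=0.25))
```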
GLM generalizes OLR by replacing the Gaussian distribution with an exponential family distribution (EFD) [2], [3], [4], [5]. The EFD is a class of distributions in exponential form that includes several common distributions, e.g., the Gaussian, Poisson, Bernoulli, Binomial, and Gamma distributions. By specializing the EFD to a particular distribution, a number of different linear models can be obtained under GLM. For example, if the conditional output distribution is Bernoulli, GLM becomes the logistic regression model; if the conditional output is Poisson, GLM becomes the linear count data regression model.
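For concreteness, the standard single-parameter canonical form of the EFD (a textbook parameterization; the paper's exact notation may differ) and the specializations just mentioned are
\[
p(y \mid \theta) = h(y)\exp\bigl(\theta y - A(\theta)\bigr), \qquad \theta = \beta^{T}x,
\]
\[
\text{Gaussian }(\sigma^{2}=1):\ \theta = \mu,\ A(\theta) = \theta^{2}/2; \qquad
\text{Poisson}:\ \theta = \log\lambda,\ A(\theta) = e^{\theta}; \qquad
\text{Bernoulli}:\ \theta = \log\tfrac{p}{1-p},\ A(\theta) = \log(1 + e^{\theta}).
\]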
Considering these linear models from a unified perspective, GLM has been found useful in statistical analysis, offering several advantages:
• Unified model structure: the output of each model in GLM is based on a linear combination of the input vector and the model parameter, such as the $\beta^{T}x$ term in Eq. (1).
• Same learning algorithm: the parameters of all models under GLM can be estimated by the same learning algorithm, which shows the elegance of mathematics in machine learning.
• Efficient task-specific model design: thanks to the unified model structure and identical learning algorithm, designing a model for a specific task is very efficient. For example, if the output is count data, a linear Poisson regression model can be designed to model it (see the sketch after this list).
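To make the "same learning algorithm" point concrete, the following is a minimal, hypothetical sketch (not the authors' code): a single gradient-ascent trainer fits any canonical-link GLM, and only the mean function $A'(\theta)$ changes with the assumed output distribution.

```python
import numpy as np

# Mean functions A'(theta) for a few exponential family members.
MEAN_FUNCTIONS = {
    "gaussian":  lambda t: t,                         # OLR (identity link)
    "poisson":   lambda t: np.exp(t),                 # count data regression
    "bernoulli": lambda t: 1.0 / (1.0 + np.exp(-t)),  # logistic regression
}

def fit_glm(X, y, family, lr=0.1, n_iter=5000):
    """Maximize the canonical GLM log-likelihood
    sum_i [theta_i * y_i - A(theta_i)], theta_i = beta^T x_i,
    by plain gradient ascent; swapping `family` swaps the model."""
    mean_fn = MEAN_FUNCTIONS[family]
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        theta = X @ beta
        # Averaged gradient of the log-likelihood: X^T (y - A'(theta)) / n.
        beta += lr * X.T @ (y - mean_fn(theta)) / len(y)
    return beta
```

For count data, fit_glm(X, y, "poisson") yields a linear Poisson regression, while the very same routine with "bernoulli" yields logistic regression.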
Because of these advantages, GLM has attracted increasing attention. Generalized kernel machines (GKM) [2], the kernel version of GLM, were proposed to enhance the nonlinear modeling power of GLM. Bayesian generalized kernel models (BGKM) [5] are the fully Bayesian extension of GLM in the feature space induced by a reproducing kernel. Generalized Gaussian process models (GGPM) [3] generalize the Gaussian process (GP) [6] and encompass many existing GP models. Since inference in GGPM is intractable, Taylor approximation was used for inference in [3]. Besides, variational inference was also adopted to solve the intractable inference problem in GGPM, which leads to a sparse solution [7].
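The Laplace approximation adopted in this paper follows the same generic recipe for intractable Bayesian inference: replace the posterior with a Gaussian centered at its mode, with covariance given by the inverse Hessian there. A minimal sketch follows, assuming hypothetical model-specific callables neg_log_posterior and hessian:

```python
import numpy as np
from scipy.optimize import minimize

def laplace_approximation(neg_log_posterior, hessian, w0):
    """Approximate p(w | D) ∝ exp(-E(w)) by N(w_map, H^{-1}),
    where w_map minimizes E(w) and H is the Hessian of E at the mode."""
    res = minimize(neg_log_posterior, w0, method="BFGS")  # find the posterior mode
    w_map = res.x
    cov = np.linalg.inv(hessian(w_map))  # local curvature -> Gaussian covariance
    return w_map, cov
```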
In this paper, a generalized relevance vector machine (GRVM) is proposed. The relevance vector machine (RVM) [8], [9] is a sparse Bayesian kernel machine, which can