Hidden Markov Models with Discrete Infinite
Logistic Normal Distribution Priors
Hao Zhu, Jinsong Hu
Department of Automation
Chongqing University of Posts and Telecommunications
Chongqing, P. R. China 400065
Email: haozhu1982@gmail.com
Henry Leung
Department of Electrical and Computer Engineering
University of Calgary
Calgary, Alberta, Canada T2N 1N4
Abstract—In this article, we propose using the discrete infinite logistic normal distribution (DILN) as a prior to estimate the number of states in a hidden Markov model (HMM). The HMM with DILN priors (DILN-HMM) allows for infinite state support and models correlations between state transition probabilities. A variational Bayesian (VB) framework is proposed to infer the posterior distributions of the DILN-HMM parameters. Experiments on synthetic and real data show that the DILN-HMM is effective in handling situations where the state transition matrix is correlated.
Index Terms—Hidden Markov Model (HMM), hierarchical
Bayesian modeling, correlation structure, variational Bayesian
(VB).
I. INTRODUCTION
The hidden Markov model (HMM) is a popular model for sequential data and is widely used in many fields, including speech recognition, machine vision, bioinformatics, and finance [1]–[5]. An HMM is usually trained with the maximum-likelihood Baum-Welch algorithm [6], in which the number of states is preset. If the number of states is not selected properly, the parameters will be over- or underestimated, which degrades the generalization ability of the model.
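To make this model-selection issue concrete, the following minimal sketch fits discrete HMMs with several preset state counts and compares their held-out log-likelihoods. It is not part of the method in this paper: it relies on the third-party hmmlearn package, whose CategoricalHMM class and fit/score API are assumed from recent versions, and the ground-truth parameters are illustrative.

```python
# Sketch: effect of the preset state count on a Baum-Welch-trained HMM.
# Uses the third-party hmmlearn package (CategoricalHMM assumed available);
# the ground-truth parameters below are illustrative.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)

# Ground truth: 3 hidden states emitting 3 discrete symbols.
pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.7, 0.2],
              [0.0, 0.3, 0.7]])
B = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])

def sample(T):
    """Draw one length-T observation sequence from the true HMM."""
    s = rng.choice(3, p=pi)
    x = np.empty(T, dtype=int)
    for t in range(T):
        x[t] = rng.choice(3, p=B[s])   # emit a symbol from state s
        s = rng.choice(3, p=A[s])      # transition to the next state
    return x.reshape(-1, 1)            # hmmlearn expects a column of symbols

train, test = sample(500), sample(200)
for n_states in (2, 3, 6, 12):
    model = hmm.CategoricalHMM(n_components=n_states, n_iter=50,
                               random_state=0).fit(train)    # Baum-Welch (EM)
    print(n_states, model.score(test))                       # held-out log-lik
```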
Recently, the hierarchical Dirichlet process (HDP) has been applied to the HMM, yielding a nonparametric HMM that places a prior distribution on transition matrices over countably infinite state spaces [7]. The HDP-HMM leads to data-driven learning algorithms that infer posterior distributions over the number of states. However, the lack of conjugacy between the two levels of the Dirichlet process means there is no fast inference algorithm. To tackle this issue, a stick-breaking HMM was proposed that provides a fully conjugate prior for an infinite-state HMM and admits a variational solution [8]. The HDP-based approach has been used in various applications [9]–[12]. One drawback is that it assumes the state transitions are independent; hence, it cannot model a correlated state transition matrix. Consider, for example, applying an HMM to speech recognition, where each state corresponds to a typical sound. After the English sound t, only a few sounds typically follow, as in train, taste, and top, whereas the sound s almost never comes directly after t. In other words, the state corresponding to s should be negatively correlated with the state for t, so the probability of transitioning from t to s should be small, while the probability of transitioning from t to r should be relatively high because these sounds are positively correlated.
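To illustrate, such correlations can be mimicked with a logistic-normal construction: each row of the transition matrix is obtained by exponentiating and normalizing a correlated Gaussian vector, so that transition probabilities to related states co-vary. A minimal numpy sketch follows; the covariance values are illustrative assumptions, not parameters from this paper.

```python
# Sketch: correlated transition probabilities via a logistic-normal draw.
# The covariance below is an illustrative assumption: states 0 and 1 are
# positively correlated, and state 2 is negatively correlated with both.
import numpy as np

rng = np.random.default_rng(0)
cov = np.array([[ 1.0,  0.8, -0.6],
                [ 0.8,  1.0, -0.6],
                [-0.6, -0.6,  1.0]])
rows = []
for _ in range(3):  # one correlated Gaussian draw per transition-matrix row
    g = rng.multivariate_normal(np.zeros(3), cov)
    rows.append(np.exp(g) / np.exp(g).sum())  # softmax -> probability row
A = np.vstack(rows)
print(A)  # within each row, mass on states 0 and 1 tends to rise and fall together
```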
To address this correlation issue, we propose applying the discrete infinite logistic normal distribution (DILN) to the HMM, which leads to an infinite-state HMM that models the correlations between state transition probabilities. The DILN is a new Bayesian nonparametric prior for mixed-membership models. The main idea behind the DILN is that each component is located in a latent space, and the correlation structure between components is determined by the distances between their locations. The DILN can be defined as a scaled HDP, where the scaling is determined by an exponentiated Gaussian process (GP) whose kernel is a function of the latent distance matrix between component locations [13].
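As a rough illustration of this construction, the sketch below draws a truncated DILN-style probability vector: top-level stick-breaking weights are combined with gamma masses scaled by an exponentiated GP evaluated at latent component locations, then normalized. The truncation level, squared-exponential kernel, and hyperparameter values are assumptions made for illustration and follow the spirit rather than the exact parameterization of [13].

```python
# Sketch: truncated DILN-style draw (a scaled HDP with an exponentiated GP).
# Truncation level, kernel, and hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
K, alpha, dim = 20, 2.0, 2          # truncation, DP concentration, latent dim

# Top level: stick-breaking weights p of the base measure G0.
v = rng.beta(1.0, alpha, size=K)
p = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))

# Latent component locations; squared-exponential kernel on their distances.
loc = rng.normal(size=(K, dim))
d2 = ((loc[:, None, :] - loc[None, :, :]) ** 2).sum(-1)
Kmat = np.exp(-0.5 * d2) + 1e-6 * np.eye(K)   # jitter for numerical stability

# Second level: gamma masses with mean alpha * p_k * exp(w_k), then normalize.
w = rng.multivariate_normal(np.zeros(K), Kmat)  # one GP draw over locations
z = rng.gamma(shape=alpha * p + 1e-12) * np.exp(w)
pi = z / z.sum()                    # correlated probability vector over states
print(pi.round(3))
```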
The paper is organized as follows. Section II gives a brief review of the HMM. Section III formulates the proposed HMM with DILN priors (DILN-HMM) and presents the variational Bayesian (VB) updates of the posterior distributions of the DILN-HMM parameters. Experimental results on synthetic and real data are given in Section IV to validate the performance of the proposed DILN-HMM. Conclusions are given in Section V.
II. HIDDEN MARKOV MODEL
For a sequence of observations $\mathbf{x} = (x_1, x_2, \ldots, x_T)$, an HMM assumes that the observation $x_t$ at time $t$ is generated by an underlying, discrete state $s_t$, and that the state sequence $\mathbf{s} = (s_1, s_2, \ldots, s_T)$ follows a first-order Markov process, $p(s_t \mid s_{t-1}, \ldots, s_1) = p(s_t \mid s_{t-1})$. The discrete case is considered here, with $x_t \in \{1, 2, \ldots, M\}$ and $s_t \in \{1, 2, \ldots, I\}$, where $M$ is the alphabet size and $I$ is the number of states. Therefore, an HMM can be described as $\theta = \{A, B, \pi\}$, where $A$, $B$, and $\pi$ are defined as follows:

$A = \{a_{ij}\}$, $a_{ij} = p(s_{t+1} = j \mid s_t = i)$: state transition probabilities

$B = \{b_{im}\}$, $b_{im} = p(x_t = m \mid s_t = i)$: emission probabilities

$\pi = \{\pi_i\}$, $\pi_i = p(s_1 = i)$: initial state probabilities
For a given model parameter $\theta$, the joint probability of the observation sequence and the underlying state sequence is

$p(\mathbf{x}, \mathbf{s} \mid \theta) = \pi_{s_1} b_{s_1 x_1} \prod_{t=2}^{T} a_{s_{t-1} s_t} b_{s_t x_t}.$
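As a check on the notation, here is a minimal self-contained sketch evaluating this joint probability in log space; the function name and the toy parameter values are illustrative.

```python
# Sketch: log p(x, s | theta) for a discrete HMM; names are illustrative.
import numpy as np

def log_joint(x, s, pi, A, B):
    """x, s: length-T integer arrays; pi: (I,), A: (I, I), B: (I, M)."""
    lp = np.log(pi[s[0]]) + np.log(B[s[0], x[0]])
    for t in range(1, len(x)):
        lp += np.log(A[s[t - 1], s[t]]) + np.log(B[s[t], x[t]])
    return lp

# Tiny example with I = 2 states and M = 2 symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print(log_joint(np.array([0, 1, 1]), np.array([0, 0, 1]), pi, A, B))
```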