We
shall see
that
the
three
problems
are
linked
together
tightly
under
our
probabilistic
framework.
11
1.
50LuTloNs
TO
THE
THREE
BASIC
PROBLEMS
OF
H
比
4
儿
h
A.
Solution
to
Prob/em
1
We
wish
to
calculate
the
probability
of
the
observation
sequence
, 0 = 0 , O
2
• • •
OT,
given
the
model
λ
,
i.
e.
,
P(OIλ).
The
most
straightforward
way
of
doing
this
is
through
enumerating
every
possible
state
sequence
of
length
T
(the
number
of
observations).
Consider
one
such
fixed
state
sequence
0=
q1
q2
… qT
where
q,
is
the
initial
state.
The
probability
of
the
obser-
vation
sequence
0
for
the
state
sequence
of
(12)
is
T
P(OIO
,
À)
=旦
P(O
肌
λ
山)
where
we
have
assumed
statistical
independence
of
obser-
vations.
Thus
we
get
P(OIO
,
λ)
= b
q
,(
01)
. b
q
,(
02)
…
bq
,(
OT)' (13b)
The
probability
of
such a state
sequence
0 can
be
written
as
P(OIλ
7r
q,a
q
,q,a
q
,
q3
… aqr_ ,qr' (14)
The
joint
probability
of
0
and
0 , i.e.,
the
probability
that
o
and
0
occur
simultaneously
,
is
simplythe
product
ofthe
above
two
terms
,
i.
e.
,
P(O
,
OIλ)
=
P(OIO
,
λ
)P(O
,
λ).
The
probability
of
0
(given
the
model)is
obtained
by
sum-
ming
this
joint
probabilityover
all
possible
state
sequences
q
glvmg
P(OIÀ)
=已
Pω10
,
λ)P(OIλ)
L:
7r
q
,
鸟
,
(0
,)
a
q
化
b
q
,
(02)
q"q2"" ,
qr
.
aqr_
,qrbq
,(
OT)'
The
interpretation
of
the
computation
in
the
above
equa-
tion
is
the
following.
Initially
(at
time
t = 1)
we
are
in
state
q,
with
probability
7r
q
"
and
generate
the
symbol
0 ,
(i
n
this
state)
with
probability
bq,
(O
,).
The
clock
changes
from
time
t
to
t + 1
(t
=
2)
and
we
make
a
transition
to
state
q2
from
state q,
with
probability
a
q
,
侣,
and
generate
symbol
O
2
with
probability
b
q
,(
02)'
This
process
continues
in
this
manner
until
we
make
the
list
transition
(at
time
T)
from
state
qT-1
to
state qT
with
probability
aqr_ ,
qr
and
generate
symbol
OT
with
probability
bq
,(
OT)'
A
little
thought
should
convince
the
reader
that
the
cal-
culation
of
P(OIλ)
,
according
to
its
direct
definition
(17)
involves
on
the
order
of
2 T . N
T
calculations
,
since
at
every
t = 1, 2, , . . , T,
there
are N
possible
states
which
can be
reached
(i.
e.
,
there
are N
T
possible
state sequences),
and
for
each such state seq
uence
about
2 T calcu
lations
are
required
for
each
term
in
the
sum
of
(17). (To
be
precise
,
we
need
(2T -
1)N
T
multiplications
,
and
N
T
- 1
additions.)
This
calculation
is
computationally
unfeasible
, even
for
small values
of
N
and
T;
e.g.,
for
N = 5 (states), T = 100
(observations),
there
are
on
the
order
of
2 . 100 .
5'00
"" 10
72
262
computations!
Clearly
a
more
efficient
procedure
is
required
to
solve
Problem
1.
Fortunately
such
a
procedure
exists
and
is
called
the
forward-backward
procedure.
The
Forward-8ackward
Procedure
[2],
[3
户"
Consider
the
forward
variable
α
t
(i)
defined
as
α
t
(i)
=
P(01
O
2
…
O
t/
qt
=
玩
|λ(18)
i.e.,
the
probability
of
the
partial
observation
sequence
, 0
1
O
2
' • •
Ot
,
(until
time
t)
andstateS;attime
t,
given
the
model
λ.
We
can solve
for
α
t
(i)
inductively
,
as
follows:
1)
Initialization:
(12)
α
,(i)
=霄
;b;(01)
,
1
s;
i
s;
N.
2)
Induction:
川叫主
αt
叫帆
+1)
,
s;三
T
- 1
1
s;
j
三
N.
(20)
(19)
3)
Termination:
N
P(OIλ)=
主
α
T
(i)
(21)
5tep
1)
initializes
the
forward
probabilities
as
the
joint
prob-
ability
of
state
5;
and
initial
observation
0
1
,
The
induction
step,
which
is
the
heart
of
the
forward
calculation
,
is
illus-
trated
in
Fig. 4(a).
This
figure
shows
how
state
Sj
can
be
51
5i
(15)
52
(a)
(16)
5N
Q,
(I)
1+1
Q'+I
(j)
(17)
T
N
歹
M
•
4
(b)
也
主
/\73
2
OB5ERVA
Tl
ON
, ,
Fig.4.
(a)
IlI ustration
of
the sequence of operations
required for the computation of the forward variable
a
,+
,
(j).
(b)
Implementation of the computation
of
叫(i)
in terms
of
a lattice
of
observations t, and
states
i.
6S
tr
ictly speaking,
we
only need the forward part
01
the forward-
backward procedure to solve Problem
1,
We
will introduce the
backward part
of
the procedure in this section since it will
be
used
to help solve Problem 3,
PROCEEDING5
OF
THE
IEEE
, vO
L.
77,
NO.
2,
FEBRUARY
1989