Prove that $H(\Theta \given \Theta^t)$ satisfies the following property: \begin{align*} \Theta^t = \operatorname{\arg \max}_{\Theta} H(\Theta \given \Theta^t). \end{align*} (Hint: use Jensen's inequality.) Here $H(\Theta \given \Theta^t) = \sum_{\bds{Z}} P(\bds{Z} \given \X, \Theta^t) \ln P(\bds{Z} \given \X, \Theta)$, and $Q(\Theta \given \Theta^t)$ is defined in equation (7.36) of the textbook.
Posted: 2023-07-05 07:30:35
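Jensen's inequality states that for a concave function $f$ and a random variable $Y$, $E[f(Y)] \le f(E[Y])$ (the inequality reverses when $f$ is convex). Since $\ln$ is concave, take $f = \ln$ and $Y = P(\bds{Z} \given \X, \Theta) / P(\bds{Z} \given \X, \Theta^t)$, with $\bds{Z}$ distributed according to $P(\bds{Z} \given \X, \Theta^t)$. Then, for any $\Theta$: \begin{align*} H(\Theta \given \Theta^t) - H(\Theta^t \given \Theta^t) &= \sum_{\bds{Z}} P(\bds{Z} \given \X, \Theta^t) \ln \frac{P(\bds{Z} \given \X, \Theta)}{P(\bds{Z} \given \X, \Theta^t)} \\ &\le \ln \left( \sum_{\bds{Z}} P(\bds{Z} \given \X, \Theta^t) \frac{P(\bds{Z} \given \X, \Theta)}{P(\bds{Z} \given \X, \Theta^t)} \right) \\ &= \ln \left( \sum_{\bds{Z}} P(\bds{Z} \given \X, \Theta) \right) \\ &= \ln 1 = 0. \end{align*} (If $P(\bds{Z} \given \X, \Theta^t) = 0$ for some $\bds{Z}$, those terms drop out of the sum, the remaining probabilities $P(\bds{Z} \given \X, \Theta)$ sum to at most $1$, and the last line becomes $\le \ln 1 = 0$, so the conclusion is unchanged.)

Therefore, for every $\Theta$, $$H(\Theta \given \Theta^t) \le H(\Theta^t \given \Theta^t),$$ and the bound is attained at $\Theta = \Theta^t$. In other words, $\Theta^t$ maximizes $H(\Theta \given \Theta^t)$: \begin{align*} \Theta^t = \operatorname{\arg \max}_{\Theta} H(\Theta \given \Theta^t). \end{align*} The gap $H(\Theta^t \given \Theta^t) - H(\Theta \given \Theta^t)$ is exactly the KL divergence from $P(\bds{Z} \given \X, \Theta)$ to $P(\bds{Z} \given \X, \Theta^t)$, which is nonnegative. The proof uses only the definition of $H$; $Q(\Theta \given \Theta^t)$ plays no role in it.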
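The property in the question says that $H(\Theta \given \Theta^t) \le H(\Theta^t \given \Theta^t)$ for every $\Theta$. This can be sanity-checked numerically with a minimal sketch, assuming $\bds{Z}$ takes finitely many values so that each posterior $P(\bds{Z} \given \X, \cdot)$ is just a probability vector. The names `cross_term` and `random_distribution` are hypothetical helpers introduced for illustration, and random vectors stand in for the posteriors of an actual model.

```python
import math
import random


def cross_term(p_t, p):
    """Return sum_z p_t[z] * ln p[z], i.e. H(Theta | Theta^t) for two
    posterior vectors; terms with p_t[z] == 0 contribute nothing."""
    return sum(a * math.log(b) for a, b in zip(p_t, p) if a > 0)


def random_distribution(k, rng):
    """Draw a strictly positive probability vector of length k."""
    weights = [rng.random() + 1e-9 for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)

# Stand-in for P(Z | X, Theta^t).
p_t = random_distribution(5, rng)
h_at_theta_t = cross_term(p_t, p_t)

# Compare against many other candidate posteriors P(Z | X, Theta).
gaps = [
    h_at_theta_t - cross_term(p_t, random_distribution(5, rng))
    for _ in range(1000)
]
min_gap = min(gaps)

# Each gap is a KL divergence, so none of them should be negative.
print("smallest gap is nonnegative:", min_gap >= 0)
```

Each entry of `gaps` equals the KL divergence between the two vectors, so the check only illustrates the inequality on random samples; the Jensen argument is what establishes it in general.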