2.4. Nuisance parameters and hierarchical priors
In most scientific estimation problems, certain parameters are of central interest while other parameters serve only as a means to an end. In statistics these "uninteresting" parameters are called nuisance parameters. Sigworth (1998) treated the EM transformation variables (e.g., the rotations and translations that align images) as nuisance parameters, since they are used only transiently to estimate the parameter of interest, the reference structure. When the SNR is low, the estimates of the nuisance parameters can be highly uncertain. Since we ultimately do not care about the particular values of the nuisance parameters, it would be useful to somehow account for, and perhaps mitigate, the uncertainty in their values.
A general statistical method for dealing with nuisance parameters is to treat them as random variables with their own PDF. For example, Sigworth recognized that the image transformation variables could themselves be considered random variables, and he proposed a bivariate Gaussian distribution for the x,y coordinate (translation) transformation variables:
p(\phi_i \mid \sigma_x, \sigma_y, \hat{x}, \hat{y}) = \frac{1}{2\pi\, \sigma_x \sigma_y} \exp\left[ -\frac{\| x_i - \hat{x} \|^2}{2\sigma_x^2} - \frac{\| y_i - \hat{y} \|^2}{2\sigma_y^2} \right]    (9)
where {x̂, ŷ, σ_x, σ_y} are the means and standard deviations of the x,y coordinates. Model parameters for the Euler angles can also be introduced if their distribution is non-uniform.
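As a concrete illustration, the prior in Eq. (9) is simply the product of two independent univariate Gaussians and is straightforward to evaluate. The following minimal sketch computes it for a single image's translation; the function name and argument conventions are ours, not taken from any published code.

import numpy as np

def translation_prior(x_i, y_i, x_hat, y_hat, sigma_x, sigma_y):
    # Bivariate Gaussian prior of Eq. (9) for one image's translation (x_i, y_i),
    # with independent x and y components parameterized by the means (x_hat, y_hat)
    # and standard deviations (sigma_x, sigma_y).
    norm = 1.0 / (2.0 * np.pi * sigma_x * sigma_y)
    return norm * np.exp(-(x_i - x_hat) ** 2 / (2.0 * sigma_x ** 2)
                         - (y_i - y_hat) ** 2 / (2.0 * sigma_y ** 2))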
A PDF for parameters is called a prior. In this case Eq. (9) is specifically referred to as a hierarchical prior, since we now have a statistical model with a hierarchy of distributions: a PDF for the data, given certain parameters, supplemented by a higher-level PDF for some of the parameters. The parameters of the hierarchical prior (e.g., σ_x and σ_y in Eq. (9)) may be called hierarchical parameters, or hyperparameters, to distinguish them from the parameters of the pure likelihood function.
Given a hierarchical prior for the φ_i parameters, the likelihoods in Eqs. (2) and (4) can then be augmented to construct an extended likelihood function p(X_i, φ_i | Θ) by multiplying the normal likelihood by the hierarchical prior:

p(X_i, \phi_i \mid \Theta) = p(X_i \mid \Theta, \phi_i)\, p(\phi_i \mid \sigma_x, \sigma_y, \hat{x}, \hat{y})    (10)
= \frac{1}{\left( \sqrt{2\pi}\, \sigma_i \right)^{M}} \exp\left[ -\frac{\| X_i - P(\phi_i; A) \|^2}{2\sigma_i^2} \right] p(\phi_i \mid \sigma_x, \sigma_y, \hat{x}, \hat{y})    (11)
where Θ = {A, σ, x̂, ŷ, σ_x, σ_y} is the augmented set of all model parameters associated with reference structure A. Note that Eqs. (10) and (11) correspond to 3D versions of Eqs. (11) and (12) of Sigworth, using our notation. The full hierarchical joint likelihood of a set of images is thus:
p(X, \phi \mid \Theta) = \frac{1}{\left( \sqrt{2\pi} \right)^{MN}} \prod_i^{N} \left( \sigma_i^{-M} \exp\left[ -\frac{\| X_i - P(\phi_i; A) \|^2}{2\sigma_i^2} \right] p(\phi_i \mid \sigma_x, \sigma_y, \hat{x}, \hat{y}) \right)    (12)
with corresponding log-likelihood:
\ln[p(X, \phi \mid \Theta)] = -\frac{MN}{2} \ln(2\pi) - \frac{1}{2} \sum_i^{N} \frac{\| X_i - P(\phi_i; A) \|^2}{\sigma_i^2} - M \sum_i^{N} \ln \sigma_i + \sum_i^{N} \ln\left[ p(\phi_i \mid \sigma_x, \sigma_y, \hat{x}, \hat{y}) \right]    (13)
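For readers who prefer code to notation, Eq. (13) can be written as a short routine. In the sketch below the projection operator P(φ_i; A) and the hierarchical prior are passed in as callables, since their details (interpolation scheme, CTF handling, the exact form of the prior) lie outside the scope of the equation; all names are illustrative rather than taken from any particular package.

import numpy as np

def log_extended_likelihood(X, phi, sigma, A, project, prior):
    # X: (N, M) array of images, each flattened to M pixels
    # phi: length-N sequence of per-image transformation parameters
    # sigma: (N,) array of per-image noise standard deviations
    # project(phi_i, A): returns the (M,) projection P(phi_i; A) of reference A
    # prior(phi_i): returns p(phi_i | sigma_x, sigma_y, x_hat, y_hat)
    N, M = X.shape
    loglik = -0.5 * M * N * np.log(2.0 * np.pi)
    for i in range(N):
        residual = X[i] - project(phi[i], A)
        loglik -= 0.5 * np.dot(residual, residual) / sigma[i] ** 2  # data misfit term
        loglik -= M * np.log(sigma[i])                              # noise normalization
        loglik += np.log(prior(phi[i]))                             # hierarchical prior term
    return loglik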
Other hierarchical priors can be added in a similar fashion to describe the distribution of other parameters, for example defocus
(Chen et al., 2009) and magnification. The form of the additional
distributions is often assumed to be Gaussian. Other authors may
refer to an extended likelihood as a regularized likelihood, penalized
likelihood, or a hierarchical likelihood. An extended likelihood as in
Eq. (11) is also a joint likelihood, as it is equivalent to the joint
PDF of the data and the parameters given the hyperparameters.
Given a hierarchical statistical model and a corresponding extended likelihood, there are several different ways to proceed with parameter estimation. When the hyperparameters of the hierarchical prior distributions are estimated from the data using variants of ML methodology, such techniques are referred to as extended likelihood or empirical Bayes. There are two main ML variants: (a) to
maximize the extended likelihood directly, and (b) to maximize
the marginal likelihood, in which the nuisance parameters have
been integrated out.
2.5. Maximization of the joint extended likelihood
The extended likelihood can be maximized over all unknown
parameters simultaneously, including both the parameters and
the hyperparameters in the optimization. In practice, this is usually
done using an iterative algorithm, in which each parameter is maximized in turn, conditional on the current optimal values of all
other parameters. However, any multi-parameter optimization
method may be used.
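A schematic of this coordinate-wise scheme is sketched below. The update rules for the reference, the transformations, and the noise parameters are left as caller-supplied functions, since they depend on the specific model; only the hyperparameter update, which for the Gaussian prior of Eq. (9) reduces to the sample mean and standard deviation of the current translation estimates, is written out. This is an illustration of the general structure, not a description of any particular program.

import numpy as np

def update_hyperparameters(phi):
    # ML estimates of the hyperparameters of the Gaussian prior in Eq. (9),
    # given the current (N, 2) array of translation estimates (x_i, y_i).
    x_hat, y_hat = phi.mean(axis=0)
    sigma_x, sigma_y = phi.std(axis=0)
    return x_hat, y_hat, sigma_x, sigma_y

def maximize_joint_extended_likelihood(X, params, update_steps, n_iter=20):
    # params: dict of current values, e.g. {'A': ..., 'phi': ..., 'sigma': ..., 'hyper': ...}
    # update_steps: list of (name, fn) pairs, where fn(X, params) returns the
    # conditional maximizer of that parameter block given the data and the
    # current values of all other blocks.
    for _ in range(n_iter):
        for name, fn in update_steps:
            params[name] = fn(X, params)
    return params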
Maximization of the joint extended likelihood aims to find the joint point estimates of the "best" values for all parameters simultaneously. However, this method may not work well when the hierarchical prior is diffuse or multimodal. Direct maximization of the joint likelihood works best when the prior PDF for the hyperparameters is smooth and highly peaked.
2.6. Maximization of the marginal likelihood
Alternatively, when the nuisance parameters are highly uncertain, it may be desirable to completely eliminate them from the analysis, while taking into account the uncertainty in their values. This is accomplished by integrating them out of the extended likelihood, resulting in a marginal likelihood function. For example, we can eliminate φ_i from the extended likelihood function in Eqs. (10) and (11) by integrating over its distribution:
p(X_i \mid \Theta) = \int_{\phi_i} p(X_i \mid \Theta, \phi_i)\, p(\phi_i \mid \sigma_x, \sigma_y, \hat{x}, \hat{y})\, d\phi_i    (14)
which results in a marginal PDF that is independent of φ_i. This is the approach taken by Sigworth (1998), who integrates out the transformation parameters and maximizes the marginal likelihood function over A and σ.
In practice there are several choices for accomplishing the marginalization. In the simplest cases an analytical solution can be obtained. Usually we are not so lucky and must resort to numerical methods such as brute-force integration, the Expectation–Maximization algorithm, or some combination of the two (Scheres, 2012a; Sigworth, 1998).
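As an illustration of the brute-force option, the integral in Eq. (14) can be approximated by a weighted sum over a discrete grid of candidate transformations. The sketch below assumes the projection operator and the prior are supplied by the caller, and the grid and its cell volume are arbitrary choices; in a real implementation the sum would be accumulated in the log domain (log-sum-exp) to avoid numerical underflow.

import numpy as np

def marginal_likelihood_i(X_i, A, sigma_i, project, prior, phi_grid, cell_volume):
    # Riemann-sum approximation of Eq. (14): p(X_i | Theta) is approximated by
    # summing p(X_i | Theta, phi) p(phi | ...) over a discrete grid of phi values,
    # each weighted by the volume of one grid cell.
    M = X_i.size
    norm = (np.sqrt(2.0 * np.pi) * sigma_i) ** (-M)
    total = 0.0
    for phi in phi_grid:
        residual = X_i - project(phi, A)
        data_term = np.exp(-0.5 * np.dot(residual, residual) / sigma_i ** 2)
        total += norm * data_term * prior(phi) * cell_volume
    return total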
2.7. Expectation–Maximization of the marginal likelihood
The Expectation–Maximization algorithm (normally abbreviated as EM, but we will avoid that here) finds the parameter values that maximize the marginal likelihood using a mathematical trick that only requires the (non-integrated) joint likelihood. In its most general form, the algorithm cycles between two steps: (a) the "expectation step", in which one finds the expected logarithm of the joint likelihood function, where the expectation is taken over the nuisance parameters (e.g., φ in Eq. (11)), conditional on the current values of the other parameters and the data, and