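A New MOEPGA-Based Method for Detecting Protein Complexes in Yeast PPI Networks: Integrating Multiobjective Optimization to Improve Accuracy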


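MOEPGA is a detection strategy designed to identify biologically meaningful protein complexes in yeast protein–protein interaction (PPI) networks. Traditional single-objective methods typically rely on one specific topological property of the PPI network, which may not be enough to handle the complexity of protein complex identification. To address this, the researchers propose a Multiobjective Evolutionary Programming Genetic Algorithm (MOEPGA) that combines several network topological features to improve detection.

MOEPGA first treats protein complex detection as a multiobjective optimization problem that takes several key network properties into account, such as degree, compactness, and clustering coefficient. The objective function is built from the typical topological properties of protein complexes in the benchmark dataset, so that it captures the features that distinguish these complexes in the network. The algorithm has three main steps:

1. **Population initialization**: an initial set of protein subgraphs is generated at random; each subgraph is a candidate protein complex.
2. **Subgraph mutation**: in each generation, the algorithm mutates subgraphs with random operations (such as swapping, adding, or deleting nodes) to explore different combinations.
3. **Subgraph selection**: based on fitness evaluation, the algorithm selects the subgraphs that balance the multiple objectives (such as clustering score and degree).

In comparative experiments on two yeast PPI datasets, MOEPGA outperformed other recent algorithms: it found more protein complexes and achieved a higher and more stable F-score. It also showed good coverage in terms of the normalized clustering score, meaning that it identifies more kinds of proteins in the network as participating in complexes.

MOEPGA is therefore a useful tool for understanding the functions and biological significance of protein complexes in yeast PPI networks. Its multiobjective optimization strategy and its integration of several topological features also suggest an approach for analyzing other biological networks and for uncovering the structural and functional relationships within complex biological systems.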
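The three steps listed above (population initialization, subgraph mutation, subgraph selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two objectives (`density` and an internal-edge ratio called `cohesion` here), the Pareto-dominance selection rule, the seeding strategy, and all function names and parameters are assumptions chosen for illustration, since this summary does not specify the paper's actual objective functions or operators.

```python
import random
from itertools import combinations

Graph = dict[str, set[str]]  # undirected PPI network as adjacency sets

def internal_edges(g: Graph, nodes: frozenset) -> int:
    """Count edges with both endpoints inside the subgraph."""
    return sum(1 for u, v in combinations(sorted(nodes), 2) if v in g[u])

def objectives(g: Graph, nodes: frozenset) -> tuple:
    """Two illustrative objectives (assumed, not taken from the paper)."""
    n = len(nodes)
    inside = internal_edges(g, nodes)
    density = 2 * inside / (n * (n - 1)) if n > 1 else 0.0
    boundary = sum(1 for u in nodes for v in g[u] if v not in nodes)
    cohesion = inside / (inside + boundary) if inside + boundary else 0.0
    return (density, cohesion)

def dominates(a: tuple, b: tuple) -> bool:
    """Pareto dominance: a is no worse everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def init_population(g: Graph, size: int, rng: random.Random) -> list:
    """Step 1: each candidate is a random seed node plus its neighbors."""
    population = []
    for _ in range(size):
        seed = rng.choice(sorted(g))
        population.append(frozenset({seed} | g[seed]))
    return population

def mutate(g: Graph, nodes: frozenset, rng: random.Random) -> frozenset:
    """Step 2: randomly add a neighboring node or remove a member."""
    members = set(nodes)
    frontier = sorted({v for u in members for v in g[u]} - members)
    if frontier and (len(members) <= 2 or rng.random() < 0.5):
        members.add(rng.choice(frontier))
    elif len(members) > 2:
        members.remove(rng.choice(sorted(members)))
    return frozenset(members)

def select(g: Graph, candidates: list) -> list:
    """Step 3: keep only the non-dominated (Pareto-optimal) candidates."""
    scored = {c: objectives(g, c) for c in set(candidates)}
    return [c for c, s in scored.items()
            if not any(dominates(t, s) for t in scored.values())]

def evolve(g: Graph, size: int = 10, generations: int = 20, seed: int = 0) -> list:
    """Run the initialize / mutate / select loop and return the final front."""
    rng = random.Random(seed)
    population = init_population(g, size, rng)
    for _ in range(generations):
        offspring = [mutate(g, c, rng) for c in population]
        population = select(g, population + offspring)
    return population

# Toy network: two triangles (A-B-C and D-E-F) joined by the edge C-D.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
       "D": {"C", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"}}
front = evolve(toy)
```

Selection by Pareto dominance is one common way to balance several objectives without collapsing them into a single weighted score. Note that the removal mutation in this sketch can disconnect a candidate subgraph, which a practical implementation would probably want to prevent.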

Computational Biology and Chemistry 58 (2015) 173–181


journal homepage: www.elsevier.com/locate/compbiolchem

Research Article

MOEPGA: A novel method to detect protein complexes in yeast protein–protein interaction networks based on MultiObjective Evolutionary Programming Genetic Algorithm

Buwen Cao a,b, Jiawei Luo a,b,∗, Cheng Liang a,b, Shulin Wang a,b, Dan Song a,b

a College of Computer Science and Electronic Engineering, Hunan University, Changsha, China

b Collaboration and Innovation Center for Digital Chinese Medicine of 2011 Project of Colleges and Universities in Hunan Province, China

article info

Article history: Received 2 February 2015; received in revised form 2 June 2015; accepted 22 June 2015; available online 7 July 2015

Keywords: Protein–protein interaction (PPI) network; Protein complex; Multiobjective evolutionary; Normalized clustering score

abstract

The identification of protein complexes in protein–protein interaction (PPI) networks has greatly advanced our understanding of biological organisms. Existing computational methods to detect protein complexes are usually based on specific network topological properties of PPI networks. However, due to the inherent complexity of the network structures, the identification of protein complexes may not be fully addressed by using a single network topological property. In this study, we propose a novel MultiObjective Evolutionary Programming Genetic Algorithm (MOEPGA) which integrates multiple network topological features to detect biologically meaningful protein complexes. Our approach first systematically analyzes the multiobjective problem in terms of identifying protein complexes from PPI networks, and then constructs the objective function of the iterative algorithm based on three common topological properties of protein complexes from the benchmark dataset. Finally, we describe our algorithm, which mainly consists of three steps: population initialization, subgraph mutation, and subgraph selection. To show the utility of our method, we compared MOEPGA with several state-of-the-art algorithms on two yeast PPI datasets. The experimental results demonstrate that the proposed method can not only find more protein complexes but also achieve higher accuracy in terms of F-score. Moreover, our approach can cover a certain number of proteins in the input PPI network in terms of the normalized clustering score. Taken together, our method can serve as a powerful framework to detect protein complexes in yeast PPI networks, thereby facilitating the identification of the underlying biological functions.

∗ Corresponding author. Permanent address: College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China. E-mail address: luojiawei@hnu.edu.cn (J. Luo).

1. Introduction

Proteins are critical components of cell activities and are essential to our understanding of molecular functions and biological processes. In biological organisms, proteins are usually organized into protein complexes, in which they carry out specific biological functions cooperatively (Zhang et al., 2013). For example, proteins YDL105W, YDR288W, YEL019C, YER038C, YLR007W, YLR383W, YML023C, and YOL034W form the complex “Smc5-Smc6”, which regulates both recombination and kinetochore sumoylation to promote chromosomal maintenance during growth (Yong-Gonzales et al., 2012). Therefore, studies have focused on elucidating protein functions in PPI networks based on their interacting partners (Girvan and Newman, 2002; Jin et al., 2015; Mering et al., 2002), which further strengthens our biological knowledge of protein assembly processes for cellular organization.

Although there are a number of ways to detect protein complexes experimentally, such as yeast two-hybrid (Zhang et al., 2013) and tandem affinity purification with mass spectrometry (Chen and Wu, 2013; Zhang et al., 2013), they have been shown to have certain limitations (Li et al., 2005). Specifically, PPI data derived from high-throughput experiments usually have high false-positive and false-negative rates, which substantially affect the accuracy of the experimental results (Chen and Wu, 2013). Therefore, computational approaches to detect protein complexes have been developed as useful complements to the experimental methods.

Several methods based on density have been proposed to identify densely connected subgraphs in PPI networks, where subgraphs with density above a pre-defined threshold were considered
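The excerpt does not give the density formula these methods use. A common graph-theoretic definition, stated here as an assumption, is the fraction of possible edges that are present in a subgraph $S$ with edge set $E(S)$:

```latex
d(S) = \frac{2\,|E(S)|}{|S|\,\bigl(|S| - 1\bigr)}, \qquad 0 \le d(S) \le 1
```

For example, a subgraph of 4 proteins with 5 of the 6 possible interactions has $d(S) = 2 \cdot 5 / (4 \cdot 3) \approx 0.83$, so it would pass a hypothetical threshold of 0.7.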

http://dx.doi.org/10.1016/j.compbiolchem.2015.06.006
