Applying Deep Learning to Nested Biomedical Entity Relation Extraction


"这篇研究论文‘Extracting Nested Biomedical Entity Relations by Tagging Dependency Chains’探讨了在生物医学领域中提取嵌套实体关系的新方法。通过依赖解析技术，该方法能够提取包含生物医学实体（触发器/参数）链的目标序列。然后，利用条件随机场（CRFs）模型对这些实体链进行标记，以表示嵌套的参数-触发器关系。最后，进行后处理步骤以确定事件的关系。" 在这篇论文中，作者们聚焦于一个关键的生物医学文本挖掘任务——生物医学事件提取。这是一个复杂的过程，涉及到识别文本中的关键实体（如疾病、基因、蛋白质等）及其相互作用，这对于理解生物学过程和疾病机制至关重要。然而，现有的事件提取系统在处理嵌套事件（即一个事件内部包含另一个事件的情况）时仍存在挑战。为了解决这个问题，论文提出了一个新颖且高效的方法。首先，他们运用依赖解析技术来分析句子结构，找出包含生物医学实体链的序列。依赖解析是一种句法分析技术，它揭示了词汇项之间的依赖关系，帮助确定哪些词在语义上相互关联。接着，研究人员应用条件随机场（CRFs）模型对这些实体链进行标注。CRFs是一种统计建模方法，常用于序列标注任务，如命名实体识别和词性标注。在这里，CRFs被用来识别出那些表示事件触发和参数之间关系的标记，从而捕捉到嵌套事件的精细结构。在应用CRFs之后，论文中提到进行了后处理步骤，这可能包括冲突解决和关系推理，以确保识别出的事件关系准确无误。这一阶段是必要的，因为单一的模型可能无法完全捕捉到复杂文本中的所有关系。这篇论文提出的依赖链标注方法为解决生物医学文本中的嵌套实体关系提供了一个创新途径，有助于提高事件提取的准确性和完整性，对于生物医学信息学领域的研究具有重要意义。此方法可以潜在地改进现有的信息抽取系统，并促进生物医学研究和临床决策支持系统的开发。

Journal of Engineering Science and Technology Review 8 (4) (2015) 51-55

Research Article

Extracting Nested Biomedical Entity Relations by Tagging Dependency Chains

Xiaomei Wei 1,2, Yu Huang, Chen Lyu 1,3, and Donghong Ji 1,*

Computer School, Wuhan University, Wuhan 430072, China

College of informatics, Huazhong Agriculture University, Wuhan 430070, China

Singapore University of Technology and Design, 138682, Singapore

Received 24 June 2015; Accepted 14 October 2015

___________________________________________________________________________________________

Abstract

Biomedical event extraction is an important research topic in the field of biomedical text mining. However, much research work is required before event extraction systems become applicable. Thus, we proposed a novel and efficient approach for extracting nested biomedical events. First, using dependency parsing, we extracted the target sequences that contained biomedical entity (trigger/argument) chains. Second, the Conditional Random Fields (CRFs) model was used to tag the entity chains, which represented the nested argument-trigger edges. Third, a post-processing step was used to output the events. This method is a new attempt to treat biomedical event extraction as a sequence tagging problem. The experimental results showed an F-score of 47.3, which is promising when compared with the joint ML-based system in BioNLP-ST2013. Furthermore, we evaluated the trigger detection results, which outperformed the state-of-the-art systems on the same corpus. Therefore, our work is a positive contribution to the biomedical text mining community.

Keywords: Joint; Event extraction; Entity chain; Dependency; Tag
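The first step summarized in the abstract, extracting an entity chain from a dependency parse, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the head indices are a hand-written parse of the example sentence "BMP-6 did not induce significant changes in the protein expression of Id2 and Id3", the choice of trigger words is an assumption, and the names `path_to_root` and `entity_chain` are hypothetical.

```python
# Tokens of the example sentence (final period omitted).
TOKENS = ["BMP-6", "did", "not", "induce", "significant", "changes", "in",
          "the", "protein", "expression", "of", "Id2", "and", "Id3"]

# HEADS[i] is the index of token i's syntactic head; -1 marks the root.
# This parse is written by hand for illustration (an assumption).
HEADS = [3, 3, 3, -1, 5, 3, 5, 9, 9, 6, 9, 10, 11, 11]

# Indices of tokens treated as entities: assumed triggers
# (induce, changes, expression) and proteins (BMP-6, Id2, Id3).
ENTITY_INDICES = {0, 3, 5, 9, 11, 13}


def path_to_root(heads, index):
    """Return the token indices on the path from `index` up to the root."""
    path = [index]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def entity_chain(tokens, heads, entity_indices, start):
    """Keep only the entity tokens on the dependency path from `start` to the root."""
    return [tokens[i] for i in path_to_root(heads, start) if i in entity_indices]


if __name__ == "__main__":
    print(entity_chain(TOKENS, HEADS, ENTITY_INDICES, 11))
    print(entity_chain(TOKENS, HEADS, ENTITY_INDICES, 0))
```

Under these assumptions, the chain that starts at "Id2" passes through "expression", "changes", and "induce", which is the kind of argument-trigger sequence a sequence tagger could then label.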

__________________________________________________________________________________________

1. Introduction

Biomedical event extraction has become an important research topic in the field of biomedical natural language processing in recent years [1]. Biomedical events describe the fine-grained relations among biomedical entities. The biomedical literature contains substantial information regarding relations among biomedical entities, and these relations must be extracted to construct a knowledge database for researchers. This effort led to the BioNLP GE shared task (BioNLP-ST, hereafter) series [2-4], which aims to extract nested bio-molecular events from biomedical text. BioNLP-ST addressed nine types of biomedical molecular events related to protein biology. These events can be grouped into three categories: Simple, Binding, and Regulation. Simple events (Gene_expression, Transcription, Protein_catabolism, Phosphorylation, Localization) take one protein argument. Binding events (Binding) have one or more protein arguments. Regulation events (Positive_regulation, Negative_regulation and Regulation) have one obligatory Theme and one optional Cause argument. Each argument of Regulation events could be either a protein or another event. A Regulation event is considered nested if it has another event as its argument. A sample of an event annotation of a sentence (Sen.1) from the training corpus is illustrated in Fig. 1.
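The argument constraints for the three event categories can be written down as a small validity check. The following is a sketch under stated assumptions: IDs that start with `E` are taken to denote events and all other IDs proteins, and the `Event` class and the helpers `is_valid` and `is_nested` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

SIMPLE = {"Gene_expression", "Transcription", "Protein_catabolism",
          "Phosphorylation", "Localization"}
BINDING = {"Binding"}
REGULATION = {"Positive_regulation", "Negative_regulation", "Regulation"}


@dataclass
class Event:
    event_id: str
    event_type: str
    themes: List[str]            # IDs of Theme arguments
    cause: Optional[str] = None  # ID of the optional Cause argument


def is_event_id(arg_id):
    """Assumption: event IDs start with 'E'; anything else is a protein."""
    return arg_id.startswith("E")


def is_valid(event):
    """Check the argument constraints of the three event categories."""
    if event.event_type in SIMPLE:
        # Exactly one protein argument, no Cause.
        return (len(event.themes) == 1
                and not is_event_id(event.themes[0])
                and event.cause is None)
    if event.event_type in BINDING:
        # One or more protein arguments, no Cause.
        return (len(event.themes) >= 1
                and not any(is_event_id(t) for t in event.themes)
                and event.cause is None)
    if event.event_type in REGULATION:
        # One obligatory Theme; Cause is optional; both may be proteins or events.
        return len(event.themes) == 1
    return False


def is_nested(event):
    """A Regulation event is nested if any of its arguments is another event."""
    args = list(event.themes) + ([event.cause] if event.cause else [])
    return event.event_type in REGULATION and any(is_event_id(a) for a in args)
```

For example, `Event("E2", "Regulation", ["E1"], cause="T73")` passes `is_valid` and is reported as nested, whereas `Event("E3", "Phosphorylation", ["E1"])` is rejected because a Simple event cannot take an event as its argument.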

Sen.1: BMP-6 did not induce significant changes in the protein expression of Id2 and Id3.

In this sentence, the trigger words are presented in bold font, whereas the protein arguments are underlined. In the definition of BioNLP09-ST [2], both triggers and arguments are called entities. In the upper textbox of the figure, proteins “BMP-6”, “Id2”, and “Id3” are labeled as T73, T74, and T75, respectively. In the lower textbox, T50 and T51 are two labels of triggers, and E27 and E28 are two events.
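Annotations of this kind are distributed in the BioNLP-ST standoff format, in which text-bound annotations (`T` lines) carry a type, character offsets, and the covered text, and event annotations (`E` lines) carry `Type:TriggerID` followed by `Role:ArgumentID` pairs. The sketch below parses a few such lines. The protein IDs follow the text above; the trigger ID `T1`, the event IDs `E1` and `E2`, the event type, and the sentence-relative offsets are placeholders chosen for illustration and are not the contents of Fig. 1. The function name `parse_standoff` is hypothetical.

```python
SENTENCE = ("BMP-6 did not induce significant changes in the "
            "protein expression of Id2 and Id3.")

# Illustrative standoff lines (tab-separated fields); see the assumptions above.
STANDOFF_LINES = [
    "T73\tProtein 0 5\tBMP-6",
    "T74\tProtein 70 73\tId2",
    "T75\tProtein 78 81\tId3",
    "T1\tGene_expression 56 66\texpression",
    "E1\tGene_expression:T1 Theme:T74",
    "E2\tGene_expression:T1 Theme:T75",
]


def parse_standoff(lines):
    """Split standoff lines into text-bound entities and events."""
    entities, events = {}, {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T"):
            # Text-bound annotation: "<type> <start> <end>" then the covered text.
            ann_type, start, end = fields[1].split()
            entities[ann_id] = {"type": ann_type, "start": int(start),
                                "end": int(end), "text": fields[2]}
        elif ann_id.startswith("E"):
            # Event annotation: "<type>:<trigger>" then "<role>:<argument>" pairs.
            pairs = [tuple(item.split(":")) for item in fields[1].split()]
            event_type, trigger = pairs[0]
            events[ann_id] = {"type": event_type, "trigger": trigger,
                              "args": pairs[1:]}
    return entities, events


if __name__ == "__main__":
    entities, events = parse_standoff(STANDOFF_LINES)
    for event_id, event in events.items():
        trigger_text = entities[event["trigger"]]["text"]
        arg_texts = [(role, entities[ref]["text"]) for role, ref in event["args"]]
        print(event_id, event["type"], trigger_text, arg_texts)
```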

Biomedical event extraction is a complex task that requires further study before it can be applied. The complexity of event extraction rests on two aspects. First, the sentences in the biomedical literature are typically very complex. Second, many biomedical events are nested and thus differ from the event definition used in the general domain, such as the ACE2005 [5] event task. As shown in Fig. 1, event E79 contains the trigger word T169 and the protein argument T74. Meanwhile, event E79 is the argument of another event E76. Therefore, event E76 is a nested event, and it is in turn the argument of event E75. When multiple nested layers exist, extracting events becomes more difficult because errors in the lower layers could lead to errors in the upper layers.
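The layering described here can be made concrete with a short sketch that computes how deeply each event is nested and which events are affected when a lower-layer event is wrong. The event and protein IDs come from the paragraph above; the dictionary layout and the function names `nesting_depth`, `bottom_up_order`, and `affected_by` are assumptions made for illustration.

```python
# Each event maps to the IDs of its arguments (proteins or other events).
ARGUMENTS = {
    "E79": ["T74"],   # innermost event: its argument is a protein
    "E76": ["E79"],   # takes event E79 as its argument
    "E75": ["E76"],   # takes event E76 as its argument
}


def nesting_depth(event_id, arguments):
    """Return 0 for an event with only protein arguments, else 1 + deepest child."""
    child_events = [a for a in arguments[event_id] if a in arguments]
    if not child_events:
        return 0
    return 1 + max(nesting_depth(child, arguments) for child in child_events)


def bottom_up_order(arguments):
    """Order events so that every event comes after the events it contains."""
    return sorted(arguments, key=lambda e: nesting_depth(e, arguments))


def affected_by(bad_event, arguments):
    """Return the events that directly or indirectly take `bad_event` as an argument."""
    affected = set()
    changed = True
    while changed:
        changed = False
        for event_id, args in arguments.items():
            if event_id in affected:
                continue
            if any(a == bad_event or a in affected for a in args):
                affected.add(event_id)
                changed = True
    return affected


if __name__ == "__main__":
    print(bottom_up_order(ARGUMENTS))
    print(sorted(affected_by("E79", ARGUMENTS)))
```

In this example a mistake in E79 propagates to both E76 and E75, which is why errors in the lower layers are costly.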

2. Related works

To date, researchers have proposed many experimental methods to extract biomedical events based on

______________

* E-mail address: may@mail.hzau.edu.cn

