Deep Learning-Based Temporal Indexing of Medical Entities in Chinese Clinical Notes


"中国临床笔记中医学实体的时间索引" 这篇研究论文主要关注的是在中文临床笔记中对医学实体进行时间索引的技术。时间索引是医学信息处理领域的一个重要任务，其目标是为临床笔记中的每个医学实体选择一个发生的时间或时间区间。这样做的目的是将所有医学实体按照统一的时间线进行索引，以便更好地理解临床笔记内容，并促进医学实体的进一步应用。过去几年，英文临床笔记中医学实体的时间关系识别已经有一些共享任务，如2012年的i2b2 NLP挑战、2015年和2016年的临床TempEval挑战。这些任务促进了基于启发式规则和机器学习方法的系统开发。然而，随着深度神经网络模型在包括关系分类在内的多个问题上展现出巨大潜力，研究者开始探索这些模型在时间索引任务中的应用。论文中，作者提出了一个循环卷积神经网络（RNN-CNN）模型用于执行时间索引任务。RNN（循环神经网络）通常用于处理序列数据，能够捕获上下文依赖，而CNN（卷积神经网络）则擅长捕捉局部特征。结合这两种模型，RNN-CNN模型可能能够有效地提取临床笔记中的时间信息并进行准确的索引。实验部分，作者可能会对比RNN-CNN模型与其他已有的方法，如基于规则的系统和传统的机器学习模型，以验证其性能。评估指标可能包括精确度、召回率和F1分数等。此外，论文可能还会探讨模型的优化策略，如训练技巧、超参数调整以及可能的数据预处理步骤，以提高模型的泛化能力。这篇研究论文贡献了一种新的深度学习方法，用于解决中文临床文本中医学实体的时间索引问题，这对于提升医疗信息的理解和分析效率具有重要意义。同时，这一工作也为后续的医学自然语言处理研究提供了新的思路和工具。

Candidate selection

Because we use a pair-wise method to select the temporal index for a medical entity, each medical entity must be paired with every temporal expression in the clinical note. However, most of these temporal expressions are unrelated to the current medical entity, which produces many negative samples and causes a data imbalance problem. We therefore construct a candidate selection module that generates a much smaller candidate set for each medical entity. By analyzing the Chinese clinical notes, we find that each note contains some constant sections, such as “chief complaint” and “history of present illness”, and that these sections are independent of each other. The occurrence times of these sections also differ considerably: for instance, the medical entities in the “history of present illness” section almost always occurred before the admission time, whereas those in the “conditions in discharge” section occurred at the discharge time. Table 1 lists some sections and the occurrence times that we summarized from the Chinese clinical notes. Based on this observation, we select the section time and all temporal expressions in the corresponding section as the candidate times for each medical entity. Figure 2 shows the main flow of candidate selection. For each medical entity, we first determine the section it belongs to, then collect all temporal expressions in that section together with the corresponding section time (as in Table 1) as the final candidates for this medical entity.

Figure 3 shows an example of candidate selection, in which a “History of present illness” section is presented with the section time (which is also the admission time), four other temporal expressions, and five medical entities. According to our candidate selection strategy, the section time and the four temporal expressions are all selected as candidates for each medical entity in this section.
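The candidate selection strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Section` class, the `select_candidates` function, and the example temporal expressions and entities are all invented here, and `SECTION_TIME` transcribes only three of the sections from Table 1.

```python
from dataclasses import dataclass, field

# Section name -> section time, following Table 1 (partial, for illustration).
SECTION_TIME = {
    "History of present illness": "Admission",
    "Physical examination": "Admission",
    "Conditions in discharge": "Discharge",
}


@dataclass
class Section:
    """Hypothetical container for one section of a clinical note."""
    name: str
    temporal_expressions: list = field(default_factory=list)
    entities: list = field(default_factory=list)


def select_candidates(section):
    """Map each medical entity in a section to its candidate times.

    Candidates = the section time plus every temporal expression
    that appears in the same section.
    """
    candidates = [SECTION_TIME[section.name]] + list(section.temporal_expressions)
    return {entity: list(candidates) for entity in section.entities}


# Invented example with the same shape as Figure 3: one section time,
# four temporal expressions, and five medical entities.
hpi = Section(
    name="History of present illness",
    temporal_expressions=["3 years ago", "1 month ago", "2 days ago", "this morning"],
    entities=["hypertension", "cough", "fever", "chest pain", "dyspnea"],
)
candidates = select_candidates(hpi)
for entity, times in candidates.items():
    print(entity, "->", times)
```

Each of the five entities is paired with five candidates instead of every temporal expression in the whole note, which is how the module reduces the number of negative pairs.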

Temporal relation classification

After the candidate selection step, we further classify the temporal relation (NONE, SIMULTANEOUS, BEFORE or AFTER) of each pair of medical entity and temporal expression, which turns the task into a relation classification problem. In this paper, a recurrent convolutional neural network (RNN-CNN) model is proposed for this task. As shown in Fig. 4, it contains four main layers: 1) Input layer, which takes the sentence of the medical entity, the sentence of the temporal expression and the temporal relation features as input, and generates the representation of each word in a sentence and the representation of the features. 2) LSTM layer, which includes a forward LSTM and a backward LSTM, takes the word representation sequence of a sentence as input, and outputs a new word representation sequence that captures the context information of each word in the sentence. 3) CNN layer, which takes the word representation sequence output by the LSTM layer as input and generates the representation of the medical entity or temporal expression. 4) Output layer, which concatenates the representations of the medical entity, the temporal expression and the relation features through a hidden layer, and predicts the type of temporal relation with the softmax function.
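As a rough sketch of the output layer only, the following pure-Python code concatenates three vectors, passes them through one hidden layer, and applies softmax over the four relation types. The vector sizes, the tanh activation, the random untrained weights, and all function names are assumptions made for illustration; the excerpt does not specify them.

```python
import math
import random

LABELS = ["NONE", "SIMULTANEOUS", "BEFORE", "AFTER"]


def linear(weights, bias, x):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    """Untrained placeholder parameters: (weights, bias)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_relation(entity_vec, time_vec, feature_vec, hidden, output):
    """Concatenate the inputs, apply a hidden layer, then softmax."""
    x = entity_vec + time_vec + feature_vec            # list concatenation
    h = [math.tanh(v) for v in linear(*hidden, x)]     # tanh is an assumption
    probs = softmax(linear(*output, h))
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs


rng = random.Random(0)
entity_vec = [0.2, -0.1, 0.4]    # stand-in for the CNN output for the entity
time_vec = [0.3, 0.0, -0.2]      # stand-in for the CNN output for the time
feature_vec = [1.0, 0.0]         # stand-in for the relation features
hidden = random_layer(4, 8, rng)
output = random_layer(len(LABELS), 4, rng)
label, probs = predict_relation(entity_vec, time_vec, feature_vec, hidden, output)
print(label, [round(p, 3) for p in probs])
```

Because the weights are random, the predicted label is meaningless here; the point is only the data flow from three representations to one probability distribution over the four relation types.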

The main purpose of the LSTM and CNN in our model is to learn a representation for each medical entity and temporal expression from their corresponding sentences. The LSTM is used to learn the context information of each word and the long-distance dependencies between words. In other words, the representation of each word output by the LSTM not only contains the particular information of the current word, but also implies the global information of the sentence. Then, the CNN is applied to extract the significant features from the word representation sequence, eliminate noise and redundant information, and finally generate a representation for each medical entity and temporal expression. The four layers are introduced in more detail in the following sections.
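One common way for a CNN to turn a variable-length sequence of word vectors into a single fixed-size representation is a 1-D convolution followed by max-over-time pooling. The sketch below shows that technique in pure Python. It is an assumption that the model pools this way: the excerpt does not state the window size, activation, or pooling method, and the input vectors here are toy stand-ins for LSTM outputs.

```python
def conv_max_pool(sequence, filters, window=3):
    """1-D convolution over a sequence of vectors, then max-over-time pooling.

    sequence: list of equal-length word vectors (e.g. LSTM outputs).
    filters:  list of (weights, bias); weights has window * dim entries.
    window:   odd window size; the sequence is zero-padded on both sides.
    Returns one value per filter, independent of the sequence length.
    """
    dim = len(sequence[0])
    pad = [[0.0] * dim] * (window // 2)
    padded = pad + sequence + pad
    pooled = []
    for weights, bias in filters:
        activations = []
        for start in range(len(sequence)):
            # Flatten the vectors inside the current window.
            flat = [v for vec in padded[start:start + window] for v in vec]
            score = sum(w * v for w, v in zip(weights, flat)) + bias
            activations.append(max(0.0, score))   # ReLU (assumed)
        pooled.append(max(activations))           # keep the strongest response
    return pooled


# Toy one-dimensional "word vectors" so the arithmetic is easy to follow.
sequence = [[1.0], [2.0], [3.0]]
filters = [
    ([1.0, 1.0, 1.0], 0.0),    # sums each window
    ([0.0, 1.0, 0.0], 0.0),    # copies the centre word
    ([-1.0, 0.0, 0.0], 0.0),   # negated left neighbour, clipped by ReLU
]
print(conv_max_pool(sequence, filters))
```

Max pooling keeps only the strongest response of each filter, which is one way to read the description of extracting significant features and discarding redundant information.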

Input layer

Our RNN-CNN model learns the representations of the medical entity and the temporal expression from the sentences to which they respectively belong. Given a sentence $S = (w_1, w_2, \cdots, w_n)$ with each word $w_t$ ($1 \le t \le n$), which contains the medical entity (or temporal expression) word $w_k$, $P = (d_1, d_2, \cdots, d_n)$ is the sequence of positions for each word, where $d_t = t - k$ ($1 \le t \le n$). Then, the representation $x_t$ of word $w_t$ can be calculated by:

$$x_t = [E_w \cdot \vec{w}_t ; E_d \cdot \vec{d}_t] \tag{1}$$

where $E_w$ and $E_d$ are the embedding matrices for words and positions, and $\vec{w}_t$ and $\vec{d}_t$ are the one-hot vectors for word $w_t$ and position $d_t$, respectively. The $E$ matrix

Table 1 Sections and corresponding occurrence times in the Chinese clinical notes

| Section name | Occurrence time | Relation |
| --- | --- | --- |
| Chief complaint, History of present illness, Past medical history, Personal history, Conditions in admission | Admission | Before |
| Physical examination, Assistant examination, Preliminary diagnosis, Diagnosis on admission | Admission | Simultaneous |
| Diagnosis and treatment | Admission | After |
| Conditions in discharge, Diagnosis on discharge | Discharge | Simultaneous |
| Discharge orders | Discharge | After |
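The input-layer computation in Eq. (1) can be illustrated with a small sketch. Multiplying an embedding matrix by a one-hot vector is the same as looking up one column, so the code below uses dictionaries as lookup tables and list concatenation for the `;` operator. The function names, the English example sentence, and the toy embedding values are invented for illustration.

```python
def relative_positions(n, k):
    """d_t = t - k for t = 1..n, using the 1-based indices of the text."""
    return [t - k for t in range(1, n + 1)]


def embed_sentence(words, k, word_emb, pos_emb):
    """x_t = [word embedding of w_t ; position embedding of d_t]."""
    positions = relative_positions(len(words), k)
    return [word_emb[w] + pos_emb[d] for w, d in zip(words, positions)]


# Invented example: the entity word "fever" is the third word, so k = 3.
words = ["patient", "developed", "fever", "yesterday"]
k = 3

# Toy lookup tables standing in for the word and position embedding matrices.
word_emb = {w: [float(i), float(i) / 10] for i, w in enumerate(words)}
pos_emb = {d: [d / 10] for d in range(-3, 4)}

x = embed_sentence(words, k, word_emb, pos_emb)
print(relative_positions(len(words), k))  # [-2, -1, 0, 1]
print(x[2])                               # representation of "fever"
```

The entity word itself always gets position 0, and every other word gets its signed distance to the entity, so the same sentence yields different inputs for different entities.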

Liu et al. BMC Medical Informatics and Decision Making 2019, 19(Suppl 1):17 Page 33 of 71
