Candidate selection
Because we use a pair-wise method to select the temporal index for each medical entity, every medical entity must be paired with each temporal expression in the clinical note. However, most of these temporal expressions are unrelated to the current medical entity, which produces many negative samples and causes a data imbalance problem. Therefore, we construct a candidate selection module to generate a much smaller candidate set for each medical entity. By analyzing the Chinese clinical notes, we find that each clinical note contains some fixed sections, such as “chief complaint” and “history of present illness”, and that these sections are independent of each other. In addition, the occurrence times of these sections differ considerably: for instance, the medical entities in the “history of present illness” section almost always occur before the admission time, whereas those in the “conditions in discharge” section occur at the discharge time. Table 1 lists some sections and the occurrence times that we summarized from the Chinese clinical notes. Based on this observation, we select the section time and all temporal expressions in the corresponding section as the candidate times for each medical entity. Figure 2 shows the main flow of candidate selection. For each medical entity, we first determine the section it belongs to, and then collect all temporal expressions in this section together with the corresponding section time (as in Table 1) as the final candidates for this medical entity.
Figure 3 shows an example of candidate selection, in which a “History of present illness” section is presented with its section time (which is also the admission time), four other temporal expressions, and five medical entities. According to our candidate selection strategy, the section time and the four temporal expressions are all selected as candidates for each medical entity in this section.
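The candidate selection strategy can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `Section` container, the function name `select_candidates`, the toy note contents, and the partial `SECTION_TIME` mapping (a few rows copied from Table 1) are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field

# Partial section-to-time mapping following Table 1 (section name -> section time).
SECTION_TIME = {
    "History of present illness": "Admission",
    "Physical examination": "Admission",
    "Diagnosis and treatment": "Admission",
    "Conditions in discharge": "Discharge",
    "Discharge orders": "Discharge",
}


@dataclass
class Section:
    """Hypothetical container for one section of a clinical note."""
    name: str
    temporal_expressions: list = field(default_factory=list)
    medical_entities: list = field(default_factory=list)


def select_candidates(sections):
    """Map each (section name, entity) pair to its candidate times.

    The candidates are the section time plus every temporal expression
    found in the same section as the entity.
    """
    candidates = {}
    for section in sections:
        section_time = SECTION_TIME[section.name]
        pool = [section_time] + list(section.temporal_expressions)
        for entity in section.medical_entities:
            # Copy the pool so each entity owns an independent list.
            candidates[(section.name, entity)] = list(pool)
    return candidates


# Toy note with made-up contents (not taken from the paper's data).
note = [
    Section(
        name="History of present illness",
        temporal_expressions=["3 days ago", "yesterday"],
        medical_entities=["fever", "cough"],
    ),
    Section(
        name="Conditions in discharge",
        temporal_expressions=[],
        medical_entities=["stable condition"],
    ),
]

result = select_candidates(note)
```

In this toy note, each entity is paired only with the section time and the temporal expressions of its own section, rather than with every temporal expression in the note, which is what keeps the number of negative pairs small.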
Temporal relation classification
After the candidate selection step, we need to classify the temporal relation (NONE, SIMULTANEOUS, BEFORE or AFTER) of each pair of medical entity and temporal expression, which turns the task into a relation classification problem. In this paper, we propose a recurrent convolutional neural network (RNN-CNN) model for this task. As shown in Fig. 4, it contains four main layers: 1) Input layer, which takes the sentence containing the medical entity, the sentence containing the temporal expression, and the temporal relation features as input, and generates the representation of each word in a sentence and the representation of the features. 2) LSTM layer, which includes a forward LSTM and a backward LSTM; it takes the word representation sequence of a sentence as input and outputs a new word representation sequence that captures the context information of each word in the sentence. 3) CNN layer, which takes the word representation sequence output by the LSTM layer as input and generates the representation of the medical entity or temporal expression. 4) Output layer, which concatenates the representations of the medical entity, the temporal expression, and the relation features through a hidden layer, and predicts the type of temporal relation with the softmax function.
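The output layer can be made concrete with a small pure-Python sketch: it concatenates an entity representation, a temporal expression representation, and a feature vector, passes the result through one hidden layer, and applies softmax over the four relation types. The vector sizes, the weight values, the tanh activation, the omission of bias terms, and all function names are assumptions chosen for illustration; the text above specifies only the concatenation, the hidden layer, and the softmax.

```python
import math

RELATIONS = ["NONE", "SIMULTANEOUS", "BEFORE", "AFTER"]


def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_relation(entity_repr, time_repr, feature_repr, w_hidden, w_out):
    """Concatenate the three representations, apply a hidden layer, then softmax."""
    joint = entity_repr + time_repr + feature_repr  # list concatenation
    # tanh is an assumed activation; bias terms are omitted for brevity.
    hidden = [math.tanh(h) for h in matvec(w_hidden, joint)]
    probs = softmax(matvec(w_out, hidden))
    best = max(range(len(RELATIONS)), key=lambda i: probs[i])
    return RELATIONS[best], probs


# Toy inputs and weights (arbitrary values, not learned parameters).
entity_repr = [0.2, -0.1]
time_repr = [0.4, 0.3]
feature_repr = [1.0]
w_hidden = [  # 3 hidden units x 5 concatenated inputs
    [0.5, -0.2, 0.1, 0.3, 0.0],
    [-0.3, 0.8, 0.2, -0.1, 0.4],
    [0.1, 0.1, -0.5, 0.6, -0.2],
]
w_out = [  # 4 relation types x 3 hidden units
    [0.2, -0.4, 0.1],
    [0.7, 0.1, -0.3],
    [-0.5, 0.9, 0.2],
    [0.3, 0.3, 0.8],
]

label, probs = predict_relation(entity_repr, time_repr, feature_repr, w_hidden, w_out)
```

The returned `probs` list holds one probability per relation type in the order of `RELATIONS`, and `label` is the type with the highest probability.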
The main purpose of the LSTM and CNN in our model is to learn a representation for each medical entity and temporal expression from their corresponding sentences. The LSTM is used to learn the context information of each word and the long-distance dependencies between words. In other words, the representation of each word output by the LSTM not only contains the specific information of the current word but also carries global information about the sentence. The CNN is then applied to extract the significant features from the word representation sequence, eliminate noise and redundant information, and finally generate a representation for each medical entity and temporal expression. The four layers are described in more detail in the following sections.
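The sketch below illustrates one way a CNN layer can turn a variable-length sequence of word representations into a single fixed-size representation. It assumes a one-dimensional convolution with a window of two words followed by max-over-time pooling; the pooling step, the window size, the absence of bias and activation, and the toy numbers are assumptions for illustration, since the text above does not specify them.

```python
def conv1d(sequence, filters, window):
    """Slide each filter over consecutive windows of word vectors.

    sequence: list of word vectors (each a list of floats)
    filters:  list of flat weight lists, each of length window * dim
    Returns one list of filter responses per window position.
    """
    outputs = []
    for start in range(len(sequence) - window + 1):
        # Flatten the word vectors inside the current window.
        patch = [x for vec in sequence[start:start + window] for x in vec]
        outputs.append([sum(w * x for w, x in zip(f, patch)) for f in filters])
    return outputs


def max_over_time(feature_maps):
    """Keep the largest response of each filter across all positions."""
    return [max(column) for column in zip(*feature_maps)]


# Toy sequence of four 2-dimensional word representations.
sequence = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.5, 0.5]]
# Two filters, each covering a window of 2 words (2 * 2 = 4 weights).
filters = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]

features = conv1d(sequence, filters, window=2)
representation = max_over_time(features)  # [3.5, 5.0]
```

Whatever the sentence length, the pooled result has one value per filter, so sentences of different lengths yield representations of the same size.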
Input layer
Our RNN-CNN model learns the representations of the medical entity and the temporal expression from the sentences to which they respectively belong. Given a sentence $S = (w_1, w_2, \cdots, w_n)$ of words $w_t$ ($1 \le t \le n$) that contains the medical entity (or temporal expression) word $w_k$, let $P = (d_1, d_2, \cdots, d_n)$ be the sequence of positions of the words, where $d_t = t - k$ ($1 \le t \le n$). Then, the representation $x_t$ of word $w_t$ can be calculated by:

$$x_t = \left[ E_w \cdot \vec{w_t} \,;\; E_d \cdot \vec{d_t} \right] \qquad (1)$$

where $E_w$ and $E_d$ are the embedding matrices for words and positions, and $\vec{w_t}$ and $\vec{d_t}$ are the one-hot vectors for word $w_t$ and position $d_t$, respectively. The $E_w$ matrix
Table 1 Sections and corresponding occurred times in the Chinese clinical notes
Section name | Occurred time | Relation
Chief complaint, History of present illness, Past medical history, Personal history, Conditions in admission | Admission | Before
Physical examination, Assistant examination, Preliminary diagnosis, Diagnosis on admission | Admission | Simultaneous
Diagnosis and treatment | Admission | After
Conditions in discharge, Diagnosis on discharge | Discharge | Simultaneous
Discharge orders | Discharge | After
Liu et al. BMC Medical Informatics and Decision Making 2019, 19(Suppl 1):17 Page 33 of 71