Application of Bidirectional RNNs to Diagnosis Prediction from Electronic Health Records

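"Recurrent neural networks are applied to electronic health records to predict a patient's future health status; given historical records, they can predict diseases and medication use. However, the performance of conventional recurrent neural networks (RNNs) drops when sequences are long, and they ignore some characteristics of the patients themselves. To address this, the paper proposes using bidirectional RNNs to remember information from both past and future visits and adding patient characteristics to the model as auxiliary information. Experiments on real-world electronic health record datasets confirm that the model markedly improves prediction accuracy and yields clinically meaningful results."

In medicine, using artificial intelligence for disease prediction is an important task in current personalized healthcare research. Electronic health record (EHR) data, that is, a patient's series of visit records over time, cover diagnoses, medications, and basic patient information. These data are essential for predicting a patient's future health status. Existing methods usually rely on recurrent neural networks (RNNs), which are widely used because they can process sequential data. However, RNN performance degrades on long sequences, mainly because of the long-term dependency problem. In addition, RNN models often fail to take differences between individual patients into account.

To address these problems, the researchers propose using a bidirectional RNN (Bi-RNN). A Bi-RNN captures both the forward and the backward information in a sequence, and therefore captures the contextual relationships within a visit sequence more effectively. The researchers also introduce patient characteristics as side information, which helps the model form a more complete picture of the patient's condition. In this way, the model considers past visit information and also estimates likely future health trends, which makes its predictions more accurate.

The experimental results show that the proposed Bi-RNN model improves significantly on conventional diagnosis prediction methods on real electronic health record datasets, which suggests greater practical value in clinical settings. The higher prediction accuracy matters for identifying disease risk early, optimizing treatment plans, and improving patient outcomes. A bidirectional RNN combined with patient characteristics thus offers a stronger and more personalized tool for medical decision support systems.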



Abstract—

Predicting a patient's future health information from historical electronic health records (EHR) is a core task in personalized healthcare research. Patient EHR data consist of sequences of visits over time, where each visit contains multiple medical codes, including diagnosis, medication, and patient profile. Using historical data from the EHR, we can predict medical conditions and medication use. Existing works model EHR data using recurrent neural networks (RNNs). However, RNN-based approaches have certain limitations: their performance drops when the sequences are long, and they ignore some characteristics of the patients themselves. We propose using bidirectional RNNs to remember information from both past and future visits and adding some patient characteristics to the model as side information. Experimental results on real-world EHR datasets show that the proposed model remarkably improves prediction accuracy compared with existing diagnosis prediction approaches and provides clinically meaningful interpretation.

Index Terms—Component, electronic health records,

bidirectional recurrent neural networks, side information

I. INTRODUCTION

A common challenge in smart health is how to use large amounts of data to predict a visiting patient's diseases within a short period of time. Because of complicated processes, varied symptoms, and pathological tests, making the correct diagnosis is difficult, which delays proper treatment. Electronic health records (EHR), which consist of patient health data including demographics, diagnoses, procedures, and medications, have been used successfully in several predictive modeling tasks in healthcare [1]-[3]. EHR data are temporally sequenced by patient medical visits, each of which is represented by a set of high-dimensional clinical variables (i.e., medical codes). While medical forecasting models have been developed to predict expected demand, most existing works focus on specialized forecasting models or a single target. To model sequential EHR data, recurrent neural networks (RNNs) are used in the literature to obtain accurate and robust representations of patient visits in diagnosis prediction tasks [3], [5].
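To make the visit representation concrete, the sketch below shows one common way to turn a patient's visits into model input: each visit becomes a multi-hot vector over the vocabulary of medical codes. This is only an illustration under stated assumptions. The function names (`build_vocab`, `encode_visit`, `encode_patient`) and the toy ICD-10-style codes are hypothetical and are not taken from the paper or its dataset.

```python
from typing import Dict, List, Sequence


def build_vocab(patients: Sequence[Sequence[Sequence[str]]]) -> Dict[str, int]:
    """Assign an integer index to every distinct medical code, in sorted order."""
    codes = sorted({code for visits in patients for visit in visits for code in visit})
    return {code: index for index, code in enumerate(codes)}


def encode_visit(visit: Sequence[str], vocab: Dict[str, int]) -> List[int]:
    """Encode one visit (a set of codes) as a multi-hot vector of length len(vocab)."""
    vector = [0] * len(vocab)
    for code in visit:
        vector[vocab[code]] = 1
    return vector


def encode_patient(visits: Sequence[Sequence[str]], vocab: Dict[str, int]) -> List[List[int]]:
    """Encode a patient's visits, preserving their temporal order."""
    return [encode_visit(visit, vocab) for visit in visits]


# Hypothetical toy record with two visits (codes chosen only for illustration).
patient = [["I10", "E11"], ["E11", "N18"]]
vocab = build_vocab([patient])
encoded = encode_patient(patient, vocab)
print(vocab)    # {'E11': 0, 'I10': 1, 'N18': 2}
print(encoded)  # [[1, 1, 0], [1, 0, 1]]
```

With real EHR data the vocabulary typically holds thousands of codes, so each visit vector is high-dimensional and sparse, which is why learned visit representations are attractive.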

However, the predictive power of these models drops significantly when patient visit sequences are long. Further, these models usually ignore some characteristics of the patients themselves, among other factors. Although the association is not always strong, many diseases are associated with gender, family history, region, season, and so on. Bidirectional recurrent neural networks (BRNNs) [6], which can be trained using all available input information from both the past and the future, have been used to alleviate the problem of long sequences, thereby improving predictive performance. Following the method of collaborative filtering (CF), we use side information to reasonably interpret the importance of patients and medical codes in the prediction results. This side information can be obtained from the user profile and other sources. Some side information has proven useful for heart disease decisions [7], [8]. Hybrid CF methods, in which side information is integrated into matrix factorization to learn effective latent factors, have gained popularity in recent years [9], [10].

Manuscript received January 8, 2018; revised March 19, 2018.
The authors are with the College of Information Science & Technology, Hainan University, Haikou, China, and the State Key Laboratory of Marine Resource Utilization in the South China Sea, Hainan University (e-mail: muyangzi521@163.com, huangmx09.com, cyye@ustc.com, wuqingzhou@21cn.com).

We demonstrate that the proposed model achieves significantly higher prediction accuracy than other diagnosis prediction approaches on our datasets from Haikou People’s Hospital. In summary, our main contributions are as follows:

• We propose a new, simple, and powerful end-to-end model that can accurately predict future visits without relying on any expert medical knowledge.
• It models the patient’s visit information in both time order and reverse time order and employs side information as supplementary information.
• We show that the proposed model outperforms existing methods in diagnosis prediction on EHR datasets.
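As a rough illustration of the second contribution, the following Python sketch reads a visit sequence in time order and in reverse time order with two plain tanh RNNs, concatenates the two final hidden states with a side-information vector, and maps the result to a probability distribution over medical codes. This section does not specify the architecture, so everything here is an assumption: the plain tanh cell, the use of final states, the softmax output, the random untrained weights, and all names (`rnn_final_state`, `init_params`, `predict_next_visit`, the `side` features).

```python
import math
import random
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_final_state(visits: List[Vector], w_in: Matrix, w_rec: Matrix) -> Vector:
    """Plain tanh RNN, h_t = tanh(W_in x_t + W_rec h_{t-1}); returns the last h_t."""
    h = [0.0] * len(w_rec)
    for x in visits:
        h = [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]
    return h


def softmax(z: Vector) -> Vector:
    """Numerically stable softmax."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(n_codes: int, n_side: int, n_hidden: int, seed: int = 0) -> Dict[str, Matrix]:
    """Create separate weights for the forward and backward RNNs and the output layer."""
    rng = random.Random(seed)
    return {
        "w_in_f": rand_matrix(n_hidden, n_codes, rng),
        "w_rec_f": rand_matrix(n_hidden, n_hidden, rng),
        "w_in_b": rand_matrix(n_hidden, n_codes, rng),
        "w_rec_b": rand_matrix(n_hidden, n_hidden, rng),
        "w_out": rand_matrix(n_codes, 2 * n_hidden + n_side, rng),
    }


def predict_next_visit(visits: List[Vector], side: Vector, p: Dict[str, Matrix]) -> Vector:
    """Combine both reading directions with side information (assumed design)."""
    h_fwd = rnn_final_state(visits, p["w_in_f"], p["w_rec_f"])        # time order
    h_bwd = rnn_final_state(visits[::-1], p["w_in_b"], p["w_rec_b"])  # reverse time order
    features = h_fwd + h_bwd + side  # list concatenation, not element-wise addition
    return softmax(matvec(p["w_out"], features))


# Two multi-hot visits over three codes, plus two hypothetical profile features.
visits = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
side = [1.0, 0.0]
params = init_params(n_codes=3, n_side=2, n_hidden=4)
probs = predict_next_visit(visits, side, params)
print(probs)  # one probability per medical code
```

Because the weights are untrained, the printed probabilities carry no clinical meaning; the sketch only shows how the forward state, the backward state, and the side information can be combined into one feature vector.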

The rest of this paper is organized as follows. Section II discusses the connection between the proposed approach and related works. Section III details the proposed model. Section IV presents the experimental results. Section V concludes the paper.

II. RELATED WORKS

This section reviews existing work on mining EHR data. In particular, it focuses on several state-of-the-art models for diagnosis prediction. It also covers some works that use side information in CF-based methods.

A. EHR Data Mining

Mining EHR data is a popular topic in medical informatics. The investigated tasks include electronic genotyping and phenotyping [11], [12], disease progression [13], [14], diagnosis prediction [1], [2], [15], and so on. In most of these tasks, machine learning and deep neural network models can significantly improve performance.

Diagnosis prediction is an important and difficult task in medical informatics. Machine learning can remarkably improve performance, such as using the SVM algorithm in

Diagnosis Prediction via Recurrent Neural Networks

Yangzi Mu, Mengxing Huang, Chunyang Ye, and Qingzhou Wu

International Journal of Machine Learning and Computing, Vol. 8, No. 2, April 2018

doi: 10.18178/ijmlc.2018.8.2.673
