Fine-Grained Patient Similarity Measurement with Deep Metric Learning


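"Applying deep metric learning to fine-grained patient similarity measurement"

This paper studies the use of deep metric learning to measure patient similarity at a fine granularity, which matters for healthcare applications such as cohort studies and comparative effectiveness research on treatments. Existing patient similarity methods rely mainly on supervised metric learning over electronic health records (EHRs). A key challenge for them is distinguishing patients across a large number of fine-grained disease categories. Deep metric learning has been notably successful in fine-grained image classification, but it cannot be applied directly to classifying patients with hierarchical disease labels. To address this, the paper proposes a three-layer deep metric learning framework for patient similarity, intended to capture the complexity and hierarchy of patient data and to quantify similarity between patients more precisely.

First, deep learning models typically stack multiple neural network layers that extract features at progressively higher levels of abstraction. In the three-layer framework, each layer may correspond to a different level of the disease labels: the top layer to broad disease categories, the middle layer to subcategories, and the bottom layer to more specific disease manifestations or symptoms. In this way the model can learn a multi-level representation of a patient's disease information.

Second, the core of deep metric learning is a loss function that optimizes how well the model separates and clusters patients. According to the abstract, the paper optimizes a quadruple loss improved from triplet loss. Losses of this family, like contrastive loss, are designed to pull similar patients closer together and push dissimilar patients farther apart.

In addition, because EHR data contain unstructured information, missing values, and outliers, the paper may discuss how to preprocess and clean the data and how to feed it into the deep learning model. This could include embedding text such as diagnosis descriptions into continuous vectors, or using a loss function designed to handle incomplete data.

Finally, the paper may present experiments comparing the proposed method with traditional methods on patient similarity measurement, possibly reporting metrics such as accuracy, recall, and F1 score, and analyzing performance across disease categories and data subsets.

Overall, the paper offers the medical big data field a new perspective: using deep metric learning for patient similarity measurement, particularly when disease labels are hierarchical. The approach has the potential to improve the precision of clinical decision support systems.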
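To make the idea of hierarchical disease labels concrete, here is a minimal sketch. It is not the paper's method. It assumes ICD-10 codes written like `I63.9` and treats the leading letter, the three-character category, and the full code as three levels of increasing specificity. The function names and the scoring rule (fraction of levels shared, then a symmetric best-match average over two patients' diagnosis sets) are illustrative assumptions.

```python
from __future__ import annotations


def icd10_levels(code: str) -> tuple[str, str, str]:
    """Split an ICD-10 code into (leading letter, 3-char category, full code)."""
    code = code.strip().upper()
    return code[0], code[:3], code


def code_similarity(a: str, b: str) -> float:
    """Fraction of the three levels, coarse to fine, on which two codes agree."""
    shared = 0
    for level_a, level_b in zip(icd10_levels(a), icd10_levels(b)):
        if level_a != level_b:
            break  # once a coarse level differs, finer levels cannot match
        shared += 1
    return shared / 3


def label_set_similarity(p: set[str], q: set[str]) -> float:
    """Symmetric best-match average between two patients' diagnosis code sets."""
    if not p or not q:
        return 0.0

    def one_way(src: set[str], dst: set[str]) -> float:
        # For each code in src, take its closest code in dst, then average.
        return sum(max(code_similarity(a, b) for b in dst) for a in src) / len(src)

    return 0.5 * (one_way(p, q) + one_way(q, p))
```

For example, `code_similarity("I63.9", "I63.4")` scores two subcategories of the same category higher than `code_similarity("I63.9", "I61.0")`, where the codes share only the leading letter.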

Fine-grained Patient Similarity Measuring using Deep Metric Learning

Jiazhi Ni (2,3), Jie Liu (1,2,*), Chenxin Zhang (2,3), Dan Ye, Zhirou Ma

State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences

Institute of Software, Chinese Academy of Sciences, Beijing, China

University of Chinese Academy of Sciences, Beijing, China

{nijiazhi14, ljie, zhangchenxin17, yedan, mazhirou}@otcaix.iscas.ac.cn

ABSTRACT

Patient similarity measuring plays a significant role in many healthcare applications, such as cohort studies and comparative effectiveness research on treatments. Existing methods mainly rely on supervised metric learning to study patient similarity from Electronic Health Records (EHRs), and they struggle to differentiate patients across a large number of fine-grained disease categories. Deep metric learning has had notable success in fine-grained image categorization; however, it cannot be directly applied to the classification of patients with hierarchical disease labels. In this paper, we present a novel three-layer patient similarity deep metric learning framework (PSDML) that optimizes a quadruple loss, improved from triplet loss, to learn an embedding distance for disease classification among patients. The contextual semantic relations among multiple diagnosis labels encoded in ICD-10 are taken into account when computing the supervised distance between patients. To address diagnosis class imbalance, patient tuples that violate the loss constraints of the deep metric learning framework are sampled preferentially, which accelerates the convergence of the neural network. We conducted a KNN multi-label classification experiment using the learned similarity metric on real EHRs of stroke patients collected by the Chinese Stroke Data Center. The results demonstrate substantial improvement over the baselines.

KEYWORDS

Patient Similarity, Distance Metric Learning, Deep Metric Learning, Multi Label Classification

*Corresponding Author.


CIKM'17, November 6–10, 2017, Singapore

ACM ISBN 978-1-4503-4918-5/17/11…$15.00

https://doi.org/10.1145/3132847.3133022

1 INTRODUCTION

Patient similarity measuring is a fundamental task in clinical decision support applications that draw on the Electronic Health Records (EHRs) of outpatient care, inpatient care, and medical research. The goal is to derive a clinically meaningful distance metric that measures the similarity between pairs of patients represented by their key clinical indicators. Fine-grained disease classification relies heavily on this underlying distance metric to correctly capture the relations among input EHRs. Consequently, we adapt deep metric learning methods from image and speech recognition to patient similarity measurement.
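As a point of reference for what "deriving a distance metric" means, the following minimal sketch shows a linear, Mahalanobis-style learned distance, d(x, y) = ||L(x - y)||. It is a generic illustration, not the paper's framework, which learns a nonlinear embedding. The matrix `L` and the two-element feature vectors are hypothetical; in practice `L` would be fitted from labeled patient pairs.

```python
import math


def linear_map(L, v):
    """Multiply matrix L (a list of rows) by vector v."""
    return [sum(l_ij * v_j for l_ij, v_j in zip(row, v)) for row in L]


def learned_distance(L, x, y):
    """Mahalanobis-style distance ||L(x - y)||, i.e. M = L^T L in the usual form."""
    diff = [a - b for a, b in zip(x, y)]
    z = linear_map(L, diff)
    return math.sqrt(sum(c * c for c in z))


# With the identity matrix the metric reduces to plain Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
# A hypothetical learned matrix that weights the first indicator twice as much.
weighted = [[2.0, 0.0], [0.0, 1.0]]
```

Learning the metric then amounts to choosing `L` so that patients with similar diagnoses end up close and dissimilar patients end up far apart.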

Deep metric learning has recently gained popularity, with remarkable success in image and speech recognition. Compared to standard distance metric learning, it learns a nonlinear embedding of the data using deep neural networks, and it has shown significant accuracy improvements by learning deep representations with contrastive loss or triplet loss in applications such as face recognition and image retrieval. However, in the medical field, existing deep metric learning frameworks based on contrastive loss or triplet loss cannot adequately describe patient similarity. Part of the problem is that using only one negative and one positive sample in each update ignores the interactions with other classes. Moreover, because patients carry multiple diagnosis labels and the labels (encoded in ICD-10) are semantically related to one another, traditional distance metric learning methods in the medical field, such as the Locally Supervised Metric Learning (LSML) algorithm [1] or Mahalanobis distance, cannot work effectively in real medical settings. Table 1 summarizes the typical information contained in our EHRs, including medical image conclusions and multiple diagnosis labels (some other events and time factors are omitted due to space limitations); the abbreviations are explained in Table 2. In this work, we adapt advanced deep metric learning methods from the image field to address the following questions:

• How to get supervised information by encoding the multiple diagnosis labels of one patient?
• How to solve the diagnosis class imbalance problem of the EHRs and ensure fast convergence of the deep neural network?
• How to construct the deep metric learning framework with a proper loss function definition for fine-grained disease classification?
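The triplet loss discussed above, and the idea of preferring samples that violate the loss constraint, can be sketched as follows. This is the standard hinge-style triplet loss with (non-squared) Euclidean distance, not the paper's quadruple loss, whose definition does not appear in this section. The single label per patient, the brute-force enumeration, and the function names are simplifying assumptions for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def violating_triplets(embeddings, labels, margin=1.0):
    """Return (anchor, positive, negative, loss) index tuples with positive loss,
    sorted so the worst violations come first."""
    found = []
    n = len(embeddings)
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # a positive must be a different sample of the same class
            for neg in range(n):
                if labels[neg] == labels[a]:
                    continue  # a negative must come from a different class
                loss = triplet_loss(embeddings[a], embeddings[p], embeddings[neg], margin)
                if loss > 0:
                    found.append((a, p, neg, loss))
    found.sort(key=lambda t: t[3], reverse=True)
    return found
```

Triplets with zero loss contribute no gradient, so training on the violating ones concentrates each update on the pairs the current embedding still gets wrong. The enumeration here is cubic in the number of samples and is only meant for small batches.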

