a hidden layer $h = (h_1, h_2, \ldots, h_M)$ that carries the latent fea-
ture representation. The connections between the nodes are bi-
directional, so given an input vector x one can obtain the latent
feature representation h and also vice versa. As such, the RBM is
a generative model, and we can sample from it and generate new
data points. In analogy to physical systems, an energy function is
defined for a particular state ( x, h ) of input and hidden units:
$E(x, h) = -h^{T} W x - c^{T} x - b^{T} h$, (9)
with c and b the bias terms of the visible and hidden units, respectively. The probability of the ‘state’ of the system
is defined by passing the energy to an exponential and normaliz-
ing:
$p(x, h) = \frac{1}{Z} \exp\{-E(x, h)\}$. (10)
Computing the partition function Z is generally intractable. How-
ever, conditional inference in the form of computing h conditioned
on x or vice versa is tractable and results in a simple formula:
$P(h_j \mid x) = \frac{1}{1 + \exp\{-b_j - W_j x\}}$. (11)
Since the network is symmetric, a similar expression holds for
$P(x_i \mid h)$.
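To make Eqs. (9)–(11) concrete, the following is a minimal NumPy sketch of conditional inference and one alternating Gibbs sampling step in an RBM. The layer sizes, the random initialization of W, and the zero bias vectors are illustrative assumptions rather than trained parameters.

import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 784, 256

# Illustrative (untrained) parameters: W couples visible and hidden units,
# c is the visible bias and b the hidden bias, as in Eq. (9).
W = rng.normal(scale=0.01, size=(n_hidden, n_visible))
c = np.zeros(n_visible)
b = np.zeros(n_hidden)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def p_h_given_x(x):
    # P(h_j = 1 | x) = sigmoid(b_j + W_j x), cf. Eq. (11)
    return sigmoid(b + W @ x)

def p_x_given_h(h):
    # The symmetric expression for the visible units, P(x_i = 1 | h)
    return sigmoid(c + W.T @ h)

def gibbs_step(x):
    # One alternating Gibbs step x -> h -> x', i.e. sampling from the generative model
    h = (rng.random(n_hidden) < p_h_given_x(x)).astype(float)
    x_new = (rng.random(n_visible) < p_x_given_h(h)).astype(float)
    return x_new, h

x0 = rng.integers(0, 2, size=n_visible).astype(float)  # a random binary input
x1, h1 = gibbs_step(x0)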
DBNs ( Bengio et al., 2007; Hinton et al., 2006 ) are essentially
SAEs where the AE layers are replaced by RBMs. Training of the
individual layers is, again, done in an unsupervised manner. Final
fine-tuning is performed by adding a linear classifier to the top
layer of the DBN and performing a supervised optimization.
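As an illustration only, such a DBN-style stack can be sketched with scikit-learn's BernoulliRBM: each RBM is trained, unsupervised, on the representation produced by the layer below it, and a linear classifier is then trained on top. The layer sizes, learning rates, and toy data below are assumptions, and a full DBN would additionally fine-tune all layers with backpropagation rather than only training the top classifier.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

# Toy data in [0, 1]; in practice these would be (patches of) medical images.
X = np.random.rand(200, 64)
y = np.random.randint(0, 2, size=200)

# Greedy layer-wise unsupervised training of the two RBMs, followed by
# supervised training of the logistic regression on the top-level features.
dbn = Pipeline([
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=10)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])
dbn.fit(X, y)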
2.6.3. Variational auto-encoders and generative adversarial networks
Recently, two novel unsupervised architectures were intro-
duced: the variational auto-encoder (VAE) ( Kingma and Welling,
2013 ) and the generative adversarial network (GAN) ( Goodfellow
et al., 2014 ). There are no peer-reviewed papers applying these
methods to medical images yet, but applications in natural images
are promising. We will elaborate on their potential in the discus-
sion.
2.7. Hardware and software
One of the main contributors to the steep rise of deep learn-
ing papers has been the widespread availability of GPUs and GPU-
computing libraries (CUDA, OpenCL). GPUs are highly parallel com-
puting engines, which have an order of magnitude more execution
threads than central processing units (CPUs). With current hard-
ware, deep learning on GPUs is typically 10–30 times faster than
on CPUs.
Next to hardware, the other driving force behind the popularity
of deep learning methods is the wide availability of open-source
software packages. These libraries provide efficient GPU implemen-
tations of important operations in neural networks, such as con-
volutions, allowing the user to implement ideas at a high level
rather than worrying about efficient implementations. At the time
of writing, the most popular packages were (in alphabetical order):
• Caffe (Jia et al., 2014). Provides C++ and Python interfaces, developed by graduate students at UC Berkeley.
• Tensorflow (Abadi et al., 2016). Provides C++ and Python interfaces, developed by Google and used by Google Research.
• Theano (Bastien et al., 2012). Provides a Python interface, developed by the MILA lab in Montreal.
• Torch (Collobert et al., 2011). Provides a Lua interface and is used by, among others, Facebook AI Research.
There are third-party packages written on top of one or more
of these frameworks, such as Lasagne ( https://github.com/Lasagne/
Lasagne ) or Keras ( https://keras.io/ ). It goes beyond the scope of
this paper to discuss all these packages in detail.
3. Deep learning uses in medical imaging
3.1. Classification
3.1.1. Image/exam classification
Image or exam classification was one of the first areas in which
deep learning made a major contribution to medical image analy-
sis. In exam classification, one typically has one or multiple images
(an exam) as input with a single diagnostic variable as output (e.g.,
disease present or not). In such a setting, every diagnostic exam is
a sample and dataset sizes are typically small compared to those
in computer vision (e.g., hundreds/thousands vs. millions of sam-
ples). The popularity of transfer learning for such applications is
therefore not surprising.
Transfer learning is essentially the use of pre-trained networks
(typically on natural images) to try to work around the (perceived)
requirement of large data sets for deep network training. Two
transfer learning strategies were identified: (1) using a pre-trained
network as a feature extractor and (2) fine-tuning a pre-trained
network on medical data. The former strategy has the extra ben-
efit of not requiring one to train a deep network at all, allow-
ing the extracted features to be easily plugged in to existing im-
age analysis pipelines. Both strategies are popular and have been
widely applied. However, few authors perform a thorough investigation into which strategy gives the best result. The two papers that
do, Antony et al. (2016) and Kim et al. (2016a) , offer conflicting re-
sults. In the case of Antony et al. (2016) , fine-tuning clearly outper-
formed feature extraction, achieving 57.6% accuracy in multi-class
grade assessment of knee osteoarthritis versus 53.4%. Kim et al.
(2016a), however, showed that using a CNN as a feature extractor outperformed fine-tuning in cytopathology image classification accuracy (70.5% versus 69.1%). If any guidance can be given as to which strategy might be most successful, we would refer the reader to
two recent papers, published in high-ranking journals, which fine-
tuned a pre-trained version of Google’s Inception v3 architecture
on medical data and achieved (near) human expert performance
( Esteva et al., 2017; Gulshan et al., 2016 ). As far as the authors are
aware, such results have not yet been achieved by simply using
pre-trained networks as feature extractors.
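As an illustration of the two strategies, the following is a hedged sketch using Keras (one of the third-party packages mentioned below) with an ImageNet pre-trained Inception v3 network. The input size, number of target classes, number of frozen layers, and optimizer settings are illustrative assumptions, not choices taken from the cited studies.

import numpy as np
from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.layers import Dense
from keras.models import Model

# Strategy (1): pre-trained network as a fixed feature extractor.
base = InceptionV3(weights="imagenet", include_top=False, pooling="avg")
dummy_batch = preprocess_input(np.random.rand(4, 299, 299, 3) * 255.0)
features = base.predict(dummy_batch)  # 2048-D features for a classical pipeline

# Strategy (2): fine-tuning the pre-trained network on medical data.
outputs = Dense(2, activation="softmax")(base.output)  # e.g. disease present / absent
model = Model(inputs=base.input, outputs=outputs)
for layer in base.layers[:-30]:  # optionally keep early layers frozen
    layer.trainable = False
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(train_images, train_labels, ...)  # supervised training on the medical data set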
With respect to the type of deep networks that are commonly
used in exam classification, a timeline similar to computer vision
is apparent. The medical imaging community initially focused on
unsupervised pre-training and network architectures like SAEs and
RBMs. The first papers applying these techniques for exam clas-
sification appeared in 2013 and focused on neuroimaging. Brosch
and Tam (2013) , Plis et al. (2014) , Suk and Shen (2013) , and Suk
et al. (2014) applied DBNs and SAEs to classify patients as having
Alzheimer’s disease based on brain Magnetic Resonance Imaging
(MRI). Recently, a clear shift towards CNNs can be observed. Out
of the 47 papers published on exam classification in 2015, 2016,
and 2017, 36 use CNNs, 5 are based on AEs, and 6 on RBMs.
The application areas of these methods are very diverse, ranging
from brain MRI to retinal imaging and digital pathology to lung
computed tomography (CT).
In the more recent papers using CNNs, authors also often train
their own network architectures from scratch instead of using
pre-trained networks. Menegola et al. (2016) performed some ex-
periments comparing training from scratch to fine-tuning of pre-
trained networks and showed that fine-tuning worked better given
a small data set of around 1000 images of skin lesions. However, these experiments are too small in scale to draw any general conclusions.
Three papers used an architecture leveraging the unique at-
tributes of medical data: two use 3D convolutions ( Hosseini-Asl
et al., 2016; Payan and Montana, 2015) instead of 2D to classify patients as having Alzheimer's disease; Kawahara et al. (2016b) applied a CNN-