
rsif.royalsocietypublishing.org

Headline review

Cite this article: Ching T et al. 2018 Opportunities and obstacles for deep learning in biology and medicine. J. R. Soc. Interface 15: 20170387. http://dx.doi.org/10.1098/rsif.2017.0387

Received: 26 May 2017
Accepted: 7 March 2018

Subject Category: Reviews
Subject Areas: bioinformatics, computational biology
Keywords: deep learning, genomics, precision medicine, machine learning

Authors for correspondence:
Anthony Gitter, e-mail: gitter@biostat.wisc.edu
Casey S. Greene, e-mail: greenescientist@gmail.com

† Author order was determined with a randomized algorithm.
Opportunities and obstacles for deep learning in biology and medicine

Travers Ching^1,†, Daniel S. Himmelstein^2, Brett K. Beaulieu-Jones^3, Alexandr A. Kalinin^4, Brian T. Do^5, Gregory P. Way^2, Enrico Ferrero^6, Paul-Michael Agapow^7, Michael Zietz^2, Michael M. Hoffman^8,9,10, Wei Xie^11, Gail L. Rosen^12, Benjamin J. Lengerich^13, Johnny Israeli^14, Jack Lanchantin^17, Stephen Woloszynek^12, Anne E. Carpenter^18, Avanti Shrikumar^15, Jinbo Xu^19, Evan M. Cofer^20,21, Christopher A. Lavender^22, Srinivas C. Turaga^23, Amr M. Alexandari^15, Zhiyong Lu^24, David J. Harris^25, Dave DeCaprio^26, Yanjun Qi^17, Anshul Kundaje^15,16, Yifan Peng^24, Laura K. Wiley^27, Marwin H. S. Segler^28, Simina M. Boca^29, S. Joshua Swamidass^30, Austin Huang^31, Anthony Gitter^32,33 and Casey S. Greene^2

1 Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, HI, USA
2 Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, and 3 Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
4 Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
5 Harvard Medical School, Boston, MA, USA
6 Computational Biology and Stats, Target Sciences, GlaxoSmithKline, Stevenage, UK
7 Data Science Institute, Imperial College London, London, UK
8 Princess Margaret Cancer Centre, Toronto, Ontario, Canada
9 Department of Medical Biophysics and 10 Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
11 Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA
12 Ecological and Evolutionary Signal-processing and Informatics Laboratory, Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, USA
13 Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
14 Biophysics Program, 15 Department of Computer Science, and 16 Department of Genetics, Stanford University, Stanford, CA, USA
17 Department of Computer Science, University of Virginia, Charlottesville, VA, USA
18 Imaging Platform, Broad Institute of Harvard and MIT, Cambridge, MA, USA
19 Toyota Technological Institute at Chicago, Chicago, IL, USA
20 Department of Computer Science, Trinity University, San Antonio, TX, USA
21 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
22 Integrative Bioinformatics, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, USA
23 Howard Hughes Medical Institute, Janelia Research Campus, Ashburn, VA, USA
24 National Center for Biotechnology Information and National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
25 Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, FL, USA
26 ClosedLoop.ai, Austin, TX, USA
27 Division of Biomedical Informatics and Personalized Medicine, University of Colorado School of Medicine, Aurora, CO, USA
28 Institute of Organic Chemistry, Westfälische Wilhelms-Universität Münster, Münster, Germany
29 Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA
30 Department of Pathology and Immunology, Washington University in Saint Louis, St Louis, MO, USA
31 Department of Medicine, Brown University, Providence, RI, USA
32 Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
33 Morgridge Institute for Research, Madison, WI, USA

© 2018 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

TC, 0000-0002-5577-3516; DSH, 0000-0002-3012-7446;
BKB, 0000-0002-6700-1468; AAK, 0000-0003-4563-3226;
BTD, 0000-0003-4992-2623; GPW, 0000-0002-0503-9348;
EF, 0000-0002-8362-100X; P-MA, 0000-0003-1126-1479;
MZ, 0000-0003-0539-630X; MMH, 0000-0002-4517-1562;
WX, 0000-0002-1871-6846; GLR, 0000-0003-1763-5750;
BJL, 0000-0001-8690-9554; JI, 0000-0003-1633-5780; JL, 0000-0003-0811-0944;
SW, 0000-0003-0568-298X; AEC, 0000-0003-1555-8261;
AS, 0000-0002-6443-4671; JX, 0000-0001-7111-4839;
EMC, 0000-0003-3877-0433; CAL, 0000-0002-7762-1089;
SCT, 0000-0003-3247-6487; AMA, 0000-0001-8655-8109;
ZL, 0000-0001-9998-916X; DJH, 0000-0003-3332-9307;
DD, 0000-0001-8931-9461; YQ, 0000-0002-5796-7453; AK, 0000-0003-3084-2287;
YP, 0000-0001-9309-8331; LKW, 0000-0001-6681-9754;
MHSS, 0000-0001-8008-0546; SMB, 0000-0002-1400-3398;
SJS, 0000-0003-2191-0778; AH, 0000-0003-1349-4030;
AG, 0000-0002-5324-9833; CSG, 0000-0001-8713-9213
Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems—patient classification, fundamental biological processes and treatment of patients—and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.
1. Introduction to deep learning
Biology and medicine are rapidly becoming data-intensive. A recent comparison of genomics with social media, online videos and other data-intensive disciplines suggests that genomics alone will equal or surpass other fields in data generation and analysis within the next decade [1]. The volume and complexity of these data present new opportunities, but also pose new challenges. Automated algorithms that extract meaningful patterns could lead to actionable knowledge and change how we develop treatments, categorize patients or study diseases, all within privacy-critical environments.
The term deep learning has come to refer to a collection of new techniques that, together, have demonstrated breakthrough gains over existing best-in-class machine learning algorithms across several fields. For example, over the past 5 years, these methods have revolutionized image classification and speech recognition due to their flexibility and high accuracy [2]. More recently, deep learning algorithms have shown promise in fields as diverse as high-energy physics [3], computational chemistry [4], dermatology [5] and translation among written languages [6]. Across fields, 'off-the-shelf' implementations of these algorithms have produced comparable or higher accuracy than previous best-in-class methods that required years of extensive customization, and specialized implementations are now being used at industrial scales.
Deep learning approaches grew from research on artificial neurons, which were first proposed in 1943 [7] as a model for how the neurons in a biological brain process information. The history of artificial neural networks—referred to as 'neural networks' throughout this article—is interesting in its own right [8]. In neural networks, inputs are fed into the input layer, which feeds into one or more hidden layers, which eventually link to an output layer. A layer consists of a set of nodes, sometimes called 'features' or 'units', which are connected via edges to the immediately earlier and the immediately deeper layers. In some special neural network architectures, nodes can connect to themselves with a delay. The nodes of the input layer generally consist of the variables being measured in the dataset of interest—for example, each node could represent the intensity value of a specific pixel in an image or the expression level of a gene in a specific transcriptomic experiment. The neural networks used for deep learning have multiple hidden layers. Each layer essentially performs feature construction for the layers before it. The training process used often allows layers deeper in the network to contribute to the refinement of earlier layers. For this reason, these algorithms can automatically engineer features that are suitable for many tasks and customize those features for one or more specific tasks.
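
As a concrete illustration of this layered feature construction, the following sketch computes the forward pass of a small multilayer network in NumPy. The layer sizes, random weights and gene-expression framing are illustrative assumptions, not drawn from any particular study.

```python
# A minimal sketch of a feed-forward network's forward pass (assumed
# sizes and untrained random weights, for illustration only).
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # A common nonlinearity applied at each hidden layer.
    return np.maximum(0.0, x)

# Input layer: e.g. 100 gene-expression measurements for one sample.
x = rng.normal(size=100)

# Weights and biases for two hidden layers and one output node.
W1, b1 = rng.normal(size=(64, 100)) * 0.1, np.zeros(64)
W2, b2 = rng.normal(size=(32, 64)) * 0.1, np.zeros(32)
W3, b3 = rng.normal(size=(1, 32)) * 0.1, np.zeros(1)

h1 = relu(W1 @ x + b1)   # first hidden layer: features built from raw inputs
h2 = relu(W2 @ h1 + b2)  # second hidden layer: features built from features
y = 1.0 / (1.0 + np.exp(-(W3 @ h2 + b3)))  # output node, e.g. a class probability
print(y)
```

In training, all of the weight matrices are adjusted jointly, which is how deeper layers can contribute to refining the features computed by earlier ones.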
Deep learning does many of the same things as more familiar machine learning approaches. In particular, deep learning approaches can be used both in supervised applications—where the goal is to accurately predict one or more labels or outcomes associated with each data point—in the place of regression approaches, as well as in unsupervised, or 'exploratory', applications—where the goal is to summarize, explain or identify interesting patterns in a dataset—as a form of clustering. Deep learning methods may, in fact, combine both of these steps. When sufficient data are available and labelled, these methods construct features tuned to a specific problem and combine those features into a predictor. In fact, if the dataset is 'labelled' with binary classes, a simple neural network with no hidden layers and no cycles between units is equivalent to logistic regression if the output layer is a sigmoid (logistic) function of the input layer. Similarly, for continuous outcomes, linear regression can be seen as a single-layer neural network. Thus, in some ways, supervised deep learning approaches can be seen as an extension of regression models that allow for greater flexibility and are especially well suited for modelling nonlinear relationships among the input features. Recently, hardware improvements and very large training datasets have allowed these deep learning techniques to surpass other machine learning algorithms for many problems. In a famous and early example, scientists from Google demonstrated that a neural network 'discovered' that cats, faces and pedestrians were important components of online videos [9] without being told to look for them. What if, more generally, deep learning takes advantage of the growth of data in biomedicine to tackle challenges in this field? Could these algorithms identify the 'cats' hidden in our data—the patterns unknown to the researcher—and suggest ways to act on them? In this review, we examine deep learning's application to biomedical science and discuss the unique challenges that biomedical data pose for deep learning methods.
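
The regression equivalence noted above can be made explicit: with no hidden layers, a sigmoid output node computes sigma(w·x + b), and minimizing the cross-entropy loss by gradient descent recovers the maximum-likelihood logistic-regression fit. The sketch below uses synthetic data; the learning rate and iteration count are illustrative choices.

```python
# A sketch of the equivalence between a zero-hidden-layer neural network
# with a sigmoid output node and logistic regression (synthetic data).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
true_w = np.array([1.5, -2.0, 0.5])
# Labels drawn with probability sigmoid(X @ true_w).
y = (1.0 / (1.0 + np.exp(-(X @ true_w))) > rng.uniform(size=500)).astype(float)

w, b = np.zeros(3), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # the sigmoid output node
    w -= 0.5 * (X.T @ (p - y)) / len(y)     # cross-entropy gradient step
    b -= 0.5 * np.mean(p - y)

print(w, b)  # approaches the maximum-likelihood logistic-regression coefficients
```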
Several important advances make the current surge of work done in this area possible. Easy-to-use software packages have brought the techniques of the field out of the specialist's toolkit to a broad community of computational scientists. Additionally, new techniques for fast training have enabled their application to larger datasets [10]. Dropout of nodes, edges and layers makes networks more robust, even when the number of parameters is very large. Finally, the larger datasets now available are also sufficient for fitting the many parameters that exist for deep neural networks. The convergence of these factors currently makes deep learning extremely adaptable and capable of addressing the nuanced differences of each domain to which it is applied.
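
Dropout, mentioned above, is simple to sketch: during training, each node's activation is zeroed with some probability, so the network cannot rely too heavily on any single feature. The drop probability and the 'inverted' rescaling shown here are common illustrative choices, not a prescription.

```python
# A minimal sketch of inverted dropout applied to hidden-layer activations.
import numpy as np

rng = np.random.default_rng(2)

def dropout(h, p=0.5, training=True):
    # At inference time the layer is left unchanged.
    if not training:
        return h
    # Zero each unit with probability p; rescale survivors by 1/(1-p) so the
    # expected activation matches between training and inference.
    mask = rng.uniform(size=h.shape) >= p
    return h * mask / (1.0 - p)

h = rng.normal(size=8)    # hidden-layer activations
print(dropout(h))         # roughly half the units are silenced each pass
```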
This review discusses recent work in the biomedical domain, and most successful applications select neural network architectures that are well suited to the problem at hand. We sketch out a few simple example architectures in figure 1. If data have a natural adjacency structure, a convolutional neural network (CNN) can take advantage of that structure by emphasizing local relationships, especially when convolutional layers are used in early layers of the neural network. Other neural network architectures such as autoencoders require no labels and are now regularly used for unsupervised tasks. In this review, we do not exhaustively discuss the different types of deep neural network architectures; an overview of the principal terms used herein is given in table 1. Table 1 also provides select example applications, though in practice each neural network architecture has been broadly applied across multiple types of biomedical data. A recent book from Goodfellow et al. [11] covers neural network architectures in detail, and LeCun et al. [2] provide a more general introduction.
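
To give a rough sense of how a convolutional layer exploits adjacency structure, the sketch below slides one filter along a one-hot-encoded DNA sequence, so each output depends only on a short local window. The encoding, filter width and random weights are illustrative assumptions; a trained CNN would learn many such filters, which often come to resemble sequence motifs.

```python
# A sketch of a single 1D convolutional filter over one-hot-encoded DNA.
import numpy as np

rng = np.random.default_rng(3)

seq = "ACGTACGTGGCCAATT"
alphabet = "ACGT"
# One-hot encoding: 4 channels (A, C, G, T) by sequence length.
onehot = np.array([[1.0 if base == a else 0.0 for base in seq] for a in alphabet])

filt = rng.normal(size=(4, 6))  # one filter spanning 6 positions (illustrative)

# Slide the filter along the sequence; each activation sees only a local window.
L, k = onehot.shape[1], filt.shape[1]
activations = np.array(
    [np.sum(onehot[:, i:i + k] * filt) for i in range(L - k + 1)]
)
print(activations.shape)  # one activation per window position
```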
While deep learning shows increased flexibility over other machine learning approaches, as seen in the remainder of this review, it requires large training sets in order to fit the hidden layers, as well as accurate labels for the supervised learning applications. For these reasons, deep learning has recently become popular in some areas of biology and medicine, while having lower adoption in other areas. At the same time, this highlights the potentially even larger role that it may play in future research, given the increases in data in all biomedical fields. It is also important to see it as a branch of machine learning and acknowledge that it has the same limitations as other approaches in that field. In particular, the results are still dependent on the underlying study design and the usual caveats of correlation versus causation still apply—a more precise answer is only better than a less precise one if it answers the correct question.
1.1. Will deep learning transform the study of human disease?

With this review, we ask the question: what is needed for deep learning to transform how we categorize, study and treat individuals to maintain or restore health? We choose a high bar for 'transform'. Grove [12], the former CEO of Intel, coined the term Strategic Inflection Point to refer to a change in technologies or environment that requires a business to be fundamentally reshaped. Here, we seek to identify whether deep learning is an innovation that can induce a Strategic Inflection Point in the practice of biology or medicine.
There are already a number of reviews focused on applications of deep learning in biology [13–17], healthcare [18–20] and drug discovery [4,21–23]. Under our guiding question, we sought to highlight cases where deep learning enabled researchers to solve challenges that were previously considered infeasible or made difficult, tedious analyses routine. We also identified approaches that researchers are using to sidestep challenges posed by biomedical data. We find that domain-specific considerations have greatly influenced how to best harness the power and flexibility of deep learning. Model interpretability is often critical. Understanding the patterns in data may be just as important as fitting the data. In addition, there are important and pressing questions about how to build networks that efficiently represent the underlying structure and logic of the data. Domain experts can play important roles in designing networks to represent data appropriately, encoding the most salient prior knowledge and assessing success or failure. There is also great potential to create deep learning systems that augment biologists and clinicians by prioritizing experiments or streamlining tasks that do not require expert judgement. We have divided the large range of topics into three broad classes: disease and patient categorization, fundamental biological study and treatment of patients.
Figure 1. Neural networks come in many different forms. Left: a key for the types of nodes used in neural networks (input node, hidden node, output node, and an output node trained to match the input); edges connecting nodes in different layers, or creating cycles within layers, correspond to inputs to mathematical functions. Simple FFNN: a feed-forward neural network in which inputs are connected via some function to an output node and the model is trained to produce some output for a set of inputs. MLP: the multilayer perceptron is a feed-forward neural network in which there is at least one hidden layer between the input and output nodes. CNN: the convolutional neural network is a feed-forward neural network in which the inputs are grouped spatially into hidden nodes. In the case of this example, each input node is only connected to hidden nodes alongside their neighbouring input node. Autoencoder: a type of MLP in which the neural network is trained to produce an output that matches the input to the network. RNN: a deep recurrent neural network is used to allow the neural network to retain memory over time or sequential inputs. This figure was inspired by the Neural Network Zoo by Fjodor van Veen.
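
The memory described for the RNN in figure 1 comes from a cycle in the hidden layer: the same update is applied at every step of a sequence, with the hidden state carrying information forward. The sizes and random weights in this sketch are illustrative assumptions, not a trained model.

```python
# A minimal sketch of the recurrent update behind an RNN.
import numpy as np

rng = np.random.default_rng(5)

n_in, n_hidden = 3, 8
W_x = rng.normal(size=(n_hidden, n_in)) * 0.1      # input-to-hidden weights
W_h = rng.normal(size=(n_hidden, n_hidden)) * 0.1  # hidden-to-hidden cycle

h = np.zeros(n_hidden)                   # hidden state: the network's memory
sequence = rng.normal(size=(10, n_in))   # e.g. ten time points of clinical data

for x_t in sequence:
    # The same weights are reused at every step; h summarizes the past.
    h = np.tanh(W_x @ x_t + W_h @ h)

print(h)  # the final state depends on the whole sequence, not just the last input
```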

Table 1. Glossary.

supervised learning: machine learning approaches with the goal of predicting labels or outcomes.

unsupervised learning: machine learning approaches with the goal of data summarization or pattern identification.

neural network (NN): a machine learning approach inspired by biological neurons, in which inputs are fed into one or more layers that produce an output layer.

deep neural network: an NN with multiple hidden layers. Training happens over the whole network, so such architectures allow feature construction to occur alongside optimization of the overall training objective.

feed-forward neural network (FFNN): an NN that does not have cycles between nodes in the same layer. Most of the architectures below are special cases of FFNNs, except recurrent neural networks.

multilayer perceptron (MLP): a type of FFNN with at least one hidden layer, where each deeper layer is a nonlinear function of each earlier layer. MLPs do not impose structure and are frequently used when there is no natural ordering of the inputs (e.g. as with gene expression measurements).

convolutional neural network (CNN): an NN with layers in which connectivity preserves local structure. If the data meet the underlying assumptions, performance is often good, and such networks can require fewer examples to train effectively because they have fewer parameters and also provide improved efficiency. CNNs are used for sequence data, such as DNA sequences, or grid data, such as medical and microscopy images.

recurrent neural network (RNN): a neural network with cycles between nodes within a hidden layer. The RNN architecture is used for sequential data, such as clinical time series, text or genome sequences.

LSTM neural network: this special type of RNN has features that enable models to capture longer-term dependencies. LSTMs are gaining a substantial foothold in the analysis of natural language and may become more widely applied to biological sequence data.

autoencoder (AE): an NN whose training objective is to minimize the error between the output layer and the input layer. Such neural networks are unsupervised and are often used for dimensionality reduction. Autoencoders have been used for unsupervised analysis of gene expression data as well as data extracted from the EHR.

variational autoencoder (VAE): this special type of generative AE learns a probabilistic latent-variable model. VAEs have been shown to often produce meaningful reduced representations in the imaging domain, and some early publications have used VAEs to analyse gene expression data.

denoising autoencoder (DA): this special type of AE adds noise to the input during training. The denoising step acts as smoothing and may allow effective use on input data that are inherently noisy. Like AEs, DAs have been used for unsupervised analysis of gene expression data as well as data extracted from the EHR.

generative neural network: neural networks in this class can be used to generate data similar to the input data, and the models can be sampled to produce hypothetical examples. A number of the unsupervised architectures summarized here can be used in a generative fashion.

restricted Boltzmann machine (RBM): a generative NN that forms the building block for many deep learning approaches, having a single input layer and a single hidden layer, with no connections between the nodes within each layer. RBMs have been applied to combine multiple types of omic data (e.g. DNA methylation, mRNA expression and miRNA expression).

(Continued.)
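
To make the autoencoder entries above concrete, the following sketch trains a linear, tied-weight autoencoder to reconstruct its input by gradient descent; such a model recovers a PCA-like low-dimensional representation. The data, dimensions and learning rate are synthetic, illustrative assumptions.

```python
# A minimal sketch of a linear autoencoder with tied encoder/decoder weights.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 50))       # e.g. 200 samples x 50 expression values

W = rng.normal(size=(50, 10)) * 0.1  # encoder: 50 inputs -> 10 latent features

for _ in range(500):
    Z = X @ W                        # encode into the low-dimensional layer
    X_hat = Z @ W.T                  # decode back with the tied weights
    E = X_hat - X                    # reconstruction error, the training signal
    grad = (X.T @ E @ W + E.T @ X @ W) / len(X)  # gradient of 0.5*||E||^2
    W -= 0.01 * grad

print(np.mean((X @ W @ W.T - X) ** 2))  # reconstruction error decreases
```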