Physica Medica 83 (2021) 122–137
and time-of-flight (TOF) resolutions in PET. Image reconstruction al-
gorithms are being revisited through the introduction of deep learning
algorithms wherein the whole image reconstruction process or certain
critical components (analytical models) are being replaced by machine
learning models. A large body of literature is dedicated to quantitative
SPECT and PET imaging, aiming to reduce the impact of noise, artifacts,
and motion, or to correct for physical degrading factors, including
attenuation, Compton scattering, and partial volume effects. The lack of
straightforward techniques for generation of the attenuation map on
organ-specific standalone PET scanners or hybrid PET/MRI systems
inspired active scientists in the field to devise suitable strategies to
enhance the quantitative potential of molecular imaging. High-level
image processing tasks, such as segmentation, data interpretation,
image-based diagnostic and prognostic models as well as internal
dosimetry based on SPECT or PET imaging have substantially evolved
owing to the formidable power and versatility of deep learning
algorithms.
AI/DL-based solutions have been proposed to undertake certain tasks
belonging to the long chain of processes involved in image formation,
analysis, and extraction of quantitative features for the development of
disease-specific diagnosis/prognosis models from SPECT and PET im-
aging. In this review, the applications of AI/DL in these imaging mo-
dalities are summarized in six key sections focusing on the major
challenges/opportunities and seminal contributions in the field. A
concise overview of machine learning methods, in particular deep
learning approaches, is presented in section 2. The following section
describes AI-based techniques employed in PET instrumentation, image
acquisition and formation, image reconstruction and low-dose scanning,
quantitative imaging (attenuation and scatter corrections), image analysis
and computer-aided detection/diagnosis/prognosis, as well as internal
radiation dosimetry. The last section provides a perspective on the
major challenges and opportunities for AI/DL-based solutions in PET
and SPECT imaging.
Principles of machine learning and deep learning
Machine learning algorithms are considered a subset of non-
symbolic artificial intelligence, which aims to automatically recognize
patterns and create/extract desirable representations from raw data
[4]. In classical machine learning algorithms, the system attempts to learn
certain patterns from previously extracted features. By contrast, in deep
learning algorithms, a subtype of machine learning techniques, feature
extraction, feature selection, and the ultimate tasks of classification or regression are
carried out automatically in one step [5]. Different deep learning algo-
rithms have been proposed and applied in nuclear medicine [2,6],
including convolutional neural networks (CNNs) [7,8] and generative
adversarial networks (GANs) [5]. Some applications of machine
learning algorithms, such as classification, segmentation, and image-to-
image translation, have attracted more attention [9].
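As a minimal illustration (not drawn from the cited works) of the convolution operation at the heart of CNNs, the numpy sketch below applies a single handcrafted edge-detecting kernel; in an actual CNN, many such kernels are learned from data during training rather than specified by hand:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel; a trained CNN would learn such filters itself.
kernel = np.array([[1.0, 0.0, -1.0]] * 3)
image = np.zeros((6, 6))
image[:, 3:] = 1.0                   # step edge between columns 2 and 3
feature_map = conv2d(image, kernel)  # responds strongly along the edge
```

Stacking many such learned filters, interleaved with nonlinearities and pooling, is what allows CNNs to build task-relevant features directly from raw images.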
A number of deep learning architectures became popular in the field
of medical image analysis, including convolutional encoder-decoder
(CED) networks consisting of encoder and decoder parts designed to
convert input images to feature vectors and feature vectors to target
images, respectively [8]. In addition, GANs consist of two major com-
ponents: a generator, mostly a CED network, and a discriminator, a
classifier to differentiate the ground truth from the synthetic images/
data [8]. Different architectures based on these models were developed
and applied to medical images for various tasks, including image
segmentation and image-to-image translation [10]. U-Net [11] is one of the
most popular architectures built upon the CED structure by adding
skip connections for context capturing and creating a sym-
metric expanding path, which enables more efficient feature selection.
Upgrading networks with different modules, such as attention blocks/
components [12] for highlighting salient features in the input data, and
residual connections [13] to prevent gradient vanishing, are intended to
improve the overall performance of the networks. Conventional GAN
architectures have been upgraded in different ways, leading to condi-
tional GAN (cGAN) [14] and cycle-consistent GAN (Cycle-GAN) [15]
models, which consist of a CED in the generator and discriminator
components and task-specific loss functions. Cycle-GAN [15] is an un-
supervised model for image-to-image translation that does not
require paired (labeled) datasets. In the Cycle-GAN model, two
generator-discriminator pairs are jointly involved in the training
process, wherein images from two different domains are used as input
and output within a cycle-consistency scheme: the output of one
generator is used as input to the other and vice versa, with the loss
computed between the original input and the reconstructed round-trip
output acting as a regularization of the generator models [15].
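The cycle-consistency scheme can be sketched in a few lines of numpy; the toy 'generators' below are simple invertible affine maps standing in for the trained CNN generators of a real Cycle-GAN:

```python
import numpy as np

def g_ab(x):
    """Toy generator mapping domain A to domain B."""
    return 2.0 * x + 1.0

def g_ba(y):
    """Toy generator mapping domain B back to domain A (exact inverse here)."""
    return (y - 1.0) / 2.0

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))  # 'image' from domain A

# Round trip A -> B -> A; the cycle-consistency loss is the L1 distance
# between the input and its reconstruction, regularizing both generators.
cycle = g_ba(g_ab(x))
l_cyc = np.mean(np.abs(cycle - x))
```

In a trained Cycle-GAN this loss is added to the adversarial losses of both discriminators; no paired images are needed because consistency is enforced only on the round trip.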
Overall, deep learning-based algorithms have outperformed conventional
approaches in various applications [5]. AI-based approaches, especially
deep learning algorithms, do not require handcrafted feature extraction,
specific data preprocessing, or user intervention within the learning and
inference processes [5]. The major applications of deep learning ap-
proaches in SPECT and PET imaging are summarized in Fig. 1. Deep
learning methods face many challenges, including their data-hungry
nature, the high computational burden of the training process,
and their black-box nature (which hampers systematic analysis of their
operation/performance) [7]. To reach peak performance, these algo-
rithms require large, clean, and curated datasets for the
training process. However, data collection remains the main challenge
owing to patient privacy concerns and the complexity of ethical issues. Moreover,
task-specific deep learning algorithms (i.e. for a particular organ/body
region or radiotracer) tend to exhibit superior performance
compared to more general models, which are commonly more sensitive
to variability in image acquisition protocols and reconstruction settings.
Another challenge faced by the application of deep learning algorithms
in medical imaging is the high computational burden owing to the large
size of clinical data in terms of number of subjects and individual images
(large 3-dimensional images or sinograms) which might cause memory
or data management issues.
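To give a sense of scale, the memory footprint of dense PET data is easy to estimate; the dimensions below are hypothetical and chosen purely for illustration:

```python
import numpy as np

def array_gb(shape, dtype=np.float32):
    """Memory footprint, in gigabytes, of a dense array of the given shape."""
    return int(np.prod(shape)) * np.dtype(dtype).itemsize / 1e9

# Hypothetical sizes: a whole-body image volume and a 3D TOF sinogram
# (radial bins, angles, plane pairs, TOF bins), both stored as float32.
vol_gb = array_gb((440, 440, 644))
sino_gb = array_gb((357, 224, 1261, 13))
```

At roughly half a gigabyte per volume and several gigabytes per sinogram under these assumed dimensions, even a modest training cohort can exceed GPU memory, motivating patch-based training and careful data loading.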
Applications of deep learning in SPECT and PET imaging
Instrumentation and image acquisition/formation
Detector modules play a key role in the overall performance achieved
by PET scanners. An ideal PET detector should have good energy and
timing resolution and be capable of accurate event positioning. Energy
resolution is a metric that determines how accurately a detector can
identify the energy of incoming photons and, as a result, distinguish
scattered and random photons from true coincidences. These parameters
affect the scanner's sensitivity, spatial resolution, and signal-to-noise
ratio (true coincidences versus scatters or randoms). Despite significant
progress in PET instrumentation, there are a number of challenges that
still need to be addressed and where machine learning approaches can
offer alternative solutions to complex and multi-parametric problems.
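As a quick numeric sketch (with an illustrative, not measured, photopeak width), energy resolution is commonly quoted as the FWHM of the photopeak divided by its energy:

```python
import numpy as np

def energy_resolution(sigma_kev, peak_kev=511.0):
    """Fractional energy resolution: Gaussian photopeak FWHM over peak energy."""
    fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma_kev  # FWHM ~= 2.355 * sigma
    return fwhm / peak_kev

# An illustrative sigma of 22 keV at 511 keV gives roughly 10% resolution,
# in the range often quoted for lutetium-based scintillators.
res = energy_resolution(22.0)

# A corresponding 3-sigma energy window around the photopeak:
window = (511.0 - 3 * 22.0, 511.0 + 3 * 22.0)  # (445.0, 577.0) keV
```

Narrower windows reject more scattered photons but also discard true events whose measured energy falls in the tails, which is why good energy resolution directly improves the scatter-versus-sensitivity trade-off.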
Accurate localization of the interaction position inside the crystals
improves the overall spatial resolution of PET scanners. Since the
distribution of optical photons is stochastic, particularly near the edges of the
crystal, and owing to multiple Compton scattering and reflection, ac-
curate positioning of the interaction within the crystal is challenging. In
comparison with other positioning algorithms, such as Anger logic and
correlated signal enhancement, which rely on determination of the
centre of gravity, machine learning algorithms have led to better position
estimation, particularly at the crystal edges [16]. In this regard, Peng
et al. trained a CNN classifier that mapped the signals from each silicon
photomultiplier channel to the coordinates of the scintillation point in
a quasi-monolithic crystal [17]. Another study applied a multi-layer
perceptron to predict the 3D coordinates of the interaction position
inside a monolithic crystal and compared the performance of this
positioning algorithm with Anger logic for a preclinical PET scanner based on
NEMA NU4 2008 standards [18]. Fig. 2 depicts the adopted deep