3. We perform experiments on a broad population with various
body types, including normal, overweight, and obese subjects. The
experimental results demonstrate the generality of our algorithms.
In addition, segmenting a new 2D abdominal image takes about 2 s,
which indicates potential for clinical application.
The rest of this paper is organized as follows. Section 2 gives a
brief review of related work. The proposed algorithm is described in
Section 3. Section 4 presents the data set used for evaluation, the
performance metrics, and the experiments. Finally, Section 5 gives the
discussion and conclusion.
2. Related work
In this section, we review related work from three aspects:
the types of images used, previous AAT segmentation algorithms,
and recently proposed algorithms for medical image processing.
Previous automatic AAT segmentation algorithms are mainly
based on computed tomography (CT) images [12] and MR images
[11,10,8,6,7]. However, CT exposes the patient to ionizing radiation,
which limits its clinical use to diagnosing acute illnesses. In our
work, we analyze MR images, which involve no ionizing radiation.
Most existing AAT segmentation algorithms based on MR images
use only raw or hand-crafted features, such as intensity, shape,
or location features, without exploring the intrinsic characteristics
of tissues. The first kind of algorithm is based on intensity features
and includes fuzzy c-means clustering, K-means clustering, and image
histograms, all of which depend strongly on image quality. Fuzzy
c-means [10] and K-means clustering algorithms [11] cluster all the
pixels into SAT, VAT, or NAT based on intensity similarity.
However, MR images suffer from a relative intensity scale,
inhomogeneous image intensities, and artifacts, which degrade the
accuracy of intensity-based algorithms. The second kind of algorithm
is based on shape and location features and includes graph cut
algorithms [6,7] and an active contour algorithm [8]. Graph cut
algorithms construct a graph from pixel intensity and location
information and find the internal boundary of SAT with min-cut
algorithms, which have difficulty segmenting thin, elongated
tissues because of the shrinkage bias. Active contour algorithms are
prone to being trapped by regions of large gradient magnitude. Because
the shapes of SAT and VAT vary among subjects, these situations are
common in our task. In our algorithm, we focus on high-level, more
abstract features instead of raw features for segmentation. The third
kind of algorithm [13] uses multi-way features, such as the combination
of intensity and spatial distance information, which has been
demonstrated to improve segmentation accuracy. Our algorithm also
considers multi-way features, including intensity, spatial features,
and contextual features. Our spatial features are represented by polar
coordinates, which distinguish SAT pixels from VAT pixels better than
the Euclidean distance used in [13] and are therefore more suitable
for AAT segmentation.
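As a rough illustration of why polar coordinates can describe ring-like tissue well, the sketch below computes a radial coordinate and a polar angle for a pixel. The function name `polar_features` and the choice of a fixed center point as the pole are assumptions made for this example only; the text above does not specify how the pole is chosen.

```python
import math

def polar_features(row, col, center_row, center_col):
    """Return (radius, angle) of a pixel relative to an assumed pole.

    The pole (center_row, center_col) is a hypothetical choice made for
    this illustration, e.g. the center of the abdomen.
    """
    dy = row - center_row
    dx = col - center_col
    radius = math.hypot(dx, dy)   # radial coordinate
    angle = math.atan2(dy, dx)    # polar angle in (-pi, pi]
    return radius, angle

# Sample 16 points on a circle of radius 50 around the assumed pole.
center = (100.0, 100.0)
circle = [
    (center[0] + 50.0 * math.sin(k * math.pi / 8),
     center[1] + 50.0 * math.cos(k * math.pi / 8))
    for k in range(16)
]
features = [polar_features(r, c, *center) for r, c in circle]
radii = [f[0] for f in features]
angles = [f[1] for f in features]
```

Every sampled point has the same radial coordinate and a different polar angle, so a circular boundary around the pole becomes a constant-radius curve in the (angle, radius) plane.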
Recently, pixel-wise classification algorithms based on deep
learning [14–19] have been successfully applied to medical image
segmentation. Usually, patches centered at each pixel are extracted
as inputs to deep neural architectures. The main problem with these
algorithms is the lack of spatial consistency constraints. Two kinds of
approaches have been proposed to address this issue: multi-modal
inputs [16,17] and complex post-processing [18,19]. Multi-modal
images can provide complementary information for classification,
and [16,17] have demonstrated the effectiveness of multi-modal
inputs. However, in some cases, multi-modal images are hard to
obtain. The other approach is to use complex post-processing to
explicitly delineate the structures of various tissues.
Conditional random fields (CRF) and structured support vector
machines (SSVM) [18,19] are two commonly used methods, but
inference for both is complex. Furthermore, accurately identifying
tissue boundaries using shape priors is also popular. Shape priors
based on sparse representations [20–22] have been successfully used
in lung segmentation. However, this kind of representation is not
suitable in our case, because VAT has no consistent shape.
Therefore, in our algorithm, we focus on further exploring the
distribution characteristics of tissues in the abdomen using a deep
neural network, and propose a simple binary classifier with spatial
consistency for deciding the internal boundary of SAT.
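The patch-based input mentioned above can be sketched as follows. This is only a minimal illustration: the helper name `extract_patch`, the zero padding at image borders, and the patch size are assumptions for the example, not details taken from the cited works.

```python
def extract_patch(image, row, col, size):
    """Return a size x size patch centered at (row, col).

    `image` is a list of equally long rows of intensities. Positions that
    fall outside the image are filled with 0 (an assumed padding rule).
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so that the patch has a center")
    half = size // 2
    height, width = len(image), len(image[0])
    patch = []
    for r in range(row - half, row + half + 1):
        patch_row = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < height and 0 <= c < width
            patch_row.append(image[r][c] if inside else 0)
        patch.append(patch_row)
    return patch

# Toy 4 x 4 image whose intensities are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
corner_patch = extract_patch(image, 0, 0, 3)   # touches the border
inner_patch = extract_patch(image, 1, 1, 3)    # fully inside the image
```

Each pixel is classified from its own patch, independently of its neighbors, which is why such methods need an additional mechanism to enforce spatial consistency.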
3. Our approach
In this section, we present our approach to AAT segmentation,
which is formulated as a two-stage coarse-to-fine algorithm. To
guarantee segmentation accuracy on a large-scale population, we
not only propose a novel pixel-wise algorithm based on a deep neural
network that learns discriminative and intrinsic representations of
different tissues, but also take into account the spatial distributions
of tissues for fine segmentation. In the first stage, a coarse
segmentation result is obtained by a softmax classifier based on
high-level features learned by a new deep neural network. In the
second stage, a more accurate segmentation result is obtained based on
the internal boundaries of SAT, which are decided by combining the
coarse segmentation with spatial information constraints.
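To make the first stage more concrete, the following is a minimal sketch of a softmax classifier applied to one pixel's feature vector. The feature values, weights, and biases are made-up numbers, and the names `softmax` and `classify` are hypothetical; in the approach described here the features would come from the learned deep network, which is not reproduced in this sketch.

```python
import math

CLASSES = ("SAT", "VAT", "NAT")

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)                          # subtract max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, weights, biases):
    """Return (label, probabilities) for one pixel's feature vector.

    weights[q] and biases[q] are the parameters connected to class q.
    """
    scores = [
        sum(w * x for w, x in zip(w_q, features)) + b_q
        for w_q, b_q in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs

# Illustrative numbers only: two features and three classes.
features = [1.0, 0.0]
weights = [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
biases = [0.0, 0.0, 0.0]
label, probs = classify(features, weights, biases)
```

Here the class scores are 2, 0, and -1, so the first class receives the largest probability and the pixel is labeled SAT.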
To express this more clearly, we introduce the notation that will be
used throughout this paper. $W$ and $b$ represent the weights and biases
of the deep neural network. Specifically, $W_{ij}$ and $b_{ij}$ denote the
parameters in the $j$-th patch pathway that connect to the $i$-th hidden
layer, $W_i$ and $b_i$ denote the weights and biases connected to the
$i$-th hidden layer, $W_{\mathrm{class}}$ and $b_{\mathrm{class}}$ are the
parameters that connect to the classifier layer, and the parameters
connected to the $q$-th class are denoted as $W_{\mathrm{class},q}$ and
$b_{\mathrm{class},q}$, respectively. The features used for input include
two-scale patches ($P_1$ and $P_2$), intensity ($I$), and polar
coordinates (radial coordinate $\gamma$ and polar angle $\theta$). A
superscript $\ell$ on any of these features indicates the feature of the
$\ell$-th pixel; for example, $P_1^{\ell}$ represents the first patch
extracted from the $\ell$-th pixel. The classification of the $\ell$-th
pixel is represented by [symbol missing in the source]. We use
$\theta_k$ to denote the $k$-th section of angles. The inner boundary
and binary classifier of $\theta_k$ are represented as $E_k$ and
$\eta_k$. $G$ denotes the nearest neighbor graph of polar angles.
3.1. Coarse segmentation
In the AAT segmentation task, we are interested in the volumes of SAT
and VAT in abdominal images, which can be expressed by the number of
pixels that belong to SAT and VAT. Therefore, we formulate the AAT
segmentation task as a pixel-wise classification problem, in which all
the pixels of abdominal images are divided into three classes: SAT, VAT,
and non-AT (NAT).
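The link between pixel counts and tissue volume can be sketched as below. This is an illustration under stated assumptions: the function name `tissue_volumes`, the pixel area, and the slice thickness are hypothetical values chosen for the example and are not taken from the data set used in this paper.

```python
from collections import Counter

def tissue_volumes(label_map, pixel_area_mm2, slice_thickness_mm):
    """Estimate per-class volume (mm^3) for one labeled 2D slice.

    label_map is a list of rows, each a list of "SAT", "VAT", or "NAT".
    Volume per class = pixel count * pixel area * slice thickness.
    """
    counts = Counter(label for row in label_map for label in row)
    voxel_volume = pixel_area_mm2 * slice_thickness_mm
    return {
        name: counts.get(name, 0) * voxel_volume
        for name in ("SAT", "VAT", "NAT")
    }

# Toy 2 x 3 label map with assumed acquisition parameters.
label_map = [
    ["NAT", "SAT", "SAT"],
    ["NAT", "VAT", "SAT"],
]
volumes = tissue_volumes(label_map, pixel_area_mm2=2.0, slice_thickness_mm=5.0)
```

With three SAT pixels, one VAT pixel, and two NAT pixels, each covering an assumed 10 mm^3, the estimated volumes are 30, 10, and 20 mm^3 respectively.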
Feature design is a key step for classification tasks. T1-weighted
abdominal MR images have three characteristics: (i) adipose
tissues are usually brighter than other tissues; (ii) the location
distributions of SAT, VAT, and NAT are quite different; (iii) SAT has an
ellipse-like shape. In the following subsections, we present how to
extract more discriminative and intrinsic features from the
above-mentioned characteristics.
3.1.1. Feature selection
Feature I – Polar Map. Location and shape features are essential
for distinguishing different tissues in the abdomen. To make these raw
location and shape features more discriminative, we apply a polar
transform and use polar coordinates to describe the location and
shape features in our algorithm. It is well known that a circle centered
at the pole maps to a straight line under the polar transformation.
Both the external and internal boundaries of SAT are ellipse-like, so if
we apply a polar transform to the original abdominal image, the SAT
boundaries become much easier to recognize. Fig. 3 shows the polar
F. Jiang et al.
Neurocomputing xx (xxxx) xxxx–xxxx