144 H. Liu et al. / Information Sciences 468 (2018) 142–154
Fig. 1. Architecture of SRCNN and the proposed SR-DCNN for SR image.
ages, but also in medical image applications, such as ROI segmentation [14], lesion detection [44] and tomographic imaging
[35,40], where deeper networks have achieved great success [28]. Although these types of networks require a large amount
of data and a correspondingly large training time, the test stage is relatively short and good results can be obtained.
In this paper, a single-channel CNN improved from the SRCNN network [2] was applied to the SR reconstruction of grayscale
medical images. One major difference is that the proposed model adopts a deconvolution layer as its first layer instead of
the bicubic interpolation used in SRCNN, a choice inspired by the success of the deconvolution operation in image saliency
detection [42] and semantic segmentation [25]. The network is therefore termed SR Deconvolution CNN (SR-DCNN); it facilitates
SR reconstruction by directly learning the mapping between LR and HR images. The filter parameters of the deconvolution
layer are learned and updated over the training iterations. As evidenced by experiments on laboratory data, the four-layer
lightweight network constructed in this paper achieved better results than SRCNN. In our network, bicubic interpolation
was used as a preprocessing step to generate the network input, which provides an effective solution to the inconsistent
input-mapping problem that exists in SRCNN and sparse coding algorithms. Instead of the rectified linear unit (ReLU), the
parametric rectified linear unit (PReLU) [9] was introduced as the activation function in each layer. In this approach, the
coefficient of the negative part is learned, which avoids the “dead feature” problem caused by the zero slope of the
traditional ReLU for negative inputs.
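The difference between the two activation functions can be made concrete with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the slope value 0.25 is only an illustrative choice, since in PReLU this coefficient is a trainable parameter.

```python
def relu(x):
    """Traditional ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)


def prelu(x, a):
    """PReLU: identity for positive inputs, slope `a` for negative inputs."""
    return x if x > 0 else a * x


def prelu_grad(x, a):
    """Partial derivatives of PReLU with respect to the input x and the slope a."""
    d_input = 1.0 if x > 0 else a
    d_slope = 0.0 if x > 0 else x
    return d_input, d_slope


# Illustrative values only (a = 0.25 is an assumption, not taken from the paper).
relu_out = relu(-2.0)                # 0.0: no signal and no gradient pass through
prelu_out = prelu(-2.0, 0.25)        # -0.5: the negative part is scaled, not discarded
grads = prelu_grad(-2.0, 0.25)       # (0.25, -2.0): both gradients are non-zero
```

Because the gradient with respect to the input stays non-zero for negative inputs, a unit that receives negative pre-activations can still be updated during training.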
The remainder of this paper is organized as follows. Section 2 describes the architecture of the proposed SR-DCNN, including
the related forward and back propagation processes. In Section 3, the objective function is defined, and the performance
of the proposed method is evaluated and compared with state-of-the-art methods. In addition, the influence of the
hyperparameters on achieving an optimal tradeoff between network performance and computational efficiency is discussed.
Finally, the conclusions drawn from this research are summarized in Section 4.
2. Method
The architecture of the proposed SR-DCNN network is shown and compared with that of SRCNN in Fig. 1. The classical
SRCNN employs bicubic interpolation to enlarge the image, and its convolutional network consists of three layers, each of
which is followed by a ReLU function. In contrast, the proposed SR-DCNN contains one deconvolutional layer and multiple
convolutional layers, and each layer is followed by a PReLU function. The workflow begins by down-sampling an LR medical
image from the original HR image to serve as the input image. The network construction details are as follows.
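Before the details, the up-sampling role of a deconvolutional (transposed convolution) layer can be illustrated with a minimal one-dimensional sketch. This is an assumption-laden toy, not the network layer itself: the function name and the kernel values are hypothetical, the real layer operates on 2D images, and its kernel is learned rather than fixed.

```python
def transposed_conv1d(signal, kernel, stride):
    """1D transposed convolution: each input sample scatters a scaled copy
    of the kernel into the output, shifted by `stride` per sample."""
    out_len = (len(signal) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, value in enumerate(signal):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out


# Hypothetical example: with stride 2 and a triangular kernel, the layer
# reproduces linear interpolation between neighbouring samples.
upsampled = transposed_conv1d([1.0, 2.0, 3.0], [0.5, 1.0, 0.5], stride=2)
# upsampled == [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.5]
```

With this fixed kernel the operation behaves like a hand-crafted interpolator; replacing the kernel values with trainable parameters is what allows the up-sampling itself to be learned from data.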
2.1. Forward propagation
The purpose of the proposed method is to learn a mapping F to recover the HR image from an LR image Y such that F(Y)
is as similar as possible to the original HR image X. An example patch extraction diagram is shown in Fig. 2. The workflow
is as follows.
A single HR image is down-sampled via bicubic interpolation. Note that this is the only preprocessing step applied. The
result of this step is an LR image with a size of 1/d² of the original HR image. The original image is denoted as X, and the
down-sampled image is denoted as Y. The positional alignment between the LR image and the original HR image is obtained
via bicubic interpolation. The pixel value of the down-sampled image is calculated from the nearest 16 neighbor pixels in
the corresponding position of the original image.
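The 16-neighbor computation can be sketched with the standard cubic convolution kernel. This is a minimal illustration under stated assumptions: the kernel parameter a = -0.5 is a common default that the text does not specify, the function names are hypothetical, and any anti-aliasing applied by a particular bicubic implementation is ignored.

```python
def cubic_weight(x, a=-0.5):
    """Cubic convolution kernel; `a` = -0.5 is an assumed, commonly used value."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def bicubic_weights(fx, fy, a=-0.5):
    """4 x 4 weights for a sample at fractional offset (fx, fy), 0 <= fx, fy < 1,
    measured from the pixel just above and to the left of the sample position."""
    wx = [cubic_weight(fx - k, a) for k in (-1, 0, 1, 2)]
    wy = [cubic_weight(fy - k, a) for k in (-1, 0, 1, 2)]
    return [[wy_i * wx_j for wx_j in wx] for wy_i in wy]


# A sample located midway between pixel centers in both directions.
weights = bicubic_weights(0.5, 0.5)
num_neighbors = sum(len(row) for row in weights)   # 16 contributing pixels
total = sum(sum(row) for row in weights)           # weights sum to 1 (up to rounding)
```

Each output pixel is the weighted sum of the 4 × 4 neighborhood with these weights; because the weights sum to one, the overall intensity level of the image is preserved.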
Then, each LR medical image Y is divided into blocks of size n × n with a stride of s. Similarly, the corresponding
HR image X is divided into blocks of size m × m with a stride of 2s, and each HR block is used as the label of its LR block.
As a result, the LR and HR image blocks have a one-to-one correspondence.
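The block pairing can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the relation m = 2n is an assumption chosen so that the block counts match for a scale factor of 2, and the toy LR image is produced by simple subsampling as a stand-in for bicubic down-sampling.

```python
def extract_blocks(image, size, stride):
    """Return all size x size blocks of a 2D list, scanned with the given stride."""
    rows, cols = len(image), len(image[0])
    blocks = []
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            blocks.append([row[left:left + size] for row in image[top:top + size]])
    return blocks


def paired_blocks(lr, hr, n, s, scale=2):
    """Pair each n x n LR block (stride s) with an HR label block of size
    scale*n (assumed m = scale*n) extracted with stride scale*s."""
    lr_blocks = extract_blocks(lr, n, s)
    hr_blocks = extract_blocks(hr, scale * n, scale * s)
    if len(lr_blocks) != len(hr_blocks):
        raise ValueError("LR and HR block counts differ; check sizes and strides")
    return list(zip(lr_blocks, hr_blocks))


# Toy data: a 16 x 16 "HR" image and an 8 x 8 "LR" image obtained by keeping
# every second pixel (a stand-in for the bicubic down-sampling in the text).
hr = [[r * 16 + c for c in range(16)] for r in range(16)]
lr = [row[::2] for row in hr[::2]]
pairs = paired_blocks(lr, hr, n=4, s=2)
```

In this toy setting the 8 × 8 LR image yields 3 × 3 = 9 blocks of size 4 × 4, and the 16 × 16 HR image yields 9 label blocks of size 8 × 8, so each LR block has exactly one HR label covering the same image region.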