GPU accelerated simplified harmonic spherical
approximation equations for three-dimensional
optical imaging
Shenghan Ren (任胜寒), Xueli Chen (陈雪利), Xu Cao (曹 旭),
Shouping Zhu (朱守平), and Jimin Liang (梁继民)*
Engineering Research Center of Molecular and Neuro Imaging of Ministry of Education & School
of Life Science and Technology, Xidian University, Xi’an 710071, China
*Corresponding author: jimleung@mail.xidian.edu.cn
Received March 3, 2016; accepted April 22, 2016; posted online May 1, 2016
Simplified spherical harmonics approximation (SPN) equations are widely used in modeling light propagation in
biological tissues. However, with the increase of order N , its computational burden will severely aggravate. We
propose a graphics processing unit (GPU) accelerated framework for SPN equations. Compared with the conven-
tional central processing unit implementation, an increased performance of the GPU framework is obtained
with an increase in mesh size, with the best speed-up ratio of 25 among the studied cases. The
influence of thread distribution on the performance of the GPU framework is also investigated.
OCIS codes: 170.3660, 170.7050, 200.4960.
doi: 10.3788/COL201614.071701.
Three-dimensional (3D) optical imaging technology has
been widely used in the field of biomedical imaging because
of its significant advantages of noninvasive detection, high
temporal resolution, and low cost
[1–3]
. The major applica-
tion of 3D optical imaging is to study the propagation of
light in biological tissues. This can be accurately described
by the radiative transport equation (RTE) and the Monte
Carlo (MC) method
[4–6]
. However, the RTE is difficult to
solve analytically, especially for complex structures. The
MC method is time consuming, as it requires a large num-
ber of photons for reliable simulation results
[7]
. To facilitate
the development of 3D optical imaging, the diffusion equa-
tion (DE)
[8,9]
, the first order of the simplified spherical har-
monics approximation (SPN) of the RTE, has been widely
used in modeling the light propagation in biological tissues
because of its high computational efficiency. With the
assumption of light propagating diffusively, the DE is
not accurate when the observed regions are high absorption
or low scattering tissues
[10]
. In the past decades, the
SPN equations, the high order approximations for the
RTE, have been studied by many groups
[11–13]
. Compared
to DE, these equations provide much more accurate
and reliable results for tissues with a larger variation of op-
tical parameters, especially for high absorption and low
scattering regions. The accuracy of the SPN equations
for describin g light propagation increases with its order
N. However, with the increase of the order N, the computa-
tional burden would severely aggravate, which limits the
applications of high order SPN equations in 3D optical
imaging.
To reduce the computational burden of SPN equations,
Li et al. proposed an extended finite element method
(FEM) for acceleration
[14]
.Luet al. proposed a parallel
adaptive mesh evolution strategy to improve both the
modeling precision and simulation speed
[12]
. However, they
were implemented on a central processing unit (CPU)
based on moderately parallel systems. Compared with
the parallel system based on multi-core CPUs or multi-
core clusters, the graphics processing unit’s (GPU) based
acceleration technique has a much higher raw processing
power as well as a relative ly lower cost. With the rapid
development of the GPU hardware, the GPU-based
acceleration technique has been applied to accelerate
the MC simulations
[15]
, making it a powerful tool for the
simulation of light propagation in tissues. Taking the
advantages of the GPU technique, a GPU-accelerated fi-
nite element solver for the DE was implemented
[16]
, which
was used for calculating the forward model of diffusion op-
tical tomography. However, because of the inherent char-
acteristics of the DE, the acceleration performance was
not significant, and the applicability was limited, espe-
cially when the tissues are of high absorption or low
scattering.
In this Letter, to facilitate the application of the SPN
equations to be more efficient, a GPU-accelerated frame-
work is proposed for the SPN equations (referred hereafter
as the GPU-based method). The accelerated frame work is
implemented by using a compute unified device architec-
ture (CUDA)
[17]
. In this framework, the SPN equations are
solved with the FEM whose matrices are stored in a com-
pressed sparse row (CSR) format that makes good use of
the computing potential of the GPU hardware. The solu-
tion of the SPN equations is converted into the problem of
a sparse linear system, which is then solved by the parallel
conjugate gradient (CG) algorithm implemen ted on a
GPU kernel.
The accuracy of the GPU-based method was first evalu-
ated by comparing it with the MC simulation. Then, the
COL 14(7), 071701(2016) CHINESE OPTICS LETTERS July 10, 2016
1671-7694/2016/071701(5) 071701-1 © 2016 Chinese Optics Letters