Research on Implementation and Performance Analysis of Linear Least Squares Based on GPU

Cheng Kefei 1, Sun Yanwei 1,2

1 College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, chengkefei@sina.com
*2 School of Computer Science, Hubei University of Education, sunyanwei_wuhan@sina.com
Abstract
As a kind of mathematical optimization technique, the method of least squares is widely used in scientific experiments and engineering practice. In this paper, an algorithm for solving large-scale dense linear least squares equations is implemented on the GPU, and it is tested and analyzed. The algorithm places no limit on the dimension of the matrix equation, and a series of parallel acceleration techniques suited to the GPU are adopted. The results show that the algorithm makes full use of the hardware characteristics of the GPU and effectively reduces the time required to solve large-scale dense linear least squares equations.
Keywords: Method of Least Squares, GPU, Large-Scale Dense Linear Equations, Parallel Acceleration
1. Introduction
Least squares is a rather old topic. In 1795, Gauss introduced it in his work on predicting the orbit of a celestial body. Later, the method of least squares became the basis of estimation theory [1]. In recent years, with the spread and development of electronic computers, it has not only been given new mathematical content but has also been applied in many research areas. For example, in neural networks, chemistry, physics, finance, economics, mechanical systems, electrical and electronic engineering, and medical imaging, the method of least squares is often used to describe the difference between a model and the actual observation data in order to find a set of parameters that minimizes the error.
In recent years, with the great improvement in the capability of graphics processing units, general-purpose computation on the GPU has gradually become a new research hotspot. Yang Mei used the GPU to solve large-scale dense linear equations, increasing the computing speed nearly 7-fold compared to the CPU [2]. Li Yangbo applied it to the molecular dynamics package LAMMPS, computing more than 20 times faster than the CPU-based version [3]. Chang Jian implemented the K-Means algorithm on the GPU, achieving a speedup of about 40x [4]. Zhang Nan used it for homomorphic filtering of images, achieving a speedup of 49x over the CPU [5]. Yan Binbin applied it to a vision system, raising the speedup to a factor of 3000 [6].
In this study, an algorithm for solving large-scale dense linear least squares equations is implemented on the GPU, and its performance is analyzed. The algorithm adopts a series of techniques aimed at improving the speedup ratio, and when comparing against a CPU implementation at the same scale, the time spent transferring data is counted as part of the GPU computation time, which makes the comparison of GPU and CPU computational performance more meaningful.
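To make this timing convention concrete, the following minimal CUDA sketch (not the authors' code; the kernel scaleKernel, the problem size, and the data are placeholders) shows how host-to-device and device-to-host transfer time can be folded into the measured GPU time using CUDA events.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Placeholder kernel standing in for the real solver work.
__global__ void scaleKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = 2.0f * in[i];
}

int main()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float *h_in  = (float *)malloc(bytes);
    float *h_out = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i)
        h_in[i] = 1.0f;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // The timed region deliberately includes both transfers, so the
    // reported GPU time is comparable with a pure-CPU run.
    cudaEventRecord(start);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    scaleKernel<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("GPU time including transfers: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_in);
    cudaFree(d_out);
    free(h_in);
    free(h_out);
    return 0;
}

Placing cudaEventRecord(start) before the first cudaMemcpy and cudaEventRecord(stop) after the final copy back is what makes the measurement comparable with an end-to-end CPU run.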
2. The linear least squares method
In a scientific experiment or a statistical study, several groups of measured data are often needed: $\{(a_{i1}, a_{i2}, \ldots, a_{i(n-1)}, b_i) \mid i = 1, 2, \ldots, s\}$, in which $a_{ij}$ is the $i$-th observation of the variable $u_j$ and $b_i$ is the $i$-th observation of the variable $v$. Thus an approximate formula can be established:

$$v = k_1 u_1 + k_2 u_2 + \cdots + k_{n-1} u_{n-1} + k_n$$
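For reference, the fit above can be written in the standard matrix form (a sketch added here for clarity; the symbols $A$, $b$, and $k$ are not introduced in the original text): stack the $s$ observations into a matrix $A$ whose last column of ones absorbs the constant term $k_n$, and minimize the squared residual norm.

$$
A =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1(n-1)} & 1 \\
\vdots & \vdots &        & \vdots     & \vdots \\
a_{s1} & a_{s2} & \cdots & a_{s(n-1)} & 1
\end{pmatrix},
\qquad
b = \begin{pmatrix} b_1 \\ \vdots \\ b_s \end{pmatrix},
\qquad
k = \begin{pmatrix} k_1 \\ \vdots \\ k_n \end{pmatrix}
$$

$$
\min_{k \in \mathbb{R}^n} \lVert A k - b \rVert_2^2
\quad\Longrightarrow\quad
A^{\mathsf{T}} A \, k = A^{\mathsf{T}} b \quad \text{(the normal equations)}
$$

Solving this system for large, dense $A$ is the computation addressed by the GPU implementation described in this paper.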