novel algorithm can be found in Section 3. An extension of
designing projection matrix to the case with sparse
representation error is shown in Section 4. Detailed dis-
cussions about how this cost function is developed and
how this problem is efficiently solved can be also found in
Section 4.InSection 5, simulations are given to illustrate
the effectiveness and efficiency of the methods shown in
this paper. The conclusions are given in Section 6 to end
this paper.
2. Preliminaries and problem formulation
Following [20],wedefine the maximal mutual coherence
of the matrix D as
μ
mx
D
ðÞ
9 max
1 r i a j r L
jDð: ; iÞ
T
Dð: ; jÞj
J Dð: ; iÞJ
2
J Dð: ; jÞJ
2
: ð6Þ
Denote
S
c
9diagðGð1; 1Þ
1=2
⋯Gðk; kÞ
1=2
⋯GðL; LÞ
1=2
Þ
Then, the Gram matrix of
D ¼ DS
c
,denotedasG,isnor-
malized such that
Gðk; kÞ¼1; 8 k. Obviously, μ
mx
ðDÞ¼
max
i a j
jGði; jÞj. The lower and upper bounds of μ
mx
ðDÞ
are [23]
μ r μ
mx
ðDÞr 1
with
μ 9
ffiffiffiffiffiffiffiffiffiffiffiffiffi
L M
MðL 1Þ
q
. The av erag e mutual coherence used in
[20] is given by
μ
av
DðÞ9
P
8ði;jÞ A H
av
jGði; jÞj
N
av
ð7Þ
where N
av
is the number of entries in the index set
H
av
9fði; jÞ: jGði; jÞjZ ζ&iajg with 0r ζ o 1thatwecan
choose as desired. Both μ
av
and μ
mx
are used as the measures
of projection matrix design in this paper.
Frames are an important concept in signal analysis. We
refer the reader to [29,30] for the detailed discussions on
frames. Here, we only introduce two important classes of
frames: tight frames and ETFs, which are necessary for our
later analysis. The sequence of column vectors Dð: ; jÞ of
matrix D is a frame for the Hilbert space R
N
if there exist
two constants 0o α r β o þ1 such that
αJ v J
2
r J D
T
vJ
2
r β J v J
2
ð8Þ
for all v A R
N1
. Such a frame is called α-tight if α ¼ β in (8).
A normalized frame fDð: ; jÞg ði:e:; J Dð: ; jÞJ
2
¼ 1; 8 jÞ is
said to be equiangular if
jDð: ; iÞ
T
Dð: ; jÞj ¼ c; 8iaj
where c is a positive constant. As shown in [23], a matrix D
with J Dð: ; jÞJ
2
¼ 1; 8 j achieves μ
mx
ðDÞ¼μ if and only if
fDð: ; jÞg is ETF, moreover, μ
mx
ðDÞ¼μ can only hold if
Lr Mð M þ 1Þ=2. It is clear that the orthogonal bases form a
special class of ETFs. An ETF has a very nice average mutual
coherence behavior and has been used in optimal dic-
tionary design [24]. In addition, such an idea was also
extended to the optimal projection matrix design [25,26].
The projection matrix design problem based on the ETF
structure can be formulated as [25,26]
7
min
G
t
;Φ
J G
t
Ψ
T
Φ
T
ΦΨ J
2
F
G
t
A H
ζ
H
ζ
¼fG
t
jG
t
¼ G
T
t
; diagðG
t
Þ¼1; max
i a j
jG
t
ði; jÞj r ζgð9Þ
where ζ is a parameter to control the maximal non-
diagonal value in G
t
, diag ðG
t
Þ denotes the diagonal entries
of G
t
and 1 represents a vector of proper dimension with
its elements all equal to 1. It can be found that (9) is a
highly nonconvex problem. The alternating method has
been widely used to address such problems [23,24,26]. The
basic idea is to update Φ and G
t
alternatively. Firstly, the
projection matrix Φ is fixed and G
t
is updated. Secondly, G
t
is fixed and the projection matrix Φ is updated. This pro-
ceeding will be repeated until a stop criterion is reached.
At the first stage, we directly project the Gram matrix of
ΦΨ to the set H
ζ
. The key problem is the second step.
Abolghasemi et al. [26] proposed a gradient descent
method to address the problem in this step. For con-
venience, the procedures of solving (9) in [26] is denoted
as Algorithm 1 and shown as follows.
Algorithm 1. AFS.
Initialization:
Sparsify dictionary matrix Ψ, initialize Φ
0
to a random matrix, and
the number of iterations: Iter
outer
.
Output:
Projection matrix Φ.
Iteration:
1: Φ’Φ
0
2: for l ¼1 to Iter
outer
do
3: ðG’Ψ
T
Φ
T
ΦΨÞ
4: Update G
t
:
5: diagðG
t
Þ’1
6: for all iaj do
7: G
t
ði; jÞ’
Gði; jÞ if jGði; jÞjr ζ
ζ signðGði; jÞÞ otherwise
(
where sign ðÞ is a sign function.
8: end for
9: Update Φ:
10: Φ’
~
Φ ¼ arg min
Φ
J G
t
Ψ
T
Φ
T
ΦΨ J
2
F
ð10Þ
11: end for
Note that the gradient descent algorithm can only
reach a local minimum for (10). Moreover, as pointed out
by the authors of [26], the accuracy of the solution for (10)
will affect the final performance in terms of signal recov-
ery accuracy. Hence, in order to reach a good solution, a
large number of iterations is needed for the gradient
descent algorithm while the step size must be chosen
carefully. Meanwhile, the complexity of the algorithm is
increased. According to these observations, a new algo-
rithm is proposed in this paper which can yield a better
solution for (10) with a much lower complexity when the
dimension of the problem is not very large. In order to
carry out the proposed algorithm, some manipulations
should be taken to cast (10) to a processable equation. The
related manipulations are shown as follows.
7
In the case of G
t
¼ I, this problem is equivalent to (5).
T. Hong et al. / Signal Processing 125 (2016) 9–20 11