2780 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 7, NO. 7, JULY 2008
Fig. 1. Block diagram of the MMSE-DFE scheme.
matrix X. Now, alternatively, the nulling and interference
matrices can be found via the QR-decomposition [3]
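$$\begin{bmatrix} H \\ \sqrt{\eta}\, I \end{bmatrix} = Q_1 R_1, \quad Q_1 = \begin{bmatrix} Q_{1u} \\ Q_{1d} \end{bmatrix}, \quad R_1 = \Lambda_1 B_1, \tag{5}$$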
where $Q_1$ has orthonormal columns, $Q_{1u}$ is $N_R \times N_T$,
$Q_{1d}$ is $N_T \times N_T$, $R_1$ is $N_T \times N_T$ and upper
triangular, $\Lambda_1 = \mathrm{diag}(R_1)$, and $B_1$ is upper
triangular with unit diagonal. (Note that $Q_{1u}$ and $Q_{1d}$ are
not unitary.) Then, the nulling and interference matrices satisfy
$$W^H = \Lambda_1^{-1} Q_{1u}^H \quad \text{and} \quad B = B_1. \tag{6}$$
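As an illustration, the construction in (5) and (6) can be sketched numerically. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `mmse_dfe_matrices` is hypothetical, and the phase normalization step is added here because a numerical QR routine does not guarantee a real positive diagonal for $R_1$.

```python
import numpy as np

def mmse_dfe_matrices(H, eta):
    """Return (W_H, B, lam) from the QR decomposition of [H; sqrt(eta) I].

    H is N_R x N_T and eta > 0. lam holds the diagonal of Lambda_1.
    """
    n_r, n_t = H.shape
    H_aug = np.vstack([H, np.sqrt(eta) * np.eye(n_t)])
    Q1, R1 = np.linalg.qr(H_aug)  # reduced QR: Q1 is (N_R + N_T) x N_T

    # Assumption of this sketch: rotate phases so that diag(R1) is real
    # and positive. Q1 @ R1 is unchanged by this rescaling.
    diag_r = np.diag(R1)
    phase = diag_r / np.abs(diag_r)
    Q1 = Q1 * phase                     # scale column j by phase[j]
    R1 = np.conj(phase)[:, None] * R1   # scale row i by conj(phase[i])

    lam = np.real(np.diag(R1))          # Lambda_1 = diag(R_1)
    B = R1 / lam[:, None]               # B_1 = Lambda_1^{-1} R_1, unit diagonal
    Q1u = Q1[:n_r, :]                   # upper N_R x N_T block of Q_1
    W_H = Q1u.conj().T / lam[:, None]   # W^H = Lambda_1^{-1} Q_1u^H
    return W_H, B, lam
```

Since $\eta > 0$, the augmented matrix has full column rank, so the diagonal of $R_1$ is nonzero and the divisions above are well defined.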
The symbols are detected from $\hat{x}_{N_T}$ to $\hat{x}_1$ as follows:
for $i = N_T : -1 : 1$
    $\hat{x}_i = C\Big( [W^H y]_i - \sum_{j=i+1}^{N_T} [B]_{i,j}\, \hat{x}_j \Big)$
end
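The detection loop above can be sketched in Python as follows. This is a minimal sketch under assumptions: the names `dfe_detect` and `slice_qpsk` are hypothetical, and a unit-energy QPSK constellation is assumed for the slicer $C$ purely for illustration, since any nearest-point mapping could be substituted.

```python
import numpy as np

def slice_qpsk(z):
    """Nearest point of an assumed unit-energy QPSK constellation."""
    re = 1.0 if z.real >= 0 else -1.0
    im = 1.0 if z.imag >= 0 else -1.0
    return (re + 1j * im) / np.sqrt(2)

def dfe_detect(z, B, slicer=slice_qpsk):
    """Successive detection from the last symbol down to the first.

    z is the filtered vector W^H y; B is upper triangular with unit diagonal.
    """
    n_t = len(z)
    x_hat = np.zeros(n_t, dtype=complex)
    for i in range(n_t - 1, -1, -1):
        # Feedback term: sum over j > i of [B]_{i,j} * x_hat_j
        # (an empty sum, i.e. zero, for the last symbol).
        interference = B[i, i + 1:] @ x_hat[i + 1:]
        x_hat[i] = slicer(z[i] - interference)
    return x_hat
```

In a noise-free toy check with `z = B @ x`, the loop returns `x` exactly; with noise, an early decision error feeds into all later decisions through the feedback term.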
where $C$ denotes the mapping to the nearest signal point in
the constellation. Ignoring the effect of error propagation, the
MMSE-DFE scheme produces decoupled subchannels of the
form $y_i = r_i x_i + u_i$, where $r_i$ is the $i$-th diagonal element of
$\Lambda_1$. In [6], it was shown that
$$\eta (1 + \rho_i) = r_i^2, \tag{7}$$
where $\rho_i$ is the SINR of the $i$-th subchannel. Thus, the capacity
of the scheme can be written as
$$\sum_{i=1}^{N_T} \log(1 + \rho_i) = \sum_{i=1}^{N_T} \log \frac{r_i^2}{\eta} = \log\det\Big( I + \frac{1}{\eta} H^H H \Big). \tag{8}$$
This gives another proof that the MMSE-DFE receiver is
information lossless [16], [6].
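The identity in (8) follows from $R_1^H R_1 = H^H H + \eta I$ and can be checked numerically. The sketch below is only an illustration: the channel dimensions, the value of $\eta$, and the random seed are arbitrary assumptions, and natural logarithms are used on both sides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_r, n_t, eta = 4, 3, 0.5  # illustrative values, not taken from the paper
H = (rng.standard_normal((n_r, n_t))
     + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)

# R_1 from the QR decomposition of the augmented matrix in (5).
R1 = np.linalg.qr(np.vstack([H, np.sqrt(eta) * np.eye(n_t)]), mode="r")
r_sq = np.abs(np.diag(R1)) ** 2

# Per-subchannel SINR from (7): eta * (1 + rho_i) = r_i^2.
rho = r_sq / eta - 1.0

# Left-hand and right-hand sides of (8).
lhs = np.sum(np.log(1.0 + rho))
_, rhs = np.linalg.slogdet(np.eye(n_t) + H.conj().T @ H / eta)

print(lhs, rhs)  # the two values should agree up to rounding
```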
B. MMSE-based DPC
One major problem with DFEs is error propagation. If CSI
is known at the transmitter, interference between subchannels
can be cancelled completely before transmission via DPC.
Here, a general view of MMSE-based DPC via successive
interference pre-subtraction is developed. Consider once again
the $N_T \times N_R$ point-to-point channel $y = Hx + u$ from Section
II-A. However, it will not be required that $E[uu^H] = N_0 I$
but only that $E[|u_i|^2] = N_0$ for each $i$. Assume that there
is no collaboration between the receive antennas. Writing
$h_{ij} = [H]_{i,j}$, the $i$-th subchannel is
$$y_i = \Big( \sum_{j<i} h_{ij} x_j \Big) + h_{ii} x_i + \Big( \sum_{j>i} h_{ij} x_j \Big) + u_i. \tag{9}$$
Fig. 2. Block diagram of the MMSE-DPC scheme using THP.
The hope is to treat $\big( \sum_{j<i} h_{ij} x_j \big)$ as interference terms to
be cancelled at the transmitter. If these interference terms are
cancelled perfectly, then a single input single output (SISO)
MMSE receiver that sees $\big( \sum_{j>i} h_{ij} x_j \big) + u_i$ as noise terms
can be used on each subchannel. The corresponding MMSE
coefficient for the $i$-th subchannel is
$$d_i = \frac{h_{ii}^*}{\eta + \sum_{j \ge i} |h_{ij}|^2}, \tag{10}$$
where $x^*$ is the conjugate of a complex number $x$. Denoting
$D_d = \mathrm{diag}(d_1, \ldots, d_{N_R})$, the equivalent channel is now
$D_d H$. Thus, the interference terms can be represented by the
lower triangular unit-diagonal matrix $B = L(D_d H)$, called
the interference matrix. Meanwhile, the SINR of the $i$-th
subchannel is given by
$$\rho_i = \frac{|h_{ii}|^2}{\eta + \sum_{j>i} |h_{ij}|^2}. \tag{11}$$
A simple and useful relation between (10) and (11) can be
noted at this point. Let $\Sigma_0 = \eta + \sum_{j>i} |h_{ij}|^2$. Then,
$\rho_i = |h_{ii}|^2 / \Sigma_0$ and $d_i = h_{ii}^* / (\Sigma_0 + |h_{ii}|^2)$.
Eliminating $\Sigma_0$ gives
$$d_i = \frac{\rho_i}{h_{ii} (1 + \rho_i)}. \tag{12}$$
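The relation (12) can be verified numerically from (10) and (11). The following minimal sketch assumes a square channel matrix so that every $h_{ii}$ exists; the function name `mmse_dpc_coefficients` is hypothetical and introduced only for this illustration.

```python
import numpy as np

def mmse_dpc_coefficients(H, eta):
    """Per-subchannel MMSE coefficient d_i from (10) and SINR rho_i from (11).

    Assumes a square H for simplicity.
    """
    n = H.shape[0]
    d = np.zeros(n, dtype=complex)
    rho = np.zeros(n)
    for i in range(n):
        h_ii = H[i, i]
        # Sigma_0 = eta + sum over j > i of |h_ij|^2 (residual noise power).
        sigma0 = eta + np.sum(np.abs(H[i, i + 1:]) ** 2)
        rho[i] = np.abs(h_ii) ** 2 / sigma0
        # Denominator of (10): eta + sum over j >= i of |h_ij|^2.
        d[i] = np.conj(h_ii) / (sigma0 + np.abs(h_ii) ** 2)
    return d, rho

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
d, rho = mmse_dpc_coefficients(H, eta=0.2)

# Relation (12): d_i = rho_i / (h_ii * (1 + rho_i)).
print(np.allclose(d, rho / (np.diag(H) * (1.0 + rho))))
```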
One low-complexity suboptimal implementation of DPC is
Tomlinson-Harashima precoding (THP) [14]. The block diagram
of an MMSE-based DPC scheme using THP is shown
in Figure 2. The vector $\tilde{x}$ to be transmitted can be evaluated
from $\tilde{x}_1$ to $\tilde{x}_{N_T}$ using
$\tilde{x}_1 = x_1$
for $i = 2 : 1 : N_T$
    $\tilde{x}_i = \mathrm{mod}\Big( x_i - \sum_{j=1}^{i-1} [B]_{i,j}\, \tilde{x}_j \Big)$
end
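The pre-subtraction loop above can be sketched in Python as follows. This is a sketch under assumptions rather than the authors' implementation: the text does not define the modulo operator here, so a symmetric modulo with period `tau` applied separately to the real and imaginary parts is assumed (for square $M$-QAM with odd-integer levels, `tau` would be $2\sqrt{M}$), and the names `mod_tau` and `thp_precode` are hypothetical.

```python
import numpy as np

def mod_tau(v, tau):
    """Assumed symmetric modulo: maps real and imaginary parts into [-tau/2, tau/2)."""
    re = v.real - tau * np.floor((v.real + tau / 2) / tau)
    im = v.imag - tau * np.floor((v.imag + tau / 2) / tau)
    return re + 1j * im

def thp_precode(x, B, tau):
    """Successive interference pre-subtraction with a modulo operation.

    x holds the information symbols; B is lower triangular with unit diagonal.
    """
    n_t = len(x)
    x_tilde = np.zeros(n_t, dtype=complex)
    x_tilde[0] = x[0]  # the first symbol sees no interference
    for i in range(1, n_t):
        # Interference from already precoded symbols: sum over j < i.
        interference = B[i, :i] @ x_tilde[:i]
        x_tilde[i] = mod_tau(x[i] - interference, tau)
    return x_tilde
```

Under this assumed modulo, applying `mod_tau` to `B @ x_tilde` at the receiver recovers `x` in the noise-free case, since each precoded symbol differs from the pre-subtracted value only by a multiple of `tau`.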
A downside of THP is the slight increase in the average
transmit power by a factor of $M/(M - 1)$ for $M$-QAM
symbols, called the precoding loss. For large constellations,
this loss is negligible.
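To give a sense of scale, the factor $M/(M-1)$ can be expressed in decibels for a few constellation sizes. The helper name `precoding_loss_db` and the chosen values of $M$ are illustrative assumptions.

```python
import math

def precoding_loss_db(M):
    """Precoding loss 10*log10(M / (M - 1)) in dB for M-QAM."""
    return 10.0 * math.log10(M / (M - 1))

for M in (4, 16, 64, 256):
    print(f"M = {M:3d}: {precoding_loss_db(M):.3f} dB")
```

The loss is about 1.25 dB for $M = 4$ and decreases monotonically as $M$ grows.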
C. ZF-DPC
Consider the MIMO broadcast channel described in (1). If
CSI is available at the transmitter, interference cancellation via
dirty paper precoding can be performed. THP can be used as a
suboptimal implementation of DPC. Conventional precoding
schemes often treat multiple antennas of different users as
different virtual users. One example is the zero-forcing THP
(ZF-THP) scheme [14]. It is based on the QR decomposition
$H^H = QR$, or $H = R^H Q^H$. The linear precoder $Q$ is
applied before transmission so that $x = Qs$, where $s$ is the
vector of information symbols to be sent. This transforms the