2.4.3 Higher Order
$$\frac{\partial}{\partial X} \mathrm{Tr}(X^k) = k (X^{k-1})^T$$
$$\frac{\partial}{\partial X} \mathrm{Tr}(A X^k) = \sum_{r=0}^{k-1} (X^r A X^{k-r-1})^T$$
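The sum formula for Tr(AX^k) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `numerical_gradient` and `grad_trace_A_Xk` are hypothetical and introduced only for this illustration. It compares the closed form against a central-difference estimate on random matrices.

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference estimate of the matrix of partials df/dX_ij."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

def grad_trace_A_Xk(A, X, k):
    """Closed form: sum over r = 0..k-1 of (X^r A X^(k-r-1))^T."""
    mp = np.linalg.matrix_power
    return sum((mp(X, r) @ A @ mp(X, k - r - 1)).T for r in range(k))

rng = np.random.default_rng(0)
n, k = 4, 3
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))

analytic = grad_trace_A_Xk(A, X, k)
numeric = numerical_gradient(
    lambda M: np.trace(A @ np.linalg.matrix_power(M, k)), X
)
max_error = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_error)
```

Setting A to the identity collapses the sum to k(X^{k-1})^T, which recovers the first identity of this subsection as a special case.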
$$\frac{\partial}{\partial X} \mathrm{Tr}\left[ B^T X^T C X X^T C X B \right] = C X X^T C X B B^T + C^T X B B^T X^T C^T X + C X B B^T X^T C X + C^T X X^T C^T X B B^T$$
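Each of the four terms arises from differentiating one of the four occurrences of X in the trace. As a hedged sanity check (a sketch assuming NumPy, with a hypothetical helper `numerical_gradient` and arbitrarily chosen shapes), the closed form can be compared with finite differences for a rectangular X and a non-symmetric C:

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference estimate of the matrix of partials df/dX_ij."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(1)
n, m, p = 4, 3, 2                 # illustrative sizes: X is n x m
X = rng.standard_normal((n, m))
C = rng.standard_normal((n, n))   # deliberately not symmetric
B = rng.standard_normal((m, p))

def f(M):
    return np.trace(B.T @ M.T @ C @ M @ M.T @ C @ M @ B)

BBt = B @ B.T
analytic = (
    C @ X @ X.T @ C @ X @ BBt
    + C.T @ X @ BBt @ X.T @ C.T @ X
    + C @ X @ BBt @ X.T @ C @ X
    + C.T @ X @ X.T @ C.T @ X @ BBt
)
numeric = numerical_gradient(f, X)
matches = np.allclose(analytic, numeric, rtol=1e-5, atol=1e-5)
print("closed form matches finite differences:", matches)
```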
2.4.4 Other
$$\frac{\partial}{\partial X} \mathrm{Tr}(A X^{-1} B) = -(X^{-1} B A X^{-1})^T = -X^{-T} A^T B^T X^{-T}$$
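Both forms of the inverse-trace derivative can be compared against a finite-difference estimate. This is a sketch under stated assumptions: NumPy is used, `numerical_gradient` is a hypothetical helper, and X is built as a small perturbation of the identity only to keep it safely invertible.

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference estimate of the matrix of partials df/dX_ij."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(2)
n, m = 4, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, m))
# Assumption for the demo: X close to the identity, hence well conditioned.
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))

X_inv = np.linalg.inv(X)
form_1 = -(X_inv @ B @ A @ X_inv).T          # -(X^-1 B A X^-1)^T
form_2 = -X_inv.T @ A.T @ B.T @ X_inv.T      # -X^-T A^T B^T X^-T

numeric = numerical_gradient(
    lambda M: np.trace(A @ np.linalg.inv(M) @ B), X
)
forms_agree = np.allclose(form_1, form_2)
matches = np.allclose(form_1, numeric, rtol=1e-5, atol=1e-6)
print("forms agree:", forms_agree, "| matches finite differences:", matches)
```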
Assume B and C to be symmetric, then
$$\frac{\partial}{\partial X} \mathrm{Tr}\left[ (X^T C X)^{-1} A \right] = -\left( C X (X^T C X)^{-1} \right) (A + A^T) (X^T C X)^{-1}$$
$$\frac{\partial}{\partial X} \mathrm{Tr}\left[ (X^T C X)^{-1} (X^T B X) \right] = -2 C X (X^T C X)^{-1} X^T B X (X^T C X)^{-1} + 2 B X (X^T C X)^{-1}$$
See [7].
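The two identities that require symmetric B and C can be checked in the same way. The sketch below assumes NumPy; `numerical_gradient` is a hypothetical helper, and C is additionally made positive definite (an assumption of this demo, not of the formulas) so that X^T C X is guaranteed to be invertible.

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference estimate of the matrix of partials df/dX_ij."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(3)
n, m = 5, 3
Q = rng.standard_normal((n, n))
C = Q @ Q.T + np.eye(n)                 # symmetric, positive definite
R = rng.standard_normal((n, n))
B = (R + R.T) / 2                       # symmetric
A = rng.standard_normal((m, m))         # arbitrary, not symmetric
X = np.eye(n, m) + 0.1 * rng.standard_normal((n, m))  # full column rank

W_inv = np.linalg.inv(X.T @ C @ X)      # (X^T C X)^-1

# d/dX Tr[(X^T C X)^-1 A]
analytic_1 = -(C @ X @ W_inv) @ (A + A.T) @ W_inv
numeric_1 = numerical_gradient(
    lambda M: np.trace(np.linalg.inv(M.T @ C @ M) @ A), X
)

# d/dX Tr[(X^T C X)^-1 (X^T B X)]
analytic_2 = -2 * C @ X @ W_inv @ X.T @ B @ X @ W_inv + 2 * B @ X @ W_inv
numeric_2 = numerical_gradient(
    lambda M: np.trace(np.linalg.inv(M.T @ C @ M) @ (M.T @ B @ M)), X
)

ok_1 = np.allclose(analytic_1, numeric_1, rtol=1e-4, atol=1e-6)
ok_2 = np.allclose(analytic_2, numeric_2, rtol=1e-4, atol=1e-6)
print("first identity:", ok_1, "| second identity:", ok_2)
```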
2.5 Derivatives of Structured Matrices
Assume that the matrix A has some structure, e.g. symmetric, Toeplitz, etc.
In that case the derivatives of the previous section do not apply in general.
Instead, consider the following general rule for differentiating a scalar function $f(A)$:
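$$\frac{df}{dA_{ij}} = \sum_{kl} \frac{\partial f}{\partial A_{kl}} \frac{\partial A_{kl}}{\partial A_{ij}} = \mathrm{Tr}\left[ \left( \frac{\partial f}{\partial A} \right)^T \frac{\partial A}{\partial A_{ij}} \right]$$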
The matrix differentiated with respect to itself is in this document referred to
as the structure matrix of A and is defined simply by
$$\frac{\partial A}{\partial A_{ij}} = S^{ij}$$
If A has no special structure we have simply $S^{ij} = J^{ij}$, that is, the structure
matrix is simply the single-entry matrix. Many structures have a representation
in single-entry matrices; see Sec. 8.2.6 for more examples of structure matrices.
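To make the chain rule concrete, the following sketch works through the symmetric case in NumPy. It relies on assumptions not stated in this section: for symmetric A, changing A_ij also changes A_ji, so the structure matrix is taken to have ones at positions (i, j) and (j, i). The test function f(A) = Tr(BA^2) and the helper name `structure_matrix_symmetric` are hypothetical choices for illustration. The unstructured gradient (BA + AB)^T follows from the Tr(AX^k) formula of Sec. 2.4.3 with k = 2.

```python
import numpy as np

def structure_matrix_symmetric(n, i, j):
    """dA/dA_ij for symmetric A: ones at (i, j) and (j, i) (demo assumption)."""
    S = np.zeros((n, n))
    S[i, j] = 1.0
    S[j, i] = 1.0
    return S

rng = np.random.default_rng(4)
n = 4
B = rng.standard_normal((n, n))
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                      # impose the symmetric structure

def f(M):
    return np.trace(B @ M @ M)

# Unstructured gradient of Tr(B A^2), treating all entries as independent.
G = (B @ A + A @ B).T

eps = 1e-6
structured = np.zeros((n, n))
numeric = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        S = structure_matrix_symmetric(n, i, j)
        # Chain rule: df/dA_ij = Tr[(df/dA)^T S^{ij}]
        structured[i, j] = np.trace(G.T @ S)
        # Finite difference along a perturbation that keeps A symmetric.
        numeric[i, j] = (f(A + eps * S) - f(A - eps * S)) / (2 * eps)

matches = np.allclose(structured, numeric, rtol=1e-5, atol=1e-6)
print("structured derivative matches finite differences:", matches)
```

In this symmetric case the structured derivative works out to G + G^T - diag(G), so off-diagonal entries pick up contributions from both (i, j) and (j, i), while diagonal entries are unchanged.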
Petersen & Pedersen, The Matrix Cookbook, Version: February 16, 2006, Page 12