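The Matrix Cookbook: A Treasury of Matrix Knowledge
"The Matrix Cookbook.pdf" is a reference work on matrices that collects a large number of matrix-related facts, identities, approximations, inequalities, and relations. It is meant for quick lookup and is not suited for use as a textbook for systematic study.
The book aims to serve readers who need to look up matrix results quickly. Most of the formulas and properties it contains come from notes found on the Internet and from appendices of books; the authors do not claim them as original, but have organized and compiled them. The book also states explicitly that it may contain errors, typos, or omissions, and encourages readers who find any to send corrections to cookbook@2302.dk.
The book is a continuously updated project, and its version changes with the date. The authors welcome suggestions by email for additional content or for topics that deserve a more detailed treatment. Keywords include matrix algebra, matrix relations, matrix identities, derivative of determinant, derivative of inverse matrix, and differentiating a matrix.
The book covers a broad range of topics, such as basic matrix operations, special types of matrices (diagonal, identity, orthogonal, and so on), matrix properties (rank, trace, determinant, eigenvalues and eigenvectors), derivatives and differentials of matrix functions, and matrix theory related to linear transformations and differential equations. It also covers approximations and inequalities for matrices, which are useful for practical problems in areas such as numerical analysis, signal processing, and machine learning.
The authors are Kaare Brandt Petersen and Michael Syskind Pedersen, who thank a number of contributors and people who offered suggestions, as well as the Oticon Foundation for funding their PhD studies. The Matrix Cookbook is an indispensable reference for researchers, students, and practitioners in mathematics and engineering, offering a wealth of matrix theory and practical techniques.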
2.5 Derivatives of Traces 2 DERIVATIVES
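$$\frac{\partial}{\partial \mathbf{X}} \mathbf{a}^T (\mathbf{X}^n)^T \mathbf{X}^n \mathbf{b} = \sum_{r=0}^{n-1} \Big[ \mathbf{X}^{n-1-r} \mathbf{a}\mathbf{b}^T (\mathbf{X}^n)^T \mathbf{X}^r + (\mathbf{X}^r)^T \mathbf{X}^n \mathbf{a}\mathbf{b}^T (\mathbf{X}^{n-1-r})^T \Big] \tag{92}$$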
See B.1.3 for a proof.
Assume s and r are functions of x, i.e. s = s(x), r = r(x), and that A is a
constant, then
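$$\frac{\partial}{\partial \mathbf{x}} \mathbf{s}^T \mathbf{A} \mathbf{r} = \left[\frac{\partial \mathbf{s}}{\partial \mathbf{x}}\right]^T \mathbf{A}\mathbf{r} + \left[\frac{\partial \mathbf{r}}{\partial \mathbf{x}}\right]^T \mathbf{A}^T \mathbf{s} \tag{93}$$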
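$$\frac{\partial}{\partial \mathbf{x}} \frac{(\mathbf{A}\mathbf{x})^T(\mathbf{A}\mathbf{x})}{(\mathbf{B}\mathbf{x})^T(\mathbf{B}\mathbf{x})} = \frac{\partial}{\partial \mathbf{x}} \frac{\mathbf{x}^T\mathbf{A}^T\mathbf{A}\mathbf{x}}{\mathbf{x}^T\mathbf{B}^T\mathbf{B}\mathbf{x}} \tag{94}$$
$$= \frac{2\mathbf{A}^T\mathbf{A}\mathbf{x}}{\mathbf{x}^T\mathbf{B}^T\mathbf{B}\mathbf{x}} - \frac{2\,\mathbf{x}^T\mathbf{A}^T\mathbf{A}\mathbf{x}\,\mathbf{B}^T\mathbf{B}\mathbf{x}}{(\mathbf{x}^T\mathbf{B}^T\mathbf{B}\mathbf{x})^2} \tag{95}$$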
2.4.4 Gradient and Hessian
Using the above we have for the gradient and the Hessian
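$$f = \mathbf{x}^T\mathbf{A}\mathbf{x} + \mathbf{b}^T\mathbf{x} \tag{96}$$
$$\nabla_{\mathbf{x}} f = \frac{\partial f}{\partial \mathbf{x}} = (\mathbf{A} + \mathbf{A}^T)\mathbf{x} + \mathbf{b} \tag{97}$$
$$\frac{\partial^2 f}{\partial \mathbf{x}\,\partial \mathbf{x}^T} = \mathbf{A} + \mathbf{A}^T \tag{98}$$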
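The gradient in (97) can be sanity-checked numerically. The following is a minimal sketch in plain Python, not part of the original text: the helper names (`quadratic`, `analytic_gradient`, `numerical_gradient`) and the test values are illustrative assumptions. It compares (A + A^T)x + b against central finite differences for a deliberately non-symmetric A.

```python
def quadratic(A, b, x):
    """f(x) = x^T A x + b^T x, equation (96); A is a list of rows."""
    n = len(x)
    xAx = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return xAx + sum(bi * xi for bi, xi in zip(b, x))


def analytic_gradient(A, b, x):
    """(A + A^T) x + b, equation (97)."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) + b[i]
            for i in range(n)]


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of df/dx_i for each component i."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad


# Illustrative values (assumptions, chosen so that A is not symmetric).
A = [[1.0, 2.0], [-3.0, 4.0]]
b = [0.5, -1.0]
x = [2.0, -1.0]

grad_exact = analytic_gradient(A, b, x)
grad_fd = numerical_gradient(lambda v: quadratic(A, b, v), x)
hessian = [[A[i][j] + A[j][i] for j in range(2)] for i in range(2)]  # equation (98)
max_error = max(abs(e - n) for e, n in zip(grad_exact, grad_fd))
print(grad_exact, hessian, max_error)
```

Because f is quadratic, the central difference has no truncation error, so any remaining discrepancy is floating-point rounding only.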
2.5 Derivatives of Traces
Assume F (X) to be a differentiable function of each of the elements of X. It
then holds that
$$\frac{\partial \operatorname{Tr}(F(\mathbf{X}))}{\partial \mathbf{X}} = f(\mathbf{X})^T$$
where f(·) is the scalar derivative of F(·).
2.5.1 First Order
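$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}) = \mathbf{I} \tag{99}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}\mathbf{A}) = \mathbf{A}^T \tag{100}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{A}\mathbf{X}\mathbf{B}) = \mathbf{A}^T\mathbf{B}^T \tag{101}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{A}\mathbf{X}^T\mathbf{B}) = \mathbf{B}\mathbf{A} \tag{102}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}^T\mathbf{A}) = \mathbf{A} \tag{103}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{A}\mathbf{X}^T) = \mathbf{A} \tag{104}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{A} \otimes \mathbf{X}) = \operatorname{Tr}(\mathbf{A})\,\mathbf{I} \tag{105}$$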
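As an illustration of how the first-order identities can be verified, the sketch below checks (101) entry by entry with central finite differences. It is a hedged example rather than anything from the original text: the helpers (`matmul`, `transpose`, `trace`, `numerical_gradient`, `max_abs_diff`) and the matrix values are assumptions made for the demonstration, and matrices are stored as plain lists of rows.

```python
def matmul(P, Q):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


def transpose(P):
    return [list(col) for col in zip(*P)]


def trace(P):
    return sum(P[i][i] for i in range(len(P)))


def numerical_gradient(f, X, h=1e-6):
    """Central-difference estimate of the matrix with entries df/dX_ij."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            G[i][j] = (f(Xp) - f(Xm)) / (2 * h)
    return G


def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


# Illustrative shapes and values (assumptions): A is 2x3, X is 3x2, B is 2x2.
A = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]]
B = [[2.0, 1.0], [0.5, -1.0]]
X = [[0.3, -0.7], [1.2, 0.4], [-0.5, 0.9]]

analytic = matmul(transpose(A), transpose(B))  # A^T B^T, equation (101)
numeric = numerical_gradient(lambda M: trace(matmul(matmul(A, M), B)), X)
error_101 = max_abs_diff(analytic, numeric)
print(error_101)
```

The result has the same 3x2 shape as X, which is the layout convention the identities in this section follow.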
Petersen & Pedersen, The Matrix Cookbook, Version: November 15, 2012, Page 12
2.5.2 Second Order
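$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}^2) = 2\mathbf{X}^T \tag{106}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}^2\mathbf{B}) = (\mathbf{X}\mathbf{B} + \mathbf{B}\mathbf{X})^T \tag{107}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}^T\mathbf{B}\mathbf{X}) = \mathbf{B}\mathbf{X} + \mathbf{B}^T\mathbf{X} \tag{108}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{B}\mathbf{X}\mathbf{X}^T) = \mathbf{B}\mathbf{X} + \mathbf{B}^T\mathbf{X} \tag{109}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}\mathbf{X}^T\mathbf{B}) = \mathbf{B}\mathbf{X} + \mathbf{B}^T\mathbf{X} \tag{110}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}\mathbf{B}\mathbf{X}^T) = \mathbf{X}\mathbf{B}^T + \mathbf{X}\mathbf{B} \tag{111}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{B}\mathbf{X}^T\mathbf{X}) = \mathbf{X}\mathbf{B}^T + \mathbf{X}\mathbf{B} \tag{112}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}^T\mathbf{X}\mathbf{B}) = \mathbf{X}\mathbf{B}^T + \mathbf{X}\mathbf{B} \tag{113}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}) = \mathbf{A}^T\mathbf{X}^T\mathbf{B}^T + \mathbf{B}^T\mathbf{X}^T\mathbf{A}^T \tag{114}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}^T\mathbf{X}) = \frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}\mathbf{X}^T) = 2\mathbf{X} \tag{115}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{B}^T\mathbf{X}^T\mathbf{C}\mathbf{X}\mathbf{B}) = \mathbf{C}^T\mathbf{X}\mathbf{B}\mathbf{B}^T + \mathbf{C}\mathbf{X}\mathbf{B}\mathbf{B}^T \tag{116}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}\left[\mathbf{X}^T\mathbf{B}\mathbf{X}\mathbf{C}\right] = \mathbf{B}\mathbf{X}\mathbf{C} + \mathbf{B}^T\mathbf{X}\mathbf{C}^T \tag{117}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^T\mathbf{C}) = \mathbf{A}^T\mathbf{C}^T\mathbf{X}\mathbf{B}^T + \mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B} \tag{118}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}\left[(\mathbf{A}\mathbf{X}\mathbf{B} + \mathbf{C})(\mathbf{A}\mathbf{X}\mathbf{B} + \mathbf{C})^T\right] = 2\mathbf{A}^T(\mathbf{A}\mathbf{X}\mathbf{B} + \mathbf{C})\mathbf{B}^T \tag{119}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X} \otimes \mathbf{X}) = \frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X})\operatorname{Tr}(\mathbf{X}) = 2\operatorname{Tr}(\mathbf{X})\,\mathbf{I} \tag{120}$$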
See [7].
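Identities (108) and (111) look alike but multiply X from different sides. The sketch below is a hedged illustration, not taken from the original: the helper names and the values of B and X are assumptions. It checks both formulas against finite differences and shows that the two gradients differ for a non-symmetric B.

```python
def matmul(P, Q):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


def transpose(P):
    return [list(col) for col in zip(*P)]


def trace(P):
    return sum(P[i][i] for i in range(len(P)))


def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def numerical_gradient(f, X, h=1e-6):
    """Central-difference estimate of the matrix with entries df/dX_ij."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            G[i][j] = (f(Xp) - f(Xm)) / (2 * h)
    return G


def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


# Illustrative values (assumptions); B is deliberately non-symmetric.
B = [[1.0, 2.0], [-1.0, 3.0]]
X = [[0.5, -1.0], [2.0, 1.5]]

# Equation (108): d/dX Tr(X^T B X) = B X + B^T X
grad_108 = add(matmul(B, X), matmul(transpose(B), X))
num_108 = numerical_gradient(
    lambda M: trace(matmul(matmul(transpose(M), B), M)), X)

# Equation (111): d/dX Tr(X B X^T) = X B^T + X B
grad_111 = add(matmul(X, transpose(B)), matmul(X, B))
num_111 = numerical_gradient(
    lambda M: trace(matmul(matmul(M, B), transpose(M))), X)

error_108 = max_abs_diff(grad_108, num_108)
error_111 = max_abs_diff(grad_111, num_111)
print(error_108, error_111)
```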
2.5.3 Higher Order
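$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}^k) = k(\mathbf{X}^{k-1})^T \tag{121}$$
$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{A}\mathbf{X}^k) = \sum_{r=0}^{k-1} (\mathbf{X}^r\mathbf{A}\mathbf{X}^{k-r-1})^T \tag{122}$$
$$\begin{aligned}
\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}\left[\mathbf{B}^T\mathbf{X}^T\mathbf{C}\mathbf{X}\mathbf{X}^T\mathbf{C}\mathbf{X}\mathbf{B}\right] ={}& \mathbf{C}\mathbf{X}\mathbf{X}^T\mathbf{C}\mathbf{X}\mathbf{B}\mathbf{B}^T \\
&+ \mathbf{C}^T\mathbf{X}\mathbf{B}\mathbf{B}^T\mathbf{X}^T\mathbf{C}^T\mathbf{X} \\
&+ \mathbf{C}\mathbf{X}\mathbf{B}\mathbf{B}^T\mathbf{X}^T\mathbf{C}\mathbf{X} \\
&+ \mathbf{C}^T\mathbf{X}\mathbf{X}^T\mathbf{C}^T\mathbf{X}\mathbf{B}\mathbf{B}^T
\end{aligned} \tag{123}$$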
2.5.4 Other
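$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{A}\mathbf{X}^{-1}\mathbf{B}) = -(\mathbf{X}^{-1}\mathbf{B}\mathbf{A}\mathbf{X}^{-1})^T = -\mathbf{X}^{-T}\mathbf{A}^T\mathbf{B}^T\mathbf{X}^{-T} \tag{124}$$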
Assume B and C to be symmetric, then
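$$\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}\left[(\mathbf{X}^T\mathbf{C}\mathbf{X})^{-1}\mathbf{A}\right] = -(\mathbf{C}\mathbf{X}(\mathbf{X}^T\mathbf{C}\mathbf{X})^{-1})(\mathbf{A} + \mathbf{A}^T)(\mathbf{X}^T\mathbf{C}\mathbf{X})^{-1} \tag{125}$$
$$\begin{aligned}
\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}\left[(\mathbf{X}^T\mathbf{C}\mathbf{X})^{-1}(\mathbf{X}^T\mathbf{B}\mathbf{X})\right] ={}& -2\mathbf{C}\mathbf{X}(\mathbf{X}^T\mathbf{C}\mathbf{X})^{-1}\mathbf{X}^T\mathbf{B}\mathbf{X}(\mathbf{X}^T\mathbf{C}\mathbf{X})^{-1} \\
&+ 2\mathbf{B}\mathbf{X}(\mathbf{X}^T\mathbf{C}\mathbf{X})^{-1}
\end{aligned} \tag{126}$$
$$\begin{aligned}
\frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}\left[(\mathbf{A} + \mathbf{X}^T\mathbf{C}\mathbf{X})^{-1}(\mathbf{X}^T\mathbf{B}\mathbf{X})\right] ={}& -2\mathbf{C}\mathbf{X}(\mathbf{A} + \mathbf{X}^T\mathbf{C}\mathbf{X})^{-1}\mathbf{X}^T\mathbf{B}\mathbf{X}(\mathbf{A} + \mathbf{X}^T\mathbf{C}\mathbf{X})^{-1} \\
&+ 2\mathbf{B}\mathbf{X}(\mathbf{A} + \mathbf{X}^T\mathbf{C}\mathbf{X})^{-1}
\end{aligned} \tag{127}$$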
See [7].
$$\frac{\partial \operatorname{Tr}(\sin(\mathbf{X}))}{\partial \mathbf{X}} = \cos(\mathbf{X})^T \tag{128}$$
2.6 Derivatives of vector norms
2.6.1 Two-norm
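$$\frac{\partial}{\partial \mathbf{x}} \|\mathbf{x} - \mathbf{a}\|_2 = \frac{\mathbf{x} - \mathbf{a}}{\|\mathbf{x} - \mathbf{a}\|_2} \tag{129}$$
$$\frac{\partial}{\partial \mathbf{x}} \frac{\mathbf{x} - \mathbf{a}}{\|\mathbf{x} - \mathbf{a}\|_2} = \frac{\mathbf{I}}{\|\mathbf{x} - \mathbf{a}\|_2} - \frac{(\mathbf{x} - \mathbf{a})(\mathbf{x} - \mathbf{a})^T}{\|\mathbf{x} - \mathbf{a}\|_2^3} \tag{130}$$
$$\frac{\partial \|\mathbf{x}\|_2^2}{\partial \mathbf{x}} = \frac{\partial \|\mathbf{x}^T\mathbf{x}\|_2}{\partial \mathbf{x}} = 2\mathbf{x} \tag{131}$$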
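Equation (129) says that the gradient of the distance to a is the unit vector pointing from a to x. The following is a small sketch under assumed values (the helper names and the vectors are not from the original) that compares that unit vector with a central finite-difference estimate. The formula is undefined at x = a, so the example keeps x away from a.

```python
import math


def distance(x, a):
    """Euclidean norm ||x - a||_2."""
    return math.sqrt(sum((xi - ai) ** 2 for xi, ai in zip(x, a)))


def analytic_gradient(x, a):
    """(x - a) / ||x - a||_2, equation (129); requires x != a."""
    d = distance(x, a)
    return [(xi - ai) / d for xi, ai in zip(x, a)]


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of df/dx_i for each component i."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad


# Illustrative values (assumptions): x - a = (3, 4, 0), so ||x - a||_2 = 5.
a = [1.0, 2.0, 1.0]
x = [4.0, 6.0, 1.0]

grad_exact = analytic_gradient(x, a)  # [0.6, 0.8, 0.0]
grad_fd = numerical_gradient(lambda v: distance(v, a), x)
max_error = max(abs(e - n) for e, n in zip(grad_exact, grad_fd))
print(grad_exact, max_error)
```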
2.7 Derivatives of matrix norms
For more on matrix norms, see Sec. 10.4.
2.7.1 Frobenius norm
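$$\frac{\partial}{\partial \mathbf{X}} \|\mathbf{X}\|_F^2 = \frac{\partial}{\partial \mathbf{X}} \operatorname{Tr}(\mathbf{X}\mathbf{X}^H) = 2\mathbf{X} \tag{132}$$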
See (248). Note that this is also a special case of the result in equation 119.
2.8 Derivatives of Structured Matrices
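Assume that the matrix A has some structure, e.g. symmetric, Toeplitz, etc.
In that case the derivatives of the previous section do not apply in general.
Instead, consider the following general rule for differentiating a scalar function
f(A):
$$\frac{df}{dA_{ij}} = \sum_{kl} \frac{\partial f}{\partial A_{kl}} \frac{\partial A_{kl}}{\partial A_{ij}} = \operatorname{Tr}\left[\left[\frac{\partial f}{\partial \mathbf{A}}\right]^T \frac{\partial \mathbf{A}}{\partial A_{ij}}\right] \tag{133}$$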
The matrix differentiated with respect to itself is in this document referred to
as the structure matrix of A and is defined simply by
$$\frac{\partial \mathbf{A}}{\partial A_{ij}} = \mathbf{S}^{ij} \tag{134}$$
If A has no special structure we have simply $\mathbf{S}^{ij} = \mathbf{J}^{ij}$, that is, the structure
matrix is simply the single-entry matrix. Many structures have a representation
in single-entry matrices; see Sec. 9.7.6 for more examples of structure matrices.
2.8.1 The Chain Rule
Sometimes the objective is to find the derivative of a matrix which is a function
of another matrix. Let U = f (X), the goal is to find the derivative of the
function g(U) with respect to X:
$$\frac{\partial g(\mathbf{U})}{\partial \mathbf{X}} = \frac{\partial g(f(\mathbf{X}))}{\partial \mathbf{X}} \tag{135}$$
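The Chain Rule can then be written the following way:
$$\frac{\partial g(\mathbf{U})}{\partial \mathbf{X}} = \frac{\partial g(\mathbf{U})}{\partial x_{ij}} = \sum_{k=1}^{M} \sum_{l=1}^{N} \frac{\partial g(\mathbf{U})}{\partial u_{kl}} \frac{\partial u_{kl}}{\partial x_{ij}} \tag{136}$$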
Using matrix notation, this can be written as:
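$$\frac{\partial g(\mathbf{U})}{\partial X_{ij}} = \operatorname{Tr}\left[\left(\frac{\partial g(\mathbf{U})}{\partial \mathbf{U}}\right)^T \frac{\partial \mathbf{U}}{\partial X_{ij}}\right]. \tag{137}$$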
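To make (137) concrete, the sketch below applies it to one assumed example that is not in the original: U = AXB and g(U) = Tr(UU^T). Under these assumptions, ∂g/∂U = 2U by (115), and since U is linear in X, ∂U/∂X_ij = A J^ij B with J^ij the single-entry matrix. All helper names and matrix values are illustrative.

```python
def matmul(P, Q):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


def transpose(P):
    return [list(col) for col in zip(*P)]


def trace(P):
    return sum(P[i][i] for i in range(len(P)))


def single_entry(rows, cols, i, j):
    """J^ij: 1 at position (i, j), 0 elsewhere."""
    J = [[0.0] * cols for _ in range(rows)]
    J[i][j] = 1.0
    return J


def numerical_gradient(f, X, h=1e-6):
    """Central-difference estimate of the matrix with entries df/dX_ij."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            G[i][j] = (f(Xp) - f(Xm)) / (2 * h)
    return G


# Illustrative shapes and values (assumptions): A is 2x3, X is 3x2, B is 2x2.
A = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]]
B = [[2.0, 1.0], [0.5, -1.0]]
X = [[0.3, -0.7], [1.2, 0.4], [-0.5, 0.9]]


def g_of_X(M):
    """g(f(M)) with f(M) = A M B and g(U) = Tr(U U^T)."""
    U = matmul(matmul(A, M), B)
    return trace(matmul(U, transpose(U)))


U = matmul(matmul(A, X), B)
dg_dU = [[2.0 * u for u in row] for row in U]  # dg/dU = 2U, from (115)

# Equation (137): dg/dX_ij = Tr[(dg/dU)^T dU/dX_ij], with dU/dX_ij = A J^ij B.
chain = [[trace(matmul(transpose(dg_dU),
                       matmul(matmul(A, single_entry(3, 2, i, j)), B)))
          for j in range(2)] for i in range(3)]

numeric = numerical_gradient(g_of_X, X)
error_137 = max(abs(c - n) for rc, rn in zip(chain, numeric)
                for c, n in zip(rc, rn))
print(error_137)
```

For this choice of f and g the result equals 2A^T(AXB)B^T, which is (119) with C = 0.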
2.8.2 Symmetric
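If A is symmetric, then $\mathbf{S}^{ij} = \mathbf{J}^{ij} + \mathbf{J}^{ji} - \mathbf{J}^{ij}\mathbf{J}^{ij}$ and therefore
$$\frac{df}{d\mathbf{A}} = \left[\frac{\partial f}{\partial \mathbf{A}}\right] + \left[\frac{\partial f}{\partial \mathbf{A}}\right]^T - \operatorname{diag}\left[\frac{\partial f}{\partial \mathbf{A}}\right] \tag{138}$$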
That is, e.g., ([5]):
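$$\frac{\partial \operatorname{Tr}(\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = \mathbf{A} + \mathbf{A}^T - (\mathbf{A} \circ \mathbf{I}), \quad \text{see (142)} \tag{139}$$
$$\frac{\partial \det(\mathbf{X})}{\partial \mathbf{X}} = \det(\mathbf{X})\left(2\mathbf{X}^{-1} - (\mathbf{X}^{-1} \circ \mathbf{I})\right) \tag{140}$$
$$\frac{\partial \ln\det(\mathbf{X})}{\partial \mathbf{X}} = 2\mathbf{X}^{-1} - (\mathbf{X}^{-1} \circ \mathbf{I}) \tag{141}$$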
2.8.3 Diagonal
If X is diagonal, then ([19]):
$$\frac{\partial \operatorname{Tr}(\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = \mathbf{A} \circ \mathbf{I} \tag{142}$$
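The structured results (139) and (142) can be illustrated by differentiating only with respect to the free parameters of X. In the sketch below, which is an assumption-laden illustration rather than part of the original, perturbing entry (i, j) of a symmetric X also perturbs entry (j, i), and a diagonal X has free entries only on its diagonal. The function names and matrix values are hypothetical.

```python
def trace_product(A, X):
    """Tr(AX) computed as the sum over i, k of A_ik X_ki."""
    n = len(A)
    return sum(A[i][k] * X[k][i] for i in range(n) for k in range(n))


def structured_gradient(f, X, structure, h=1e-6):
    """Central differences over the free parameters of a structured matrix.

    structure == "symmetric": X_ij and X_ji are perturbed together.
    structure == "diagonal":  only diagonal entries are perturbed.
    """
    if structure not in ("symmetric", "diagonal"):
        raise ValueError("structure must be 'symmetric' or 'diagonal'")
    n = len(X)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if structure == "diagonal" and i != j:
                continue  # off-diagonal entries are fixed at zero
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            for r, c in {(i, j), (j, i)}:  # a single entry when i == j
                Xp[r][c] += h
                Xm[r][c] -= h
            G[i][j] = (f(Xp) - f(Xm)) / (2 * h)
    return G


def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


# Illustrative values (assumptions).
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
X_sym = [[2.0, 1.0, 0.0], [1.0, 3.0, -1.0], [0.0, -1.0, 1.0]]
X_diag = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 1.0]]
n = len(A)

# Equation (139): A + A^T - (A o I) for symmetric X.
sym_formula = [[A[i][j] + A[j][i] - (A[i][j] if i == j else 0.0)
                for j in range(n)] for i in range(n)]
# Equation (142): A o I for diagonal X.
diag_formula = [[A[i][j] if i == j else 0.0 for j in range(n)]
                for i in range(n)]

f = lambda M: trace_product(A, M)
error_139 = max_abs_diff(sym_formula, structured_gradient(f, X_sym, "symmetric"))
error_142 = max_abs_diff(diag_formula, structured_gradient(f, X_diag, "diagonal"))
print(error_139, error_142)
```

The off-diagonal entries of the symmetric result are A_ij + A_ji because each free parameter appears twice in X, while the diagonal entries reduce to A_ii.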