博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
【350】机器学习中的线性代数之矩阵求导
阅读量:6271 次
发布时间:2019-06-22

本文共 7411 字,大约阅读时间需要 24 分钟。

参考:

参考:

矩阵求导(Matrix Derivative)也称作矩阵微分(Matrix Differential),在机器学习、图像处理、最优化等领域的公式推导中经常用到。

布局(Layout):在矩阵求导中有两种布局,分别为分母布局(denominator layout)分子布局(numerator layout)。这两种不同布局的求导规则是不一样的。

个人理解:

Numerator Layout:布局按照分子的排列,例如分子的m列,那么结果的m列是对应分子的,与分母正好相反,分母如果为n列,对应的n行,比较常用。

Denominator Layout:与上面正好相反,结果正好是转置矩阵。

Numerator-layout notation

Using numerator-layout notation, we have:

$

{\displaystyle {\frac {\partial y}{\partial \mathbf {x} }}=\left[{\frac {\partial y}{\partial x_{1}}}{\frac {\partial y}{\partial x_{2}}}\cdots {\frac {\partial y}{\partial x_{n}}}\right].}
$
$
{\displaystyle {\frac {\partial \mathbf {y} }{\partial x}}={\begin{bmatrix}{\frac {\partial y_{1}}{\partial x}}\\{\frac {\partial y_{2}}{\partial x}}\\\vdots \\{\frac {\partial y_{m}}{\partial x}}\\\end{bmatrix}}.}
$
${\displaystyle {\frac {\partial \mathbf {y} }{\partial \mathbf {x} }}={\begin{bmatrix}{\frac {\partial y_{1}}{\partial x_{1}}}&{\frac {\partial y_{1}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{1}}{\partial x_{n}}}\\{\frac {\partial y_{2}}{\partial x_{1}}}&{\frac {\partial y_{2}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{2}}{\partial x_{n}}}\\\vdots &\vdots &\ddots &\vdots \\{\frac {\partial y_{m}}{\partial x_{1}}}&{\frac {\partial y_{m}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{m}}{\partial x_{n}}}\\\end{bmatrix}}.}
$
$
\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}}\\ \frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}}\\ \end{bmatrix}.
$

∂ y ∂ x = [ ∂ y ∂ x 1 ∂ y ∂ x 2 ⋯ ∂ y ∂ x n ] . {\displaystyle {\frac {\partial y}{\partial \mathbf {x} }}=\left[{\frac {\partial y}{\partial x_{1}}}{\frac {\partial y}{\partial x_{2}}}\cdots {\frac {\partial y}{\partial x_{n}}}\right].} \frac{\partial y}{\partial \mathbf{x}} = \left[ \frac{\partial y}{\partial x_1} \frac{\partial y}{\partial x_2} \cdots \frac{\partial y}{\partial x_n} \right].
∂ y ∂ x = [ ∂ y 1 ∂ x ∂ y 2 ∂ x ⋮ ∂ y m ∂ x ] . {\displaystyle {\frac {\partial \mathbf {y} }{\partial x}}={\begin{bmatrix}{\frac {\partial y_{1}}{\partial x}}\\{\frac {\partial y_{2}}{\partial x}}\\\vdots \\{\frac {\partial y_{m}}{\partial x}}\\\end{bmatrix}}.} \frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x}\\ \frac{\partial y_2}{\partial x}\\ \vdots\\ \frac{\partial y_m}{\partial x}\\ \end{bmatrix}.
∂ y ∂ x = [ ∂ y 1 ∂ x 1 ∂ y 1 ∂ x 2 ⋯ ∂ y 1 ∂ x n ∂ y 2 ∂ x 1 ∂ y 2 ∂ x 2 ⋯ ∂ y 2 ∂ x n ⋮ ⋮ ⋱ ⋮ ∂ y m ∂ x 1 ∂ y m ∂ x 2 ⋯ ∂ y m ∂ x n ] . {\displaystyle {\frac {\partial \mathbf {y} }{\partial \mathbf {x} }}={\begin{bmatrix}{\frac {\partial y_{1}}{\partial x_{1}}}&{\frac {\partial y_{1}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{1}}{\partial x_{n}}}\\{\frac {\partial y_{2}}{\partial x_{1}}}&{\frac {\partial y_{2}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{2}}{\partial x_{n}}}\\\vdots &\vdots &\ddots &\vdots \\{\frac {\partial y_{m}}{\partial x_{1}}}&{\frac {\partial y_{m}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{m}}{\partial x_{n}}}\\\end{bmatrix}}.} \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n}\\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n}\\ \end{bmatrix}.
∂ y ∂ X = [ ∂ y ∂ x 11 ∂ y ∂ x 21 ⋯ ∂ y ∂ x p 1 ∂ y ∂ x 12 ∂ y ∂ x 22 ⋯ ∂ y ∂ x p 2 ⋮ ⋮ ⋱ ⋮ ∂ y ∂ x 1 q ∂ y ∂ x 2 q ⋯ ∂ y ∂ x p q ] . {\displaystyle {\frac {\partial y}{\partial \mathbf {X} }}={\begin{bmatrix}{\frac {\partial y}{\partial x_{11}}}&{\frac {\partial y}{\partial x_{21}}}&\cdots &{\frac {\partial y}{\partial x_{p1}}}\\{\frac {\partial y}{\partial x_{12}}}&{\frac {\partial y}{\partial x_{22}}}&\cdots &{\frac {\partial y}{\partial x_{p2}}}\\\vdots &\vdots &\ddots &\vdots \\{\frac {\partial y}{\partial x_{1q}}}&{\frac {\partial y}{\partial x_{2q}}}&\cdots &{\frac {\partial y}{\partial x_{pq}}}\\\end{bmatrix}}.} \frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}}\\ \frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}}\\ \end{bmatrix}.

The following definitions are only provided in numerator-layout notation:

$

\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x}\\ \frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x}\\ \end{bmatrix}.
$
$
d\mathbf{X} = \begin{bmatrix} dx_{11} & dx_{12} & \cdots & dx_{1n}\\ dx_{21} & dx_{22} & \cdots & dx_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ dx_{m1} & dx_{m2} & \cdots & dx_{mn}\\ \end{bmatrix}.
$

代码参考:

$${\displaystyle {\frac {\partial y}{\partial \mathbf {x} }}=\left[{\frac {\partial y}{\partial x_{1}}}{\frac {\partial y}{\partial x_{2}}}\cdots {\frac {\partial y}{\partial x_{n}}}\right].} $$$${\displaystyle {\frac {\partial \mathbf {y} }{\partial x}}={\begin{bmatrix}{\frac {\partial y_{1}}{\partial x}}\\{\frac {\partial y_{2}}{\partial x}}\\\vdots \\{\frac {\partial y_{m}}{\partial x}}\\\end{bmatrix}}.} $$$${\displaystyle {\frac {\partial \mathbf {y} }{\partial \mathbf {x} }}={\begin{bmatrix}{\frac {\partial y_{1}}{\partial x_{1}}}&{\frac {\partial y_{1}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{1}}{\partial x_{n}}}\\{\frac {\partial y_{2}}{\partial x_{1}}}&{\frac {\partial y_{2}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{2}}{\partial x_{n}}}\\\vdots &\vdots &\ddots &\vdots \\{\frac {\partial y_{m}}{\partial x_{1}}}&{\frac {\partial y_{m}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{m}}{\partial x_{n}}}\\\end{bmatrix}}.} $$$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}}\\ \frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}}\\ \end{bmatrix}. $$$$\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x}\\ \frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x}\\ \end{bmatrix}. $$$$d\mathbf{X} = \begin{bmatrix} dx_{11} & dx_{12} & \cdots & dx_{1n}\\ dx_{21} & dx_{22} & \cdots & dx_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ dx_{m1} & dx_{m2} & \cdots & dx_{mn}\\ \end{bmatrix}. $$

 

∂ Y ∂ x = [ ∂ y 11 ∂ x ∂ y 12 ∂ x ⋯ ∂ y 1 n ∂ x ∂ y 21 ∂ x ∂ y 22 ∂ x ⋯ ∂ y 2 n ∂ x ⋮ ⋮ ⋱ ⋮ ∂ y m 1 ∂ x ∂ y m 2 ∂ x ⋯ ∂ y m n ∂ x ] . {\displaystyle {\frac {\partial \mathbf {Y} }{\partial x}}={\begin{bmatrix}{\frac {\partial y_{11}}{\partial x}}&{\frac {\partial y_{12}}{\partial x}}&\cdots &{\frac {\partial y_{1n}}{\partial x}}\\{\frac {\partial y_{21}}{\partial x}}&{\frac {\partial y_{22}}{\partial x}}&\cdots &{\frac {\partial y_{2n}}{\partial x}}\\\vdots &\vdots &\ddots &\vdots \\{\frac {\partial y_{m1}}{\partial x}}&{\frac {\partial y_{m2}}{\partial x}}&\cdots &{\frac {\partial y_{mn}}{\partial x}}\\\end{bmatrix}}.} \frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x}\\ \frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x}\\ \end{bmatrix}.
d X = [ d x 11 d x 12 ⋯ d x 1 n d x 21 d x 22 ⋯ d x 2 n ⋮ ⋮ ⋱ ⋮ d x m 1 d x m 2 ⋯ d x m n ] . {\displaystyle d\mathbf {X} ={\begin{bmatrix}dx_{11}&dx_{12}&\cdots &dx_{1n}\\dx_{21}&dx_{22}&\cdots &dx_{2n}\\\vdots &\vdots &\ddots &\vdots \\dx_{m1}&dx_{m2}&\cdots &dx_{mn}\\\end{bmatrix}}.} d\mathbf{X} = \begin{bmatrix} dx_{11} & dx_{12} & \cdots & dx_{1n}\\ dx_{21} & dx_{22} & \cdots & dx_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ dx_{m1} & dx_{m2} & \cdots & dx_{mn}\\ \end{bmatrix}.

转载于:https://www.cnblogs.com/alex-bn-lee/p/10292729.html

你可能感兴趣的文章
sharepoint 2013 补丁升级步骤
查看>>
asp.net core 2.0 web api基于JWT自定义策略授权
查看>>
Skype for Business Server 2015-04-前端服务器-3-安装-管理工具
查看>>
第12章代码《跟老男孩学习Linux运维:Shell编程实战》
查看>>
我们为什么从Python转到go?
查看>>
5.Azure负载均衡(上)
查看>>
轻松精通awk数组企业问题案例
查看>>
26.Azure备份服务器(下)
查看>>
从“网上说的能信么”说开去---学习的思考
查看>>
DHCP 日志分析
查看>>
.NET Micro Framework动态调用C/C++底层代码(原理篇)
查看>>
Windows Server 2012正式版RDS系列⒃
查看>>
Shell脚本之awk篇
查看>>
微软发布Azure Stack硬件需求
查看>>
python socket编程详细介绍
查看>>
Windows Server 2016第三个技术预览版新技术
查看>>
Everything 本地磁盘文件搜索工具下载!
查看>>
Python dict(字典) 详细总结
查看>>
RPF(Reverse Path Forwarding 反向路径转发)技术
查看>>
2016年收到的第一件礼物,被评上微软全球最有价值专家MVP(一)
查看>>