The Mathematics of Backpropagation in Deep Learning
====================================================================================
Assume:
X = (x1, x2, x3)
Y = 2*X = (2*x1, 2*x2, 2*x3) = (y1, y2, y3)
Written out per component, each output is a function of all three inputs:
y1 = f1(x1, x2, x3) = 2*x1
y2 = f2(x1, x2, x3) = 2*x2
y3 = f3(x1, x2, x3) = 2*x3
------------------------------------------------------------------------------------
1 Compute the Jacobian matrix of Y with respect to X
\begin{equation*}
J=
\begin{pmatrix}
\dfrac{\partial y_1}{\partial x_1}&\dfrac{\partial y_1}{\partial x_2}&\dfrac{\partial y_1}{\partial x_3} \\[2.5ex]
\dfrac{\partial y_2}{\partial x_1}&\dfrac{\partial y_2}{\partial x_2}&\dfrac{\partial y_2}{\partial x_3} \\[2.5ex]
\dfrac{\partial y_3}{\partial x_1}&\dfrac{\partial y_3}{\partial x_2}&\dfrac{\partial y_3}{\partial x_3} \\[1.5ex]
\end{pmatrix}
\end{equation*}
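For the specific function assumed here, the Jacobian can be evaluated entry by entry. The following is a worked instance under the assumption Y = 2*X from the setup: since y_j = 2*x_j, the partial derivative of y_j with respect to x_i is 2 when i = j and 0 otherwise.

```latex
\begin{equation*}
J=
\begin{pmatrix}
2&0&0 \\
0&2&0 \\
0&0&2
\end{pmatrix}
\end{equation*}
```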
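2 Sum the partial derivatives over the components of Y (equivalently, accumulate the Jacobian rows along the direction of the projection vector v)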
\begin{equation*}
\begin{aligned}
\dfrac{dY}{dx_1}&= \dfrac{\partial y_1}{\partial x_1}+\dfrac{\partial y_2}{\partial x_1}+\dfrac{\partial y_3}{\partial x_1}\\[2.5ex]
\dfrac{dY}{dx_2}&= \dfrac{\partial y_1}{\partial x_2}+\dfrac{\partial y_2}{\partial x_2}+\dfrac{\partial y_3}{\partial x_2}\\[2.5ex]
\dfrac{dY}{dx_3}&= \dfrac{\partial y_1}{\partial x_3}+\dfrac{\partial y_2}{\partial x_3}+\dfrac{\partial y_3}{\partial x_3}
\end{aligned}
\end{equation*}
3 Assemble the final gradient from these per-component sums
\begin{equation*}
\dfrac{dY}{dX}=
\begin{pmatrix}
\dfrac{dY}{dx_1}&\dfrac{dY}{dx_2}&\dfrac{dY}{dx_3}
\end{pmatrix}
\end{equation*}
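Steps 1 to 3 can be checked numerically. The sketch below is plain Python, not the author's code: it approximates the Jacobian of Y = 2*X by central differences and then sums each column to get the gradient. The helper names `numerical_jacobian` and `column_sums`, the step size `eps`, and the test point x = (1, 2, 3) are illustrative assumptions.

```python
def f(x):
    """Y = 2 * X, applied component-wise."""
    return [2.0 * xi for xi in x]


def numerical_jacobian(func, x, eps=1e-6):
    """Approximate jac[j][i] = d y_j / d x_i with central differences."""
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        y_plus = func(x_plus)
        y_minus = func(x_minus)
        for j in range(n_out):
            jac[j][i] = (y_plus[j] - y_minus[j]) / (2 * eps)
    return jac


def column_sums(jac):
    """Step 2: for each x_i, add up d y_j / d x_i over all outputs j."""
    return [sum(row[i] for row in jac) for i in range(len(jac[0]))]


x = [1.0, 2.0, 3.0]          # hypothetical evaluation point
J = numerical_jacobian(f, x)  # step 1: Jacobian, close to 2 * identity
grad = column_sums(J)         # steps 2 and 3: gradient, close to (2, 2, 2)
print([round(g, 6) for g in grad])
```

Because Y = 2*X is linear, the finite-difference result agrees with the analytic value 2 in every component up to floating-point rounding.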
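4 y.backward(v): the result depends on the argument v
If v = (m, n, q), each partial derivative is multiplied by the weight that v assigns to the corresponding output component.
For example, if v = (1, 2, 3), the partial derivatives of y1, y2 and y3 are multiplied by 1, 2 and 3 respectively before they are summed.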
Take dY/dx1 as an example:
\begin{equation*}
\dfrac{dY}{dx_1}= 1*\dfrac{\partial y_1}{\partial x_1}+2*\dfrac{\partial y_2}{\partial x_1}+3*\dfrac{\partial y_3}{\partial x_1}
\end{equation*}
More generally, for v = (m, n, q):
\begin{equation*}
\dfrac{dY}{dx_i}= m*\dfrac{\partial y_1}{\partial x_i}+n*\dfrac{\partial y_2}{\partial x_i}+q*\dfrac{\partial y_3}{\partial x_i}
\end{equation*}
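The weighted sum above is a vector-Jacobian product: entry i of the result is the sum over j of v_j times the partial derivative of y_j with respect to x_i. A minimal plain-Python sketch, assuming the analytic Jacobian of Y = 2*X (2 on the diagonal, 0 elsewhere); the function name `vjp` is a hypothetical helper, not a library call.

```python
def vjp(v, jac):
    """Return v^T J: entry i is sum_j v[j] * jac[j][i]."""
    n_in = len(jac[0])
    return [sum(v[j] * jac[j][i] for j in range(len(v))) for i in range(n_in)]


# Analytic Jacobian of Y = 2 * X: d y_j / d x_i is 2 if i == j, else 0.
J = [
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

weighted = vjp([1.0, 2.0, 3.0], J)  # v = (1, 2, 3)
print(weighted)  # [2.0, 4.0, 6.0]
```

With v = (1, 2, 3), only the diagonal terms survive, so the gradient is (1*2, 2*2, 3*2).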
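So when v = (1, 1, 1), the weighted sum reduces to the plain sum of partial derivatives from step 2, and the projection has no effect.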
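The same numbers can be reproduced with autograd. The sketch below assumes that `y.backward(v)` refers to PyTorch's `torch.Tensor.backward` and that PyTorch is installed; the import is guarded so the snippet does nothing otherwise. The helper `grad_via_backward` and the input values (1, 2, 3) are illustrative assumptions.

```python
try:
    import torch  # assumed dependency; the demo is skipped if it is missing
except ImportError:
    torch = None


def grad_via_backward(v_values):
    """Return x.grad after calling y.backward(v) for y = 2 * x."""
    x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
    y = 2 * x                    # non-scalar output (y1, y2, y3)
    v = torch.tensor(v_values)   # weights applied to the rows of the Jacobian
    y.backward(v)                # accumulates v^T J into x.grad
    return x.grad.tolist()


if torch is not None:
    print(grad_via_backward([1.0, 2.0, 3.0]))  # [2.0, 4.0, 6.0]
    print(grad_via_backward([1.0, 1.0, 1.0]))  # [2.0, 2.0, 2.0]
```

Note that in PyTorch, calling `backward()` with no argument on a non-scalar tensor raises an error, so the all-ones vector has to be passed explicitly to get the unweighted sum.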
===================================
If you spot any mistakes, please point them out in the comments; corrections are welcome.