The Mathematics of Backpropagation in Deep Learning

2024-01-07 17:27:32

====================================================================================
Assume:
X = (x1, x2, x3)
Y = 2*X
  = (2*x1, 2*x2, 2*x3) = (y1, y2, y3)
Written out per component, each output is a function of all three inputs:
y1 = f1(x1,x2,x3) = 2*x1
y2 = f2(x1,x2,x3) = 2*x2
y3 = f3(x1,x2,x3) = 2*x3
------------------------------------------------------------------------------------

1 Compute the Jacobian matrix of Y with respect to X

\begin{equation*}
J=
\begin{pmatrix}
\dfrac{\partial y_1}{\partial x_1}&\dfrac{\partial y_1}{\partial x_2}&\dfrac{\partial y_1}{\partial x_3} \\[2.5ex]
\dfrac{\partial y_2}{\partial x_1}&\dfrac{\partial y_2}{\partial x_2}&\dfrac{\partial y_2}{\partial x_3} \\[2.5ex]
\dfrac{\partial y_3}{\partial x_1}&\dfrac{\partial y_3}{\partial x_2}&\dfrac{\partial y_3}{\partial x_3} \\[1.5ex]
\end{pmatrix}
\end{equation*}
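As a quick sanity check, the Jacobian of Y = 2*X can be approximated numerically. The sketch below uses central differences in plain Python; the helper name `numerical_jacobian`, the step size `eps`, and the sample point (1, 2, 3) are illustrative assumptions, not part of the original post.

```python
def f(x):
    """The map from the text: Y = 2 * X, applied componentwise."""
    return [2.0 * xi for xi in x]


def numerical_jacobian(func, x, eps=1e-6):
    """Approximate J[j][i] = d y_j / d x_i with central differences."""
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps   # nudge only the i-th input up
        x_minus[i] -= eps  # and down
        y_plus = func(x_plus)
        y_minus = func(x_minus)
        for j in range(n_out):
            jac[j][i] = (y_plus[j] - y_minus[j]) / (2 * eps)
    return jac


J = numerical_jacobian(f, [1.0, 2.0, 3.0])
for row in J:
    print([round(entry, 6) for entry in row])
```

Up to floating-point error, the result has 2 on the diagonal and 0 elsewhere, because each y_j depends only on x_j.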

2 Sum the partial derivatives for each component / accumulate the v-weighted projections along each direction

\begin{equation*}
\begin{aligned}
\dfrac{dY}{dx_1} &= \dfrac{\partial y_1}{\partial x_1}+\dfrac{\partial y_2}{\partial x_1}+\dfrac{\partial y_3}{\partial x_1}\\[2.5ex]
\dfrac{dY}{dx_2} &= \dfrac{\partial y_1}{\partial x_2}+\dfrac{\partial y_2}{\partial x_2}+\dfrac{\partial y_3}{\partial x_2}\\[2.5ex]
\dfrac{dY}{dx_3} &= \dfrac{\partial y_1}{\partial x_3}+\dfrac{\partial y_2}{\partial x_3}+\dfrac{\partial y_3}{\partial x_3}
\end{aligned}
\end{equation*}
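For the example Y = 2*X, each y_j depends only on x_j, so the partial derivative of y_j with respect to x_i is 2 when i = j and 0 otherwise. Each sum therefore has exactly one nonzero term:

\begin{equation*}
\begin{aligned}
\dfrac{dY}{dx_1} &= 2 + 0 + 0 = 2\\[1.5ex]
\dfrac{dY}{dx_2} &= 0 + 2 + 0 = 2\\[1.5ex]
\dfrac{dY}{dx_3} &= 0 + 0 + 2 = 2
\end{aligned}
\end{equation*}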

3 Assemble the final gradient expression from the per-component sums

\begin{equation*}
\dfrac{dY}{dX}=
\begin{pmatrix}
\dfrac{dY}{dx_1}&\dfrac{dY}{dx_2}&\dfrac{dY}{dx_3} \\
\end{pmatrix}
\end{equation*}
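Steps 2 and 3 amount to summing each column of the Jacobian and collecting the sums into one row vector. A minimal plain-Python sketch, with the Jacobian of Y = 2*X written out by hand (the variable names are illustrative):

```python
# Jacobian of Y = 2 * X, written out by hand: J[j][i] = d y_j / d x_i
J = [
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

# Step 2: for each input x_i, add up column i (contributions of y_1, y_2, y_3).
# Step 3: collect the three sums into one row vector.
grad = [sum(J[j][i] for j in range(3)) for i in range(3)]
print(grad)  # [2.0, 2.0, 2.0]
```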

4 y.backward(v): how the argument v enters the computation

If v = (m, n, q), each partial derivative is multiplied by the matching component of v (its projection weight) before the terms are summed.
For example, v = (1, 2, 3) means that the contributions of y1, y2, y3 are weighted by 1, 2, 3 respectively.
Take dY/dx1 as an example:
\begin{equation*}
\dfrac{dY}{dx_1}= 1*\dfrac{\partial y_1}{\partial x_1}+2*\dfrac{\partial y_2}{\partial x_1}+3*\dfrac{\partial y_3}{\partial x_1}
\end{equation*}
More generally, for each input x_i:
\begin{equation*}
\dfrac{dY}{dx_i}= m*\dfrac{\partial y_1}{\partial x_i}+n*\dfrac{\partial y_2}{\partial x_i}+q*\dfrac{\partial y_3}{\partial x_i}
\end{equation*}
In matrix form this is the product of the row vector v with the Jacobian J.
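The weighted sum can be written as a small vector-Jacobian product. The sketch below is plain Python with a hand-written Jacobian for Y = 2*X; the function name `vector_jacobian_product` is a hypothetical helper introduced only for illustration.

```python
def vector_jacobian_product(v, jac):
    """Return v^T J: entry i is sum over j of v[j] * jac[j][i]."""
    n_in = len(jac[0])
    return [sum(v[j] * jac[j][i] for j in range(len(v))) for i in range(n_in)]


# Jacobian of Y = 2 * X: J[j][i] = d y_j / d x_i
J = [
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

weighted = vector_jacobian_product([1.0, 2.0, 3.0], J)
print(weighted)  # [2.0, 4.0, 6.0]
```

With v = (1, 2, 3), the weight on y_j scales the j-th row of J, so the gradient with respect to x_j becomes 2 times the j-th weight.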

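So when v = (1,1,1), every weight is 1 and the weighted sums reduce to the plain sums of step 2: the projection makes no difference. Note that in PyTorch, y.backward() with no argument is only allowed when y is a scalar; for a vector y such as the one here, v has to be passed explicitly.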
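The same behaviour can be checked in PyTorch. This is a minimal sketch that assumes `torch` is installed and uses the sample input (1, 2, 3), which is not from the original post. It compares v = (1, 2, 3), v = (1, 1, 1), and calling backward() on the scalar sum of the outputs.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Weighted case: v = (1, 2, 3) scales the contributions of y_1, y_2, y_3.
y = 2 * x
y.backward(torch.tensor([1.0, 2.0, 3.0]))
grad_weighted = x.grad.clone()
print(grad_weighted)  # tensor([2., 4., 6.])

# v = (1, 1, 1): every output contributes with weight 1.
x.grad = None  # reset, because gradients accumulate across backward() calls
y = 2 * x
y.backward(torch.ones(3))
grad_ones = x.grad.clone()

# Equivalent scalar formulation: sum the outputs, then call backward() with no v.
x.grad = None
(2 * x).sum().backward()
grad_sum = x.grad.clone()

print(grad_ones, grad_sum)  # both tensor([2., 2., 2.])
```

Resetting `x.grad` between calls matters here, since PyTorch adds each new gradient to the one already stored.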

===================================

If anything here is wrong, please point it out in the comments; corrections are welcome.

Source: https://blog.csdn.net/qq_43401942/article/details/135372295