Transformer-MM-Explainability

2024-01-07 18:06:58

[Figure: the bi-modal Transformer architecture. The two modalities are separated by the [SEP] token; the number in each attention module refers to the corresponding equation number.]
The gradient-weighted attention map is $\bar{A} = \mathbb{E}_h\big[(\nabla A \odot A)^+\big]$, where $\mathbb{E}_h$ is the mean across attention heads, $\nabla A := \frac{\partial y_t}{\partial A}$ for $y_t$, the model's output for the target class $t$, $\odot$ is the Hadamard product, and $(\cdot)^+$ removes the negative contributions before averaging.
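A minimal PyTorch sketch of this rule, assuming the attention map and its gradient have already been captured (e.g. via hooks during a backward pass); the function name and tensor shapes are hypothetical:

```python
import torch

def gradient_weighted_attention(attn: torch.Tensor,
                                attn_grad: torch.Tensor) -> torch.Tensor:
    """Compute A_bar = E_h[(grad_A ⊙ A)^+] for one attention layer.

    attn:      attention map A, shape (num_heads, seq, seq)
    attn_grad: dy_t/dA, gradient of the target output y_t w.r.t. A,
               same shape as attn
    """
    # Hadamard product of gradient and attention map,
    # clamping to zero to remove negative contributions
    weighted = (attn_grad * attn).clamp(min=0)
    # Mean across the head dimension
    return weighted.mean(dim=0)
```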
[Figure: the relevancy update rules.]

The aggregated self-attention is captured by the matrix $R^{qq}$; the previous layers' mixture of context is embodied by $R^{qk}$.
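In the paper, $R^{qq}$ is initialized to the identity and $R^{qk}$ to zeros, and each self-attention layer updates both maps with its gradient-weighted attention $\bar{A}$. A minimal sketch of the self-attention update rules, under those assumptions (the function name is illustrative, not the repo's API):

```python
import torch

def apply_self_attention_rules(R_qq: torch.Tensor,
                               R_qk: torch.Tensor,
                               A_bar: torch.Tensor):
    """Propagate relevancy through one self-attention layer.

    R_qq:  aggregated self-attention relevancy, initialized to the identity
    R_qk:  bi-modal relevancy (context mixed in from the other modality),
           initialized to zeros
    A_bar: gradient-weighted attention map of this layer
    """
    R_qk = R_qk + torch.matmul(A_bar, R_qk)  # carry mixed context forward
    R_qq = R_qq + torch.matmul(A_bar, R_qq)  # accumulate self relevancy
    return R_qq, R_qk
```

Iterating this update over all self-attention layers of one modality yields the aggregated $R^{qq}$ used for the final relevance map.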

Thoughts

The author's experiments are run on the COCO and ImageNet validation sets, which makes them hard to follow.

Source: https://blog.csdn.net/qq_46221910/article/details/135433320