Expose Attention Weights from MultiHeadDotProductAttention #3496
-
Hi, I am wondering whether it is possible to support returning the attention weights computed by `MultiHeadDotProductAttention`. I understand that there is already a function that computes these weights internally. The way I am thinking of to obtain them is to let `MultiHeadDotProductAttention` sow the weights into the `intermediates` collection. I am happy to create a PR if the above sounds reasonable to you. Arguably, the attention weights are one of the most important quantities worth investigating, and they have been quite useful in my personal experience across various projects. Any suggestions would be much appreciated. Thanks a lot in advance.
Replies: 1 comment
-
This feature has been added in #3529. Pass `return_weights=True` to `MultiHeadDotProductAttention`'s `__call__` method to sow the attention weights into the `intermediates` collection.
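
For reference, a minimal sketch of how this could be used with Flax linen, assuming the `return_weights` keyword behaves as described in the reply above. The exact keyword argument and the key under which the weights are sowed may differ in the released API, and the input shapes are just toy values:

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

# Toy inputs: batch of 2 sequences, length 4, feature dim 8.
x = jnp.ones((2, 4, 8))

# Standard Flax multi-head attention layer (self-attention: keys and
# values default to the query input).
attn = nn.MultiHeadDotProductAttention(num_heads=2)
variables = attn.init(jax.random.PRNGKey(0), x)

# Per the reply above, return_weights=True sows the attention weights into
# the 'intermediates' collection, so that collection must be marked mutable
# to capture it during apply().
y, state = attn.apply(
    variables, x, return_weights=True, mutable=['intermediates'])

# The sowed weights live in state['intermediates']; the exact key name is
# an assumption here and may differ in the actual implementation.
print(jax.tree_util.tree_map(jnp.shape, state['intermediates']))
```

Because sowing writes into a mutable collection rather than changing the return value, the layer's output `y` stays unchanged and the weights are retrieved separately from the returned `state`.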