Regarding Attention Heatmap #5

Open
rbsohee opened this issue Jun 3, 2024 · 1 comment

Comments

rbsohee commented Jun 3, 2024

Hi, thank you for the interesting work :)

I was wondering how the "attention heatmap" in the paper was drawn.
If I have understood your method correctly, learnable parameters are added only to the "Video Q-former", which cross-attends to the 32 x T queries generated by the frozen "Visual Q-former". The 32 visual queries attend to different regions of each frame, but since they are frozen, their attention should not have changed.

It would really help if you could share the code/method you used to visualize the attention map.

rxtan2 (Owner) commented Jun 11, 2024

Hi rbsohee, thank you very much for your interest in our work! I apologize for the delay, which was due to some deadlines. We use a simplified method similar to attention rollout to extract the attention weights from the Video Q-former. The 32 visual queries are frozen. However, we append learnable queries, which interact with the visual queries through the self-attention layers. This changes the representations of the queries, which in turn changes the attention weights. Additionally, due to the complexity of the model, we previously used a simplified version and are now evaluating new ways to extract such attention maps. We are cleaning up the script and code component that extract the attention maps and will release them publicly once they are cleaned and tested.
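
For readers unfamiliar with the technique mentioned above, here is a minimal sketch of generic attention rollout in NumPy. It is not the authors' script, which has not been released in this thread; the function name `attention_rollout`, the 0.5 residual weighting, and the input layout (one `(heads, tokens, tokens)` array per layer, rows summing to 1) are assumptions made for illustration. The idea is to average the heads of each layer, mix in the identity to account for residual connections, and multiply the layer matrices together so that the result estimates how much each output token draws on each input token.

```python
import numpy as np


def attention_rollout(attentions, add_residual=True):
    """Generic attention rollout over a stack of self-attention layers.

    attentions: list of arrays, one per layer, each of shape
        (heads, tokens, tokens) with rows summing to 1 (assumed layout).
    Returns a (tokens, tokens) matrix whose row i estimates how much
    output token i depends on each input token.
    """
    num_tokens = attentions[0].shape[-1]
    identity = np.eye(num_tokens)
    rollout = identity.copy()
    for layer_attn in attentions:
        # Average over heads to get one matrix per layer.
        attn = layer_attn.mean(axis=0)
        if add_residual:
            # Assumed equal weighting of attention and the skip connection.
            attn = 0.5 * attn + 0.5 * identity
        # Re-normalize so every row is still a distribution.
        attn = attn / attn.sum(axis=-1, keepdims=True)
        # Compose with the rollout of all earlier layers.
        rollout = attn @ rollout
    return rollout


if __name__ == "__main__":
    # Toy input: 3 layers, 4 heads, 6 tokens, random softmax attention.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(3, 4, 6, 6))
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    print(attention_rollout(list(weights)).round(3))
```

In the setting discussed here, the token axis would presumably cover the concatenated frozen visual queries and appended learnable queries, so the rows for the learnable queries indicate which visual queries they rely on. How those scores are projected back onto image regions to draw a heatmap is not described in this thread and is left out of the sketch.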
