Hello, why does simply swapping the positions of ws1 and hh, and of ws2 and ww, in the model implementation code produce the shuffle operation? Inside the parentheses, (A×B) and (B×A) should make no difference, should they? I look forward to your answer!
The code in question is:
q, k, v = rearrange(qkv, 'b (qkv h d) (ws1 hh) (ws2 ww) -> qkv (b hh ww) h (ws1 ws2) d', h=self.num_heads, qkv=3, ws1=self.ws, ws2=self.ws)
q, k, v = rearrange(qkv, 'b (qkv h d) (hh ws1) (ww ws2) -> qkv (b hh ww) h (ws1 ws2) d', h=self.num_heads, qkv=3, ws1=self.ws, ws2=self.ws)
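The total length of a grouped axis is indeed the same for (A×B) and (B×A), but the way each position along that axis is split into two indices is not. In einops, the first name inside a parenthesized group is the slow-varying (major) index, so `(ws1 hh)` reads position `i` as `ws1 * HH + hh`, while `(hh ws1)` reads it as `hh * WS + ws1`. The sketch below reproduces that index arithmetic in plain Python for a single axis. The helper names `strided_groups` and `contiguous_groups` are hypothetical and are not part of the repository or of einops. It lists which positions end up in the same window, i.e. share the same `hh`.

```python
def strided_groups(length, ws):
    """Windows for the pattern '(ws1 hh)': position i = ws1 * hh_count + hh.

    Positions that share the same hh are hh_count apart, so each window
    gathers tokens spread across the whole axis.
    """
    hh_count = length // ws
    return [[w * hh_count + g for w in range(ws)] for g in range(hh_count)]


def contiguous_groups(length, ws):
    """Windows for the pattern '(hh ws1)': position i = hh * ws + ws1.

    Positions that share the same hh are adjacent, so each window is an
    ordinary local block.
    """
    hh_count = length // ws
    return [[g * ws + w for w in range(ws)] for g in range(hh_count)]


if __name__ == "__main__":
    # An axis of 8 positions with window size 4 gives 2 windows either way.
    print(strided_groups(8, 4))     # [[0, 2, 4, 6], [1, 3, 5, 7]]
    print(contiguous_groups(8, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Both patterns produce windows of the same size, but `(ws1 hh)` fills each window with positions sampled at a stride across the axis, whereas `(hh ws1)` fills it with neighboring positions. Applied to both the height and width axes, the strided variant presumably gives the spatial shuffle that the question asks about.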