Need better way to organize the estimations.
liuliu committed Dec 11, 2024
1 parent e40d046 commit 3bff50d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion lib/nnc/mfa/v2/AttentionDescriptor.cpp
@@ -454,7 +454,7 @@ std::vector<AttentionParameterRow> AttentionDescriptor::forwardMixed(MTL::Device
   if (device->supportsFamily(MTL::GPUFamily(1009))) {
     return {
       AttentionParameterRow(32, 16, 128, 16, { AttentionOperand::Q, AttentionOperand::O }),
-      AttentionParameterRow(96, 16, 128, 32, { AttentionOperand::Q, AttentionOperand::O }),
+      AttentionParameterRow(64, 16, 128, 32, { AttentionOperand::Q, AttentionOperand::O }),
       AttentionParameterRow(160, 32, 128, 32, { AttentionOperand::O }),
       AttentionParameterRow(224, 32, 128, 32, { AttentionOperand::Q }),
       AttentionParameterRow(384, 32, 128, 32, {})
