LM FT OFT #495

Open
junxnone opened this issue Jan 16, 2025 · 0 comments

junxnone commented Jan 16, 2025

OFT/BOFT

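  • OFT (Orthogonal Fine-Tuning) fine-tunes efficiently by applying a multiplicative transform to the pretrained weight matrix using an orthogonal matrix. The pretrained weights stay frozen, only the orthogonal matrix adapts to the new data, and the final weights are the product of the two.
  • BOFT generalizes OFT: it uses a butterfly factorization to further improve parameter efficiency and fine-tuning flexibility. OFT can be viewed as a special case of BOFT.
  • Hyperspherical Energy: defined as the sum of the hyperspherical similarity (e.g. cosine similarity) over all pairs of neurons in the same layer. It reflects how uniformly the neurons are distributed on the unit hypersphere.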
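A minimal sketch of the hyperspherical energy described above, using the cosine-similarity variant mentioned in the note. It assumes neurons are stored as the columns of the weight matrix; the function name `hyperspherical_energy` and the two toy matrices are illustrative, not taken from any library. Other similarity kernels could be substituted for the cosine.

```python
import numpy as np

def hyperspherical_energy(W: np.ndarray) -> float:
    """Sum of pairwise cosine similarities between neurons (columns of W)."""
    # Project every neuron onto the unit hypersphere.
    unit = W / np.linalg.norm(W, axis=0, keepdims=True)
    cos = unit.T @ unit                       # (n, n) cosine-similarity matrix
    iu = np.triu_indices(cos.shape[0], k=1)   # count each unordered pair once
    return float(cos[iu].sum())

# Assumption: two toy layers with two neurons each (columns).
spread = np.array([[1.0, 0.0],
                   [0.0, 1.0]])    # orthogonal neurons
clumped = np.array([[1.0, 1.0],
                    [0.0, 0.1]])   # nearly parallel neurons

he_spread = hyperspherical_energy(spread)
he_clumped = hyperspherical_energy(clumped)
```

With this variant, the nearly parallel pair has a higher energy than the orthogonal pair, so lower energy corresponds to neurons that are more spread out on the sphere.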

Arch

OFT & COFT

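  • For a pretrained fully connected layer $W^{0}$, OFT looks for an orthogonal matrix $R$ such that the fine-tuned weight matrix $W = R\cdot W^{0}$ satisfies $\left| HE(W)-HE\left(W^{0}\right)\right| = 0$, where $HE(\cdot)$ denotes the hyperspherical energy of a weight matrix.
  • In the implementation, the orthogonal matrix $R$ is initialized to the identity matrix, so that fine-tuning starts from the pretrained weights.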
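The two points above can be sketched as follows. The note does not say how $R$ is kept orthogonal; this sketch assumes a Cayley parametrization $R = (I + S)(I - S)^{-1}$ with skew-symmetric $S$, which is one common choice. The helper names `cayley` and `pairwise_cosines` are hypothetical, neurons are assumed to be the columns of $W^{0}$, and a random matrix stands in for a learned update.

```python
import numpy as np

def cayley(S_raw: np.ndarray) -> np.ndarray:
    """Map an unconstrained square matrix to an orthogonal one (assumed parametrization)."""
    S = 0.5 * (S_raw - S_raw.T)              # keep only the skew-symmetric part
    I = np.eye(S.shape[0])
    return (I + S) @ np.linalg.inv(I - S)    # Cayley transform

def pairwise_cosines(W: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of neurons (columns of W)."""
    unit = W / np.linalg.norm(W, axis=0, keepdims=True)
    return unit.T @ unit

rng = np.random.default_rng(0)
d, n = 6, 4
W0 = rng.normal(size=(d, n))                 # frozen pretrained weights

# Initialization: zero parameters give R = I, so W starts at W0.
R_init = cayley(np.zeros((d, d)))
W_init = R_init @ W0

# Stand-in for a trained state: R stays orthogonal for any parameters.
R = cayley(rng.normal(size=(d, d)))
W = R @ W0

# Pairwise angles between neurons are unchanged, so HE(W) == HE(W0).
cos_before = pairwise_cosines(W0)
cos_after = pairwise_cosines(W)
```

Because every pairwise cosine is preserved, any hyperspherical energy built from those similarities is identical for $W$ and $W^{0}$, which is the condition $\left| HE(W)-HE\left(W^{0}\right)\right| = 0$.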

image

BOFT

image
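A rough sketch of the butterfly idea, under simplifying assumptions: the dimension is a power of two, and each butterfly factor consists of 2×2 rotations acting on index pairs `(i, i + stride)`. The actual BOFT formulation may use larger blocks and a different parametrization; `butterfly_factor` and `butterfly_orthogonal` are hypothetical names used only for illustration.

```python
import numpy as np

def butterfly_factor(d: int, stride: int, angles: np.ndarray) -> np.ndarray:
    """Sparse orthogonal factor: rotate each pair (i, i + stride) by its own angle."""
    B = np.zeros((d, d))
    lower = [i for i in range(d) if (i // stride) % 2 == 0]  # lower index of each pair
    for theta, i in zip(angles, lower):
        j = i + stride
        c, s = np.cos(theta), np.sin(theta)
        B[i, i], B[i, j] = c, -s
        B[j, i], B[j, j] = s, c
    return B

def butterfly_orthogonal(d: int, angles: np.ndarray) -> np.ndarray:
    """Product of butterfly factors; angles has shape (num_factors, d // 2)."""
    R = np.eye(d)
    for k in range(angles.shape[0]):
        R = butterfly_factor(d, 2 ** k, angles[k]) @ R
    return R

d = 8
m = d.bit_length() - 1                       # log2(d) factors for a power-of-two d
rng = np.random.default_rng(0)
angles = rng.normal(size=(m, d // 2))

R = butterfly_orthogonal(d, angles)          # dense orthogonal matrix
R_identity = butterfly_orthogonal(d, np.zeros((m, d // 2)))
one_factor = butterfly_factor(d, 1, angles[0])
```

Each factor has only `2 * d` nonzero entries, yet the product of all `m` factors is a dense orthogonal matrix built from `m * d // 2` angles. Keeping a single factor leaves a block-diagonal orthogonal matrix, which is presumably the sense in which OFT is a special case of BOFT.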

Weight merging

image
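A small sketch of what merging could look like, assuming the $W = R\cdot W^{0}$ convention from the Arch section with neurons as columns, so that a layer computes `x @ W`. A random orthogonal matrix from a QR decomposition stands in for a learned $R$; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 6, 4
W0 = rng.normal(size=(d, n))                   # frozen pretrained weights
R, _ = np.linalg.qr(rng.normal(size=(d, d)))   # stand-in for a learned orthogonal R
x = rng.normal(size=(3, d))                    # batch of 3 inputs

# During fine-tuning: R and W0 are kept separate (two matmuls per forward pass).
y_unmerged = (x @ R) @ W0

# After fine-tuning: fold R into the weights once and drop it.
W_merged = R @ W0
y_merged = x @ W_merged
```

The merged matrix has the same shape as `W0`, so under these assumptions the fine-tuned layer costs the same at inference time as the original one.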

