Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LM FT AdaLoRA #491

Open
junxnone opened this issue Jan 13, 2025 · 0 comments
Open

LM FT AdaLoRA #491

junxnone opened this issue Jan 13, 2025 · 0 comments

Comments

@junxnone
Copy link
Owner

junxnone commented Jan 13, 2025

AdaLoRA

  • 引入了自适应机制, 自适应调整秩,能依据训练动态变化优化模型。
  • 训练前期数据特征复杂,提高秩让模型学习更多信息;训练后期降低秩简化模型,防止过拟合,提升训练效率和模型性能。
  • 因需动态调整秩,增加额外计算开销和时间消耗。但在大规模数据和复杂任务中,虽训练时间增加,却能得到性能更好的模型
  • 通过动态调整秩,能在不同训练阶段优化模型结构,捕捉数据复杂特征,在复杂任务和数据分布变化场景中,模型性能和泛化能力更优。

原理

  • 将预训练权重矩阵的增量更新参数化为奇异值分解的形式: $W = W^{(0)} + \Delta = W^{(0)} + P\Lambda Q$
    • $(P \in \mathbb{R}^{d_1×r})$$(Q \in \mathbb{R}^{r×d_2})$ 分别表示 $(\Delta)$ 的左/右奇异向量
    • 对角矩阵 $(\Lambda \in {R}^{r×r})$ 包含奇异值 $\{{\lambda_i}\}_{1≤i<r}$
    • $r \ll \min(d_1, d_2)$
      提出新的重要性度量指标,结合奇异值和向量来计算每个三元组(包含第个奇异值和对应向量)的重要性得分。

Reference

@junxnone junxnone changed the title Hot LM Tuning AdaLoRA LM FT AdaLoRA Jan 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant