Generalized ia3 #81

Open
wants to merge 18 commits into
base: main

IanMagnusson
Contributor

@IanMagnusson IanMagnusson commented Sep 14, 2022

What's Here

Moves a more generalized IA3 adaptor implementation to Tango (PR pending) and provides an example script for how to use it in Catwalk.
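For readers unfamiliar with IA3, here is a minimal sketch of the core idea in plain Python: the pretrained weights stay frozen, and a small learned vector rescales a layer's output elementwise. This is illustrative only and is not the Tango or Catwalk API; `linear`, `ia3_linear`, and all the values below are hypothetical names and numbers chosen for the example.

```python
# Illustrative sketch of IA3-style rescaling; NOT the Tango/Catwalk implementation.

def linear(x, weight):
    """Frozen linear layer: y[j] = sum_i x[i] * weight[i][j]."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weight))
        for j in range(len(weight[0]))
    ]


def ia3_linear(x, weight, scale):
    """Apply the frozen layer, then rescale each output feature by a learned factor."""
    return [s * y for s, y in zip(scale, linear(x, weight))]


frozen_weight = [[1.0, 2.0], [3.0, 4.0]]  # hypothetical weights, never updated
scale = [1.0, 1.0]  # the only trainable values; all ones leaves the layer unchanged
x = [0.5, -1.0]

base = linear(x, frozen_weight)
adapted = ia3_linear(x, frozen_weight, scale)
rescaled = ia3_linear(x, frozen_weight, [2.0, 0.0])
```

With the scaling vector initialized to ones, the adapted layer reproduces the frozen layer exactly, so training starts from the pretrained model's behavior and only the scaling vectors receive gradient updates.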

Results on piqa

While the results are hardly impressive, the IA3 implementation reduces validation loss and recovers much of the accuracy of the fully fine-tuned equivalent for every architecture that has a default configuration. The gpt-j-6b full fine-tune cannot run on a single GPU, whereas IA3 training fits because its far fewer trainable parameters require far fewer optimizer states.
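The memory argument can be made concrete with a rough back-of-the-envelope estimate. The sketch below assumes an Adam-style optimizer that keeps two fp32 values per trainable parameter; the optimizer, the precision, and the IA3 parameter count are assumptions for illustration, not figures from this PR. Only the roughly 6 billion parameters of gpt-j-6b come from the model itself.

```python
# Rough estimate of optimizer-state memory; all settings here are assumptions.

def adam_state_bytes(trainable_params, bytes_per_value=4):
    """Adam keeps two moment estimates (first and second) per trainable parameter."""
    return 2 * trainable_params * bytes_per_value


full_tune_params = 6_000_000_000  # approximate size of gpt-j-6b
ia3_params = 1_000_000            # hypothetical count of IA3 scaling parameters

full_tune_bytes = adam_state_bytes(full_tune_params)
ia3_bytes = adam_state_bytes(ia3_params)

print(f"full fine-tune optimizer state: {full_tune_bytes / 1e9:.0f} GB")
print(f"IA3 optimizer state:            {ia3_bytes / 1e6:.0f} MB")
```

Under these assumptions the full fine-tune needs about 48 GB for optimizer state alone, before counting weights, gradients, and activations, while the IA3 optimizer state is on the order of megabytes.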

Screen Shot 2022-09-13 at 6 57 54 PM

@IanMagnusson IanMagnusson marked this pull request as ready for review September 16, 2022 22:28
@IanMagnusson IanMagnusson requested a review from dirkgr September 16, 2022 22:28