Add support for multiple param groups #1609

Merged: 9 commits, Mar 27, 2024


AlbinSou
Collaborator

@AlbinSou AlbinSou commented Mar 1, 2024

Hey, for now this is just a prototype of the kind of changes I want to make.

First of all, a quick reminder of why supporting multiple parameter groups is not as easy as it seems:

    1. In continual learning, it is common to add parameters to the model upon encountering new data. When the user adds parameters, it is not easy to know which parameter group they should be assigned to: the parameter group information lives inside the optimizer, whereas new parameters are added to the model with no indication of the group they belong to.
    2. Although the default way to change model parameters in Avalanche is to use DynamicModules, it is technically possible to add parameters to nn.Modules manually, which makes tracking new parameters even harder.
    3. To complicate things further, PyTorch optimizers store parameters in an inconvenient data structure: a list of parameters, with no direct link between each parameter and its name.
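To illustrate the third point, here is a minimal sketch of recovering a name-to-group mapping by object identity. It uses plain-Python stand-ins rather than PyTorch itself: `Param` plays the role of `torch.nn.Parameter`, `param_groups` mirrors the list-of-dicts layout of an optimizer's `param_groups`, and `group_index_by_name` is a hypothetical helper, not something from this PR.

```python
# Plain-Python stand-ins; all names here are illustrative assumptions.

class Param:
    """Placeholder for a parameter tensor; only its identity matters here."""


def group_index_by_name(named_params, param_groups):
    """Map each parameter name to the index of the group holding that object."""
    id_to_group = {
        id(p): idx
        for idx, group in enumerate(param_groups)
        for p in group["params"]
    }
    # A name maps to None when the optimizer does not know that parameter.
    return {name: id_to_group.get(id(p)) for name, p in named_params.items()}


named_params = {
    "features.weight": Param(),
    "features.bias": Param(),
    "classifier.weight": Param(),
}
# The optimizer side only sees anonymous lists of parameters per group.
param_groups = [
    {"params": [named_params["features.weight"], named_params["features.bias"]], "lr": 0.01},
    {"params": [named_params["classifier.weight"]], "lr": 0.1},
]

mapping = group_index_by_name(named_params, param_groups)

# A parameter added to the model later has no entry on the optimizer side.
named_params["classifier.bias"] = Param()
mapping_after = group_index_by_name(named_params, param_groups)
```

In this sketch the newly added `classifier.bias` maps to `None`: nothing on the optimizer side says which group it should join, which is the gap described above.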

How I plan to remedy that:

  • Ideally, we want native support for normal use of multiple parameter groups. If the user puts different parameter groups into the optimizer, they should not have to worry about them being broken. Right now that is not the case: we reset the optimizer at the start of training to set up the (name, param_id) mapping later used for updates, and an assertion fails if the optimizer has more than one param group, requiring the user to implement their own make_optimizer function (which is a pity, because using an existing method with different parameter groups then requires copying the strategy or creating a new plugin).

  • Since users can add parameters both through DynamicModules and manually, it's better to use a general method that works with any nn.Module object.

  • Provide an automatic way, based on the natural module hierarchy, to assign parameter groups to new parameters, leaving more complex logic (which should not be needed very often?) to custom implementations.

The idea is the following: we let the user define their optimizer with their own param groups. Then we go over the model's named_parameters() and store each name in a tree structure (a copy of the torch module tree). Each node (not necessarily a parameter; a parameter is a leaf node) contains the set of parameter groups that its child nodes belong to. When a new node (a new parameter) is added, its group is inferred from the group of its parent node. If the parent node has multiple groups, the new parameter's group cannot be inferred, and an error is raised asking the user to either change the module creation process or implement their own optimizer update logic.
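A rough sketch of that tree, in plain Python without torch, could look like the following. `Node`, `GroupTree` and `infer_group` are hypothetical names for illustration only, and falling back to the deepest known ancestor when the direct parent does not exist yet is an assumption of this sketch, not necessarily what the final implementation does.

```python
class Node:
    """One node per dotted-name component (module or parameter)."""

    def __init__(self):
        self.children = {}
        self.groups = set()  # group indices found at or below this node


class GroupTree:
    """Tree of parameter names, built from a {name: group_index} mapping."""

    def __init__(self, name_to_group):
        self.root = Node()
        for name, group in name_to_group.items():
            self.add(name, group)

    def add(self, name, group):
        # Record the group on every node along the path, root included.
        node = self.root
        node.groups.add(group)
        for part in name.split("."):
            node = node.children.setdefault(part, Node())
            node.groups.add(group)

    def infer_group(self, name):
        """Return the single group of the deepest known ancestor of `name`."""
        node = self.root
        for part in name.split(".")[:-1]:
            if part not in node.children:
                break
            node = node.children[part]
        if len(node.groups) != 1:
            raise ValueError(
                f"Cannot infer a parameter group for {name!r}: "
                f"its closest known ancestor spans groups {sorted(node.groups)}"
            )
        return next(iter(node.groups))


tree = GroupTree({
    "features.weight": 0,
    "features.bias": 0,
    "classifier.heads.0.weight": 1,
})

# A new head under `classifier.heads` inherits the group of its siblings.
inferred = tree.infer_group("classifier.heads.1.weight")

# A new top-level module is ambiguous: the root spans both groups.
try:
    tree.infer_group("decoder.weight")
    ambiguous = False
except ValueError:
    ambiguous = True
```

Here the new head lands in group 1 because everything under `classifier.heads` belongs to that group, while `decoder.weight` raises because its only known ancestor is the root, which covers both groups.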

@AntonioCarta For now I just provide a playground in test/test_optimizer.py that you can run as-is to see how it will work; I will implement the actual thing later. Let me know first whether you agree with the idea.

@AlbinSou AlbinSou requested a review from AntonioCarta March 1, 2024 14:23
@coveralls

coveralls commented Mar 1, 2024

Pull Request Test Coverage Report for Build 8426226675

Details

  • 169 of 381 (44.36%) changed or added relevant lines in 3 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.06%) to 51.751%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| avalanche/models/dynamic_optimizers.py | 119 | 135 | 88.15% |
| tests/test_optimizer.py | 48 | 244 | 19.67% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| avalanche/models/dynamic_optimizers.py | 1 | 88.46% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 8098020118 | -0.06% |
| Covered Lines | 14863 |
| Relevant Lines | 28720 |

💛 - Coveralls

@AlbinSou
Collaborator Author

AlbinSou commented Mar 6, 2024

Commenting about this issue here so that I can keep track of it:

Right now the update_optimizer util works as follows: it takes a {name: param} dictionary of previously optimized parameters, as well as one for the new parameters. This works fine if we assume that a given parameter never changes name, but that can happen, for instance when a model is wrapped inside another module while keeping the same parameters.

When parameters change name, the current update_optimizer treats the renamed parameters as new parameters and resets their state.

The OptimizedParameterStructure handles this well because it identifies the parameters inside the current optimizer and matches each one to the name that corresponds to the same parameter id.

Maybe I could abandon the optimized_param_id dict and switch to using only OptimizedParameterStructure. It requires some more work, but I think it could work much better and would also remove the need to store the previously optimized params in the strategy, as is currently done.
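As a minimal sketch of the matching idea (not the actual OptimizedParameterStructure code), renamed parameters can be told apart from new ones by comparing object identity instead of names. `Param` is a plain-Python stand-in for `torch.nn.Parameter`, and `match_by_identity` is a hypothetical helper.

```python
class Param:
    """Stand-in for torch.nn.Parameter; only object identity is used."""


def match_by_identity(old_named, new_named):
    """Split new names into renamed and genuinely new ones, by object identity."""
    old_name_of = {id(p): name for name, p in old_named.items()}
    renamed, added = {}, []
    for new_name, p in new_named.items():
        old_name = old_name_of.get(id(p))
        if old_name is None:
            added.append(new_name)          # object never seen before
        elif old_name != new_name:
            renamed[old_name] = new_name    # same object, different name
    return renamed, added


w, b = Param(), Param()
old_named = {"linear.weight": w, "linear.bias": b}

# Wrapping the model in another module prefixes every name but keeps the objects.
new_named = {
    "wrapped.linear.weight": w,
    "wrapped.linear.bias": b,
    "wrapped.head.weight": Param(),
}

renamed, added = match_by_identity(old_named, new_named)
```

With a name-based comparison all three entries of `new_named` would look new; with identity matching only `wrapped.head.weight` does, so the state of the two wrapped parameters would not need to be reset.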

@AlbinSou
Collaborator Author

@AntonioCarta I added a test in which I rename a parameter and also add one. This is the typical case where it should not work. But I noticed an additional problem: when a parameter is both renamed and expanded, it can silently absorb the parameter group of other parameters (because in that case the expanded parameter is considered a new parameter). This is quite an edge case, but it could be problematic, because I really don't know how to even emit a warning for it: when both the name and the parameter id change, there is no way to tell that this happened.

@AntonioCarta
Collaborator

> When some parameter is both renamed and expanded

Usually expanded parameters are not renamed, right?

I don't think we can really fix this (there is no correct behavior here), but maybe we can make these issues easier to debug. For example, can we add a verbose mode that prints the parameters which have been adapted/expanded/renamed? Would this make it easier to find potential bugs?
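One possible shape for such a verbose report, sketched with plain Python objects in place of real parameters: `describe_changes` is a hypothetical helper and the line format is only an assumption. It also shows how the renamed-and-expanded case would surface, namely as one removal plus one unrelated-looking addition.

```python
def describe_changes(old_named, new_named):
    """Return one human-readable line per parameter, compared by identity."""
    old_name_of = {id(p): name for name, p in old_named.items()}
    new_ids = {id(p) for p in new_named.values()}
    lines = []
    for name in sorted(new_named):
        old_name = old_name_of.get(id(new_named[name]))
        if old_name is None:
            lines.append(f"new: {name}")
        elif old_name != name:
            lines.append(f"renamed: {old_name} -> {name}")
        else:
            lines.append(f"kept: {name}")
    # Old objects that no longer appear anywhere in the model.
    for name in sorted(old_named):
        if id(old_named[name]) not in new_ids:
            lines.append(f"removed: {name}")
    return lines


body, head_old, head_expanded = object(), object(), object()
old_named = {"body.weight": body, "head.weight": head_old}

# The head is renamed AND replaced by a larger tensor in the same step,
# so neither its name nor its identity survives.
new_named = {"body.weight": body, "classifier.weight": head_expanded}

report = describe_changes(old_named, new_named)
for line in report:
    print(line)
```

The report cannot say that `classifier.weight` used to be `head.weight`, but seeing a removal and an addition side by side at least gives the user a hint about where to look.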

@AlbinSou
Collaborator Author

> Usually expanded parameters are not renamed, right?

No, usually they are not renamed, but if you rename them at the same time as they are expanded, it can be a problem. It should be fixed, however, if you call strategy.make_optimizer() manually after the renaming and before the parameter expansion.

@AlbinSou AlbinSou marked this pull request as ready for review March 21, 2024 13:58
@AlbinSou
Collaborator Author

> FYI, the example is failing right now:
>
> ValueError: This function only supports single parameter groups.If you need to use multiple parameter groups, you can override `make_optimizer` in the Avalanche strategy.
>
> Maybe you didn't change the templates to not use the old reset_optimizer?

I tried it and it works inside this pull request

@AntonioCarta
Collaborator

> I tried it and it works inside this pull request

yes, it was an issue in my env. Now it's working.

@AlbinSou
Collaborator Author

@AntonioCarta I think it's ready to merge. I ran a few checks on the continual learning baselines to verify that I obtain the same results with the old and new implementations. I also added a test where the user manually adds some parameters to the parameter group of their choice, and it works fine.

@AntonioCarta
Collaborator

Thank you! Everything looks good now.

@AntonioCarta AntonioCarta merged commit 6e5e3b2 into ContinualAI:master Mar 27, 2024
10 of 12 checks passed