LearningRateFinder not working with CLI optimizers #16787

rusmux opened this issue Feb 16, 2023 · 2 comments

bug Something isn't working tuner


rusmux commented Feb 16, 2023

Bug description

LearningRateFinder does not update the optimizer's learning rate when the optimizer is defined via the CLI or a yaml config file.

For example, I define in train.yaml:

optimizer:
  class_path: torch.optim.AdamW
  init_args:
    lr: 1.5e-3

And I set the callback:


At the start, it finds the best learning rate:

Screenshot 82

But after that, it still uses the learning rate I provided:

Screenshot 83

I also tried doing it manually, like this:

Screenshot 84

But I had the same result.

How to reproduce the bug

Define an optimizer in a yaml config file. Add the `LearningRateFinder` callback.

Error messages and logs



More info

I think the problem is specific to how and when optimizers and schedulers are instantiated. I ran the same setup with the batch size finder instead of the learning rate finder, and it worked as expected:

Screenshot 85

It used the found batch size in training.

For now, as far as I understand, the only way to use LearningRateFinder is to define configure_optimizers() manually in the LightningModule. But then I can't change the optimizer from the yaml config file.
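
One way to picture the difference described above is the toy sketch below. It uses plain-Python stand-ins, not Lightning's actual code. `ToyModule`, `optimizer_from_config`, and the value `1e-2` for the found learning rate are all hypothetical. The sketch assumes that a config-defined optimizer is built only from the arguments in the yaml file, while a hand-written configure_optimizers() reads the module attribute that the search updates.

```python
# Toy illustration only: plain-Python stand-ins, not Lightning's internals.

YAML_LR = 1.5e-3   # the lr written in train.yaml
FOUND_LR = 1e-2    # hypothetical value suggested by the LR search


class ToyModule:
    """Stand-in for a LightningModule that exposes a tunable `lr` attribute."""

    def __init__(self, lr: float) -> None:
        self.lr = lr

    def configure_optimizers(self) -> dict:
        # Hand-written hook: reads the attribute at the moment it is called.
        return {"name": "AdamW", "lr": self.lr}


def optimizer_from_config(init_args: dict) -> dict:
    # Assumed behaviour of a config-defined optimizer: it is built from the
    # config arguments alone and never looks at the module attribute.
    return {"name": "AdamW", **init_args}


module = ToyModule(lr=YAML_LR)
module.lr = FOUND_LR  # what an LR search would write back to the module

from_config = optimizer_from_config({"lr": YAML_LR})
from_module = module.configure_optimizers()

print("config-defined optimizer lr:", from_config["lr"])
print("module-defined optimizer lr:", from_module["lr"])
```

Under these assumptions, only the module-defined optimizer picks up the found value, which matches the behaviour reported here.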

weicao1990 commented Mar 8, 2023

Hi, I ran into the same issue. My solution is to add a before_fit method to your custom CLI class.

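from lightning.pytorch.tuner import Tuner  # import path assumes Lightning 2.x

def before_fit(self):
    # Run the LR search before fit starts, using the CLI's trainer, model and datamodule.
    tuner = Tuner(self.trainer)
    tuner.lr_find(self.model, datamodule=self.datamodule)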

This way, pl executes configure_optimizers after the optimal LR has been found. With the LearningRateFinder callback, by contrast, configure_optimizers is not executed again after the optimal LR is found.
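
The ordering argument above can be sketched with a small toy model. This is not Lightning code. `ToyOptimizer`, `ToyModule`, `toy_lr_find`, and the suggested value `1e-2` are hypothetical names and numbers used only to show why building the optimizer after the search matters.

```python
from dataclasses import dataclass


@dataclass
class ToyOptimizer:
    """Stand-in for a real optimizer; it only remembers its learning rate."""

    lr: float


class ToyModule:
    """Stand-in for a LightningModule with a tunable `lr` attribute."""

    def __init__(self, lr: float) -> None:
        self.lr = lr

    def configure_optimizers(self) -> ToyOptimizer:
        # The optimizer copies whatever `lr` holds at creation time.
        return ToyOptimizer(lr=self.lr)


def toy_lr_find(module: ToyModule, suggestion: float) -> None:
    """Pretend LR search: it just writes the suggested value onto the module."""
    module.lr = suggestion


# Order A: the optimizer is built first, then the search runs.
module_a = ToyModule(lr=1.5e-3)
optimizer_a = module_a.configure_optimizers()
toy_lr_find(module_a, suggestion=1e-2)

# Order B: the search runs first, then the optimizer is built.
module_b = ToyModule(lr=1.5e-3)
toy_lr_find(module_b, suggestion=1e-2)
optimizer_b = module_b.configure_optimizers()

print("order A optimizer lr:", optimizer_a.lr)
print("order B optimizer lr:", optimizer_b.lr)
```

In order A the optimizer keeps the original learning rate even though the module attribute has changed. Order B corresponds to running the search in before_fit, so the optimizer is created with the found value.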

