Add early stop module #301
…4Health into sa_early_stop
for more information, see https://pre-commit.ci
There are a few things we definitely want to make sure we think about carefully. Specifically, I just want to make sure we aren't going to have a lot of additional memory overhead with the way we're doing snapshotting.
Just left some high-level-ish comments. Still lacking a lot of context in this library :)
fl4health/utils/early_stopper.py
Outdated
self.best_score: float | None = None
self.snapshot_ckpt: dict[str, Any] = {}

self.default_snapshot_attrs: dict = {
Is `T` generic?
This looks good from my perspective, though I definitely don't have full context.
I added some commentary about `raise NotImplementedError` in an `abstractmethod` of a class that inherits from `ABC`. Hope it clears it up -- my bad for the confusion as I too was a bit confused :)
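For readers without the inline thread, here is a minimal sketch of the general Python behavior in question. It is not the code from this PR: `SnapshotterSketch`, `DictSnapshotter`, and `ForgetfulSnapshotter` are hypothetical names made up for illustration. It shows that inheriting from `ABC` already blocks instantiation of a class with unimplemented abstract methods, so a `raise NotImplementedError` body only fires if a subclass explicitly calls `super()`.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class SnapshotterSketch(ABC):
    """Hypothetical abstract base class, not part of the library."""

    @abstractmethod
    def save(self) -> dict[str, Any]:
        # Only reachable through an explicit super().save() call in a subclass.
        raise NotImplementedError


class DictSnapshotter(SnapshotterSketch):
    """Concrete subclass that overrides the abstract method."""

    def __init__(self, state: dict[str, Any]) -> None:
        self.state = state

    def save(self) -> dict[str, Any]:
        # Return a shallow copy of the held state.
        return dict(self.state)


class ForgetfulSnapshotter(SnapshotterSketch):
    """Overrides save() but delegates to the abstract body."""

    def save(self) -> dict[str, Any]:
        return super().save()
```

Calling `SnapshotterSketch()` raises `TypeError` before any method body runs, while `ForgetfulSnapshotter().save()` is the one case that reaches the `NotImplementedError`.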
All the changes you applied look great. There are a few refactors that I would suggest before merging. Mostly just some shifting of responsibility out of the BasicClient class 🙂
Good to merge from my perspective too!
PR Type
[Feature]
Short Description
Clickup Ticket(s): https://app.clickup.com/t/8688wzkuk , https://app.clickup.com/t/860qxm622
Integrated an early stopping module as a plug-in for all clients. After a specified number of training steps, the module computes the evaluation loss. If the loss improves compared to previous evaluations, it saves a snapshot of the model's key attributes, enabling the model to restore these attributes when the stopping criteria are met.
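The flow described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the module added in this PR: the class name `EarlyStopperSketch`, the `patience` and `interval_steps` parameters, the `should_stop` method, and the use of `copy.deepcopy` for snapshots are all assumptions. Only the attribute names `best_score` and `snapshot_ckpt` are taken from the diff excerpt.

```python
from __future__ import annotations

import copy
from typing import Any, Callable


class EarlyStopperSketch:
    """Hypothetical stopper: evaluate every `interval_steps`, snapshot on improvement."""

    def __init__(
        self,
        target: Any,
        snapshot_attrs: list[str],
        patience: int = 2,
        interval_steps: int = 5,
    ) -> None:
        self.target = target                  # object whose attributes are snapshotted
        self.snapshot_attrs = snapshot_attrs  # names of the attributes to save
        self.patience = patience
        self.interval_steps = interval_steps
        self.count_down = patience
        self.best_score: float | None = None
        self.snapshot_ckpt: dict[str, Any] = {}

    def save_snapshot(self) -> None:
        # Assumption: deep copies, so later training cannot mutate the snapshot.
        self.snapshot_ckpt = {
            name: copy.deepcopy(getattr(self.target, name))
            for name in self.snapshot_attrs
        }

    def load_snapshot(self) -> None:
        # Restore every saved attribute onto the target object.
        for name, value in self.snapshot_ckpt.items():
            setattr(self.target, name, copy.deepcopy(value))

    def should_stop(self, step: int, evaluate: Callable[[], float]) -> bool:
        # Only evaluate on multiples of the configured interval.
        if step % self.interval_steps != 0:
            return False
        loss = evaluate()
        if self.best_score is None or loss < self.best_score:
            # Improvement: record it, reset patience, keep a snapshot.
            self.best_score = loss
            self.count_down = self.patience
            self.save_snapshot()
            return False
        self.count_down -= 1
        if self.count_down <= 0:
            # Stopping criterion met: roll back to the best snapshot.
            self.load_snapshot()
            return True
        return False
```

In this sketch only the single best snapshot is retained, so the extra memory is one deep copy of the listed attributes at a time.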
Tests Added
Added a series of tests for snapshot modules to ensure they are saved and loaded correctly as intended.