Add early stop module #301

Merged
merged 31 commits into from
Jan 25, 2025

Collaborator

@sanaAyrml sanaAyrml commented Dec 5, 2024

PR Type

[Feature]

Short Description

Clickup Ticket(s): https://app.clickup.com/t/8688wzkuk , https://app.clickup.com/t/860qxm622

Integrated an early stopping module as a plug-in for all clients. After a specified number of training steps, the module computes the evaluation loss. If the loss improves on previous evaluations, the module saves a snapshot of the model's key attributes so that they can be restored when the stopping criteria are met.
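For readers skimming the thread, the behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ToyClient`, `EarlyStopperSketch`, `should_stop`, the `patience` counter used as the stopping criterion, and the loss values are all assumptions made for the example. Only `best_score` and `snapshot_ckpt` echo names visible in the diff.

```python
import copy
from typing import Any, Optional


class ToyClient:
    """Hypothetical stand-in for a client; only the attributes below are snapshotted."""

    def __init__(self) -> None:
        self.weights = [0.0, 0.0]
        self.total_steps = 0


class EarlyStopperSketch:
    """Hypothetical early stopper (not the fl4health API)."""

    def __init__(self, client: Any, attrs: list[str], patience: int) -> None:
        self.client = client
        self.attrs = attrs
        self.patience = patience  # assumed stopping criterion
        self.count_down = patience
        self.best_score: Optional[float] = None
        self.snapshot_ckpt: dict[str, Any] = {}

    def save_snapshot(self) -> None:
        # Overwrite the previous snapshot so only one extra copy is held in memory.
        self.snapshot_ckpt = {
            name: copy.deepcopy(getattr(self.client, name)) for name in self.attrs
        }

    def load_snapshot(self) -> None:
        # Write the saved values back onto the client.
        for name, value in self.snapshot_ckpt.items():
            setattr(self.client, name, copy.deepcopy(value))

    def should_stop(self, eval_loss: float) -> bool:
        if self.best_score is None or eval_loss < self.best_score:
            # Loss improved: remember it, reset patience, and snapshot.
            self.best_score = eval_loss
            self.count_down = self.patience
            self.save_snapshot()
            return False
        self.count_down -= 1
        if self.count_down <= 0:
            # Criterion met: restore the best snapshot before stopping.
            self.load_snapshot()
            return True
        return False


def run_demo() -> tuple[int, list[float]]:
    client = ToyClient()
    stopper = EarlyStopperSketch(client, attrs=["weights", "total_steps"], patience=2)
    interval_steps = 5  # evaluate after this many training steps
    for loss in [1.0, 0.8, 0.9, 0.95, 0.7]:  # made-up evaluation losses
        client.total_steps += interval_steps  # pretend to train
        client.weights = [loss, loss]
        if stopper.should_stop(loss):
            break
    return client.total_steps, client.weights
```

In this sketch the best loss (0.8) is seen at step 10, two evaluations without improvement follow, and the client is rolled back to the step-10 state. Overwriting a single snapshot dictionary is one way to keep the extra memory at one copy of the tracked attributes.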

Tests Added

Added a series of tests for the snapshot modules to ensure that attributes are saved and loaded as intended.
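As an illustration of the kind of check such tests might perform, here is a minimal round-trip test. It is a sketch under assumptions: `save_snapshot`, `load_snapshot`, and `Dummy` are hypothetical helpers written for this example and are not the snapshotter classes added in this PR.

```python
import copy
from typing import Any


def save_snapshot(obj: Any, attrs: list[str]) -> dict[str, Any]:
    """Hypothetical helper: deep-copy the named attributes into a plain dict."""
    return {name: copy.deepcopy(getattr(obj, name)) for name in attrs}


def load_snapshot(obj: Any, snapshot: dict[str, Any]) -> None:
    """Hypothetical helper: write the saved values back onto the object."""
    for name, value in snapshot.items():
        setattr(obj, name, copy.deepcopy(value))


class Dummy:
    def __init__(self) -> None:
        self.params = {"w": [1.0, 2.0]}
        self.total_steps = 3


def test_snapshot_round_trip() -> None:
    obj = Dummy()
    snapshot = save_snapshot(obj, ["params", "total_steps"])
    # Mutate the live object in place; a correct snapshot must not change.
    obj.params["w"][0] = 99.0
    obj.total_steps = 10
    assert snapshot == {"params": {"w": [1.0, 2.0]}, "total_steps": 3}
    # Restoring should bring back the values captured at save time.
    load_snapshot(obj, snapshot)
    assert obj.params == {"w": [1.0, 2.0]}
    assert obj.total_steps == 3
```

The in-place mutation is the important part: a snapshot built from shallow references would silently follow the live object and the first assertion would fail.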

@sanaAyrml sanaAyrml marked this pull request as draft January 2, 2025 19:09
@sanaAyrml sanaAyrml changed the title Sa early stop Add early stop module Jan 9, 2025
@sanaAyrml sanaAyrml marked this pull request as ready for review January 9, 2025 10:00
Collaborator

@emersodb emersodb left a comment

There are a few things we definitely want to make sure we think about carefully. Specifically, I just want to make sure we aren't going to have a lot of additional memory overhead with the way we're doing snapshotting.

fl4health/clients/basic_client.py (outdated, resolved)
fl4health/clients/basic_client.py (outdated, resolved)
fl4health/clients/basic_client.py (outdated, resolved)
fl4health/utils/early_stopper.py (resolved)
fl4health/utils/early_stopper.py (outdated, resolved)
fl4health/utils/early_stopper.py (resolved)
fl4health/utils/snapshotter.py (outdated, resolved)
fl4health/utils/snapshotter.py (outdated, resolved)
fl4health/utils/early_stopper.py (outdated, resolved)
fl4health/utils/snapshotter.py (outdated, resolved)
Collaborator

@nerdai nerdai left a comment

Just left some high-level ish comments. Still lacking a lot of context in this library :)

fl4health/clients/basic_client.py (outdated, resolved)
fl4health/utils/early_stopper.py (outdated, resolved)
self.best_score: float | None = None
self.snapshot_ckpt: dict[str, Any] = {}

self.default_snapshot_attrs: dict = {
Collaborator

is T generic?

fl4health/utils/early_stopper.py (outdated, resolved)
fl4health/utils/logging.py (outdated, resolved)
fl4health/utils/snapshotter.py (outdated, resolved)
fl4health/utils/snapshotter.py (outdated, resolved)
@sanaAyrml sanaAyrml requested review from nerdai and emersodb January 23, 2025 21:26
Collaborator

@nerdai nerdai left a comment

This looks good from my perspective, though I definitely don't have full context.

I added some commentary about raise NotImplementedError in an abstractmethod of a class that inherits from ABC. Hope it clears it up -- my bad for the confusion as I too was a bit confused :)
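The inline commentary itself is not reproduced on this page, so the following is only a sketch of the general Python behaviour being discussed, with hypothetical class names (`AbstractSnapshotterSketch`, `Incomplete`, `Complete`). A class that inherits from `ABC` and still has an un-overridden `@abstractmethod` cannot be instantiated, so a `raise NotImplementedError` in the abstract body is only reachable through an explicit `super()` call from a subclass.

```python
from abc import ABC, abstractmethod


class AbstractSnapshotterSketch(ABC):
    """Hypothetical base class, named for illustration only."""

    @abstractmethod
    def save_attribute(self, value: object) -> dict:
        # Only reachable via an explicit super().save_attribute(...) call,
        # because ABC blocks instantiation while this method is not overridden.
        raise NotImplementedError


class Incomplete(AbstractSnapshotterSketch):
    """Does not override save_attribute, so it is still abstract."""


class Complete(AbstractSnapshotterSketch):
    def save_attribute(self, value: object) -> dict:
        return {"value": value}


def try_instantiate(cls: type) -> str:
    """Report whether calling cls() succeeds or raises TypeError."""
    try:
        cls()
    except TypeError:
        return "TypeError"
    return "ok"
```

Here `try_instantiate(Incomplete)` reports a `TypeError` raised at construction time, before any method body runs, while `Complete` instantiates normally.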

fl4health/clients/basic_client.py (resolved)
fl4health/utils/early_stopper.py (resolved)
fl4health/utils/snapshotter.py (outdated, resolved)
fl4health/utils/snapshotter.py (outdated, resolved)
Collaborator

@emersodb emersodb left a comment

All the changes you applied look great. There are a few refactors that I would suggest before merging. Mostly just some shifting of responsibility out of the BasicClient class 🙂

fl4health/utils/early_stopper.py (outdated, resolved)
fl4health/clients/basic_client.py (outdated, resolved)
fl4health/utils/early_stopper.py (outdated, resolved)
fl4health/clients/basic_client.py (resolved)
fl4health/utils/early_stopper.py (outdated, resolved)
fl4health/clients/basic_client.py (outdated, resolved)
fl4health/clients/basic_client.py (outdated, resolved)
fl4health/utils/early_stopper.py (resolved)
fl4health/utils/early_stopper.py (outdated, resolved)
fl4health/utils/early_stopper.py (outdated, resolved)
Collaborator

@emersodb emersodb left a comment

Good to merge from my perspective too!

@sanaAyrml sanaAyrml merged commit 1866aab into main Jan 25, 2025
6 checks passed
@sanaAyrml sanaAyrml deleted the sa_early_stop branch January 25, 2025 19:35
3 participants