Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Parallelization of DistToPoint metrics #217

Merged
merged 15 commits into from
Mar 7, 2024

Conversation

drewoldag
Copy link
Collaborator

Problem & Solution Description (including issue #)

Initial parallelization of PIT metric. Not fully tested yet.
Still need to parallelize CDELoss and Brier.

Code Quality

  • My code follows the code style of this project
  • I have written unit tests or justified all instances of #pragma: no cover; in the case of a bugfix, a new test that breaks as a result of the bug has been added
  • My code contains relevant comments and necessary documentation for future maintainers; the change is reflected in applicable demos/tutorials (with output cleared!) and added/updated docstrings use the NumPy docstring format
  • Any breaking changes, described above, are accompanied by backwards compatibility and deprecation warnings

@drewoldag drewoldag self-assigned this Feb 29, 2024
@drewoldag drewoldag linked an issue Feb 29, 2024 that may be closed by this pull request
Copy link

codecov bot commented Feb 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (96b3e13) to head (515f40e).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #217   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           37        37           
  Lines         2646      2707   +61     
=========================================
+ Hits          2646      2707   +61     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@eacharles eacharles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this. One issue that that there is now a potential circular dependency:

qp.Ensemble.py -> from qp.metrics import quick_moment

qp.metrics.concrete_metric_classes.py -> from qp.ensemble import Ensemble

which means that we can't have
from .concrete_metric_classes import *
in metrics/init.py

We should think a bit about this

@drewoldag
Copy link
Collaborator Author

It seems like we can clean up the imports in a couple ways. Either would be good. Both would be better in my opinion.

  1. from qp.ensemble import Ensemle isn't required in concrete_metrics_classes.py. It's only used for type hints on the return type.
  2. I might argue that the moment_partial method in the Ensemble class isn't really required, since there are several ways to calculate the moments of an ensemble, (i.e. concrete_metric_classes' MomemMetric(moment_order=X).evaluate(ensemble)) - so we could remove Ensemble's dependency from qp.metrics import quick_moment.

@drewoldag
Copy link
Collaborator Author

I sketched out the UML, and I don't think we have a circular dependency, but I might be wrong about that. I think it's something like this:

quick_moment
^ ^
| calculate_moment
| ^
Ensemble <- concrete_metrics

Given the previous comment, I think we can cut the dependency that concrete_metrics takes on Ensemble and potentially the dependency that Ensemble takes on quick_moment. Then the UML would be:

quick_moment
^
calculate_moment
^
concrete_metrics

And Ensemble wouldn't have any metric-related dependencies.

@drewoldag drewoldag requested a review from OliviaLynn March 6, 2024 00:09
@eacharles
Copy link
Collaborator

Yes, to cleaning up the imports as you described.

@drewoldag
Copy link
Collaborator Author

drewoldag commented Mar 7, 2024

I can easily pull out the concrete_metrics from qp import Ensemble. I'm less confident about removing the dependency and code in qp.Ensemble.moment_partial. I don't know where that is used, but I can create an issue to do it.

Created issue #219 to track the removal of moment_partial.

drewoldag and others added 6 commits March 6, 2024 16:04
…218)

* WIP - Initial commit to support writing out a dictionary collection of ensembles.

* Implementing a function to read dictionaries of qp ensembles from hdf5 files.

* Removing files that were accidentally committed.

* Adding test coverage for read and write dict.

* Marking test as skipped until the required tables_io work is released.

* Unskipping a test now that tables_io has new release.

* Adding pragma no cover to value error.
@drewoldag drewoldag requested a review from eacharles March 7, 2024 00:16
Copy link
Collaborator

@eacharles eacharles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!!

@drewoldag drewoldag merged commit 02ceed8 into main Mar 7, 2024
6 checks passed
@drewoldag drewoldag deleted the issue/215/dist-to-point-metric-parallelization branch March 7, 2024 03:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parallelize metrics
2 participants