Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(EstimatorReport): Make a deep copy of fitted estimator in constructor to avoid side-effect #1085

Merged
merged 5 commits into from
Jan 10, 2025

Conversation

augustebaum
Copy link
Contributor

@augustebaum augustebaum commented Jan 10, 2025

Closes #1081

When initializing the EstimatorReport, we now deepcopy the input estimator unless it is cloned.
This way, if the user re-fits their estimator after giving it as input to EstimatorReport, this will have no impact on the report itself.

If deepcopy doesn't succeed, we keep the previous behaviour and issue a warning about the potential side-effects.

@augustebaum augustebaum changed the title 1081 frozen estimator fix(EstimatorReport): Use sklearn's FrozenEstimator Jan 10, 2025
@augustebaum augustebaum force-pushed the 1081-frozen-estimator branch from 18002cf to 055e9af Compare January 10, 2025 15:19
Copy link
Contributor

github-actions bot commented Jan 10, 2025

Documentation preview @ de01a1f

@glemaitre glemaitre self-requested a review January 10, 2025 15:26
@glemaitre
Copy link
Member

So putting here some comments that we had in a IRL discussion with @augustebaum. I did not evaluate the problem properly: for FrozenEstimator solution to work, our user need to pass us the estimator. Otherwise, we still have the original side-effect.

The decision here is to trigger a deepcopy has sometimes done in scikit-learn. If the deepcopy does not exist then we raise warnings mentioning the potential side-effect and the existence of the FrozenEstimator. The latter might happen with some third-party library if I recall properly (e.g. cuml)

@augustebaum augustebaum force-pushed the 1081-frozen-estimator branch from 055e9af to 5b9b55a Compare January 10, 2025 16:07
@augustebaum augustebaum changed the title fix(EstimatorReport): Use sklearn's FrozenEstimator fix(EstimatorReport): Deepcopy estimator Jan 10, 2025
@augustebaum augustebaum marked this pull request as ready for review January 10, 2025 16:10
Copy link
Contributor

github-actions bot commented Jan 10, 2025

Coverage

Coverage Report for backend
FileStmtsMissCoverMissing
venv/lib/python3.12/site-packages/skore
   __init__.py120100% 
   __main__.py8180%19
   exceptions.py30100% 
venv/lib/python3.12/site-packages/skore/cli
   __init__.py50100% 
   cli.py33385%104, 111, 117
   color_format.py43390%35–>40, 41–43
   launch_dashboard.py261539%36–57
   quickstart_command.py14750%37–51
venv/lib/python3.12/site-packages/skore/item
   __init__.py210100% 
   cross_validation_item.py1371093%27–42, 370
   item.py411368%85, 88, 92–112
   item_repository.py42293%12–13
   media_item.py70494%15–18
   numpy_array_item.py25193%15
   pandas_dataframe_item.py34195%15
   pandas_series_item.py34195%15
   polars_dataframe_item.py32194%15
   polars_series_item.py27194%15
   primitive_item.py27292%13–15
   sklearn_base_estimator_item.py33195%15
   skrub_table_report_item.py10186%11
venv/lib/python3.12/site-packages/skore/persistence
   __init__.py00100% 
   abstract_storage.py22195%130
   disk_cache_storage.py33195%44
   in_memory_storage.py200100% 
venv/lib/python3.12/site-packages/skore/project
   __init__.py30100% 
   create.py52888%116–122, 132–133, 140–141
   load.py23389%43–45
   open.py140100% 
   project.py64491%135, 149, 183, 187
venv/lib/python3.12/site-packages/skore/sklearn
   __init__.py40100% 
   find_ml_task.py35195%41–>49, 50
   types.py20100% 
venv/lib/python3.12/site-packages/skore/sklearn/_estimator
   __init__.py100100% 
   base.py76298%87–88
   metrics_accessor.py198298%131, 266
   report.py174197%163–>169, 165–>167, 168, 171–>173, 177–>181, 426–>431
   utils.py11110%1–19
venv/lib/python3.12/site-packages/skore/sklearn/_plot
   __init__.py40100% 
   precision_recall_curve.py126297%200–>203, 313–314
   prediction_error.py75099%289–>297
   roc_curve.py95394%156, 167–>170, 223–224
   utils.py770100% 
venv/lib/python3.12/site-packages/skore/sklearn/cross_validation
   __init__.py20100% 
   cross_validation_helpers.py47490%104–>136, 123–126
   cross_validation_reporter.py35195%177
venv/lib/python3.12/site-packages/skore/sklearn/cross_validation/plots
   __init__.py00100% 
   compare_scores_plot.py29192%10, 45–>48
   timing_plot.py29194%10
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split
   __init__.py00100% 
   train_test_split.py34294%15–16
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split/warning
   __init__.py80100% 
   high_class_imbalance_too_few_examples_warning.py17378%16–18, 80
   high_class_imbalance_warning.py18288%16–18
   random_state_unset_warning.py11187%15
   shuffle_true_warning.py9091%44–>exit
   stratify_is_set_warning.py11187%15
   time_based_column_warning.py22189%17, 69–>exit
   train_test_split_warning.py5180%21
venv/lib/python3.12/site-packages/skore/ui
   __init__.py00100% 
   app.py25571%24, 53–58
   dependencies.py7186%12
   project_routes.py500100% 
venv/lib/python3.12/site-packages/skore/utils
   __init__.py00100% 
   _accessor.py70100% 
   _logger.py21484%14–18
   _show_versions.py310100% 
venv/lib/python3.12/site-packages/skore/view
   __init__.py00100% 
   view.py50100% 
   view_repository.py16283%8–9
TOTAL223413693% 

Tests Skipped Failures Errors Time
352 0 💤 0 ❌ 0 🔥 48.986s ⏱️

skore/src/skore/sklearn/_estimator/report.py Outdated Show resolved Hide resolved
skore/src/skore/sklearn/_estimator/report.py Show resolved Hide resolved
skore/tests/unit/sklearn/test_estimator.py Show resolved Hide resolved
skore/tests/unit/sklearn/test_estimator.py Show resolved Hide resolved
skore/tests/unit/sklearn/test_estimator.py Show resolved Hide resolved
skore/tests/unit/sklearn/test_estimator.py Outdated Show resolved Hide resolved
skore/tests/unit/sklearn/test_estimator.py Outdated Show resolved Hide resolved
skore/tests/unit/sklearn/test_estimator.py Outdated Show resolved Hide resolved
skore/tests/unit/sklearn/test_estimator.py Outdated Show resolved Hide resolved
skore/tests/unit/sklearn/test_estimator.py Outdated Show resolved Hide resolved
@glemaitre
Copy link
Member

Here, are a couple of comments but it looks good.

@glemaitre glemaitre changed the title fix(EstimatorReport): Deepcopy estimator fix(EstimatorReport): Make a deep copy of fitted estimator in constructor to avoid side-effect Jan 10, 2025
Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks a lot @augustebaum

@glemaitre glemaitre merged commit b177651 into main Jan 10, 2025
19 checks passed
@glemaitre glemaitre deleted the 1081-frozen-estimator branch January 10, 2025 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Follow-up EstimatorReport: freeze the estimator when it is a fitted estimator
2 participants