Disable bagging if asked #263
Codecov Report
@@            Coverage Diff             @@
##           master     #263      +/-   ##
==========================================
+ Coverage   82.13%   82.15%   +0.01%
==========================================
  Files         133      133
  Lines        4938     4942       +4
==========================================
+ Hits         4056     4060       +4
  Misses        882      882
@maikia can you check if it helps with our problems? Thanks a lot @albertcthomas for the patch.
The patch is fine, thx @albertcthomas. The problem I see for ramp-board is that the event will need to know about the flag for a) setting --no-bagging when ramp-test is called on the server and b) using the mean score on the leaderboards.

I see two solutions:
1) adding a bagging field to the Event table in the DB, which needs to be set on the update_event form, or
2) putting the bagging field in problem.py.

1 is somewhat cleaner, but it requires a migration. 2 would make sense because i) from a DB point of view the attribute belongs to the problem, not to the event, and problem.py is in fact functioning as the Problem table in the DB, and ii) it would not require modifying the ramp-test script in the worker. But 2 would also mean that we should re-think what we do in this PR: do we have both a command-line option and the field in problem.py, or only the latter?

In both cases, we'll need to code taking the leaderboard score from the mean instead of from the csv containing the bagged scores.
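To make solution 2 concrete, here is a minimal sketch of how the setting could be resolved. The attribute name `bagging`, the helper `bagging_enabled`, and the default of `True` are assumptions for illustration, not the actual ramp-workflow API; the `SimpleNamespace` objects stand in for an imported problem.py module.

```python
import types


def bagging_enabled(problem, no_bagging_flag=False):
    """Decide whether bagged CV scores should be computed.

    `problem` stands in for the imported problem.py module.
    The `bagging` attribute is a hypothetical name; problems that
    do not define it keep the current behaviour (bagging on).
    """
    # An explicit command-line request to skip bagging always wins.
    if no_bagging_flag:
        return False
    # Otherwise fall back to the problem-level setting, defaulting to True.
    return bool(getattr(problem, "bagging", True))


# Stand-ins for two problem.py modules (hypothetical contents).
legacy_problem = types.SimpleNamespace(problem_title="legacy kit")
no_bag_problem = types.SimpleNamespace(problem_title="large kit", bagging=False)

print(bagging_enabled(legacy_problem))
print(bagging_enabled(no_bag_problem))
print(bagging_enabled(legacy_problem, no_bagging_flag=True))
```

Defaulting to `True` when the attribute is missing would keep existing kits unchanged, which is one argument for reading the setting with `getattr` rather than requiring it.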
I would also like to urge us to merge advanced into master so we can get back to maintaining only one branch.
Great, thanks @albertcthomas for taking care of this.
@albertcthomas your solution works well for us when using the new flag. As for how ramp-board should use it, I would opt for the 2nd option of @kegl.
OK @maikia @albertcthomas, in this case we should add this to this PR: having a bagging field in problem.py should have the same effect as the flag. Then the only thing to implement in ramp-board is to catch when there are no bagged results, and use the mean instead in the leaderboard.
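The ramp-board side of this could look roughly like the sketch below. The function name `leaderboard_score` and its arguments are hypothetical; how ramp-board actually stores fold scores and bagged scores is not shown in this thread, so plain Python values are used instead.

```python
from statistics import mean


def leaderboard_score(fold_scores, bagged_score=None):
    """Pick the score to display on the leaderboard.

    fold_scores: per-fold CV scores of a submission (list of floats).
    bagged_score: the bagged score, or None when bagging was disabled
    and no bagged results were produced.
    """
    # Normal case: bagged results exist, keep the current behaviour.
    if bagged_score is not None:
        return bagged_score
    # No bagged results: fall back to the mean of the fold scores.
    if not fold_scores:
        raise ValueError("submission has neither bagged nor fold scores")
    return mean(fold_scores)


print(leaderboard_score([0.80, 0.90], bagged_score=0.95))
print(leaderboard_score([0.80, 0.90]))
```

Raising on an empty score list, rather than returning a default, is a design choice of this sketch so that a submission without any scores is not silently ranked.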
Sometimes, computing the bagging CV scores can be slow and/or lead to a MemoryError. This of course depends on the dataset, the problem and the machine used.
This PR adds a no-bagging flag to the ramp-test command to avoid computing the bagging CV scores.

@agramfort is also interested in this for an upcoming challenge I believe.
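As an illustration of the idea only, the sketch below shows a boolean flag that skips the bagging step. It uses `argparse` as a stand-in and a made-up program name; it is not the actual ramp-test implementation, and the step descriptions are placeholders for the real training and scoring code.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the ramp-test command-line interface.
    parser = argparse.ArgumentParser(prog="ramp-test-sketch")
    parser.add_argument(
        "--no-bagging",
        action="store_true",
        help="skip computing the bagged CV scores",
    )
    return parser


def plan_steps(argv, n_folds=3):
    """Return the list of steps that would run for the given arguments."""
    args = build_parser().parse_args(argv)
    # Per-fold training and scoring always happens.
    steps = [f"train and score fold {i}" for i in range(n_folds)]
    # The bagging step is the potentially slow, memory-hungry part.
    if not args.no_bagging:
        steps.append("bag fold predictions and score")
    return steps


print(plan_steps([]))
print(plan_steps(["--no-bagging"]))
```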