write code to generate ensemble #76

Open
nickreich opened this issue Oct 1, 2024 · 2 comments

Comments

@nickreich
Member

We will want a script that can run (ideally automatically, via CI), take all of the model_output files for a given week, and build an ensemble from them.

The script could live in the src folder, or it could live external to this repo.

The script should generate the ensemble file and save it in an appropriate hub-ensemble folder or some such place. I will file a separate issue to create the CI that "submits" the file.

@elray1
Collaborator

elray1 commented Oct 7, 2024

Have we discussed what we want this ensemble to do, in terms of statistical methods?

@elray1
Collaborator

elray1 commented Oct 7, 2024

We decided that this ensemble will be a linear pool:

  • for mean predictions, submit the mean of the means (and for any team that didn't submit means, extract means from the submitted samples)
  • for sample predictions, from each of the M contributing models choose floor(100 / M) samples at random, randomly distributing any remainder across the models so that the ensemble has 100 samples in total
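
A minimal sketch of the two rules above, for a single prediction task. The function names are hypothetical and it assumes each model's samples arrive as a 1-D numeric array. It also assumes that "distributing the remainder" means giving one extra sample each to randomly chosen models, and that every model submitted at least as many samples as it is asked for. Reading model_output files and writing the ensemble file are left out, since the file layout has not been settled in this thread.

```python
import numpy as np

def allocate_sample_counts(n_models, total=100, rng=None):
    """Split `total` samples across models: floor(total / M) each, remainder at random."""
    rng = np.random.default_rng(rng)
    base, remainder = divmod(total, n_models)
    counts = np.full(n_models, base, dtype=int)
    # Assumption: the remainder goes one extra sample each to randomly chosen models.
    extra = rng.choice(n_models, size=remainder, replace=False)
    counts[extra] += 1
    return counts

def linear_pool_samples(model_samples, total=100, rng=None):
    """Draw each model's share of samples without replacement and pool them."""
    rng = np.random.default_rng(rng)
    counts = allocate_sample_counts(len(model_samples), total, rng)
    picked = [
        rng.choice(np.asarray(samples), size=k, replace=False)
        for samples, k in zip(model_samples, counts)
    ]
    return np.concatenate(picked)

def linear_pool_mean(model_means, model_samples):
    """Mean of the model means; a missing mean (None) is computed from that model's samples."""
    means = [
        float(np.mean(samples)) if mean is None else float(mean)
        for mean, samples in zip(model_means, model_samples)
    ]
    return float(np.mean(means))
```

With M = 3 this gives two models 33 samples and one model 34, with the model that gets the extra sample chosen at random.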

Misc. other ideas for later analyses:

  • what if we randomly selected the samples rather than stratifying by model? our guess is that this will have larger MC variability
  • what if we repeat the random selection multiple times? what is variability in ensemble score? note, we could also bootstrap individual model samples to try to get at this
  • what if we took all M*100 samples?
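
One possible way to probe the first two questions is to repeat the random selection many times under each scheme and compare the spread of a summary statistic. The sketch below does this for the ensemble sample mean on synthetic data only. The helper names, the toy normal distributions, and the choice of the sample mean as the summary are all assumptions made for illustration; a real analysis would use submitted samples and the ensemble score.

```python
import numpy as np

def stratified_draw(model_samples, total, rng):
    """floor(total / M) samples per model, remainder given to randomly chosen models."""
    n_models = len(model_samples)
    base, remainder = divmod(total, n_models)
    counts = np.full(n_models, base, dtype=int)
    counts[rng.choice(n_models, size=remainder, replace=False)] += 1
    return np.concatenate(
        [rng.choice(s, size=k, replace=False) for s, k in zip(model_samples, counts)]
    )

def pooled_draw(model_samples, total, rng):
    """Ignore model labels and draw `total` samples from the pooled set."""
    return rng.choice(np.concatenate(model_samples), size=total, replace=False)

def mc_sd_of_mean(draw, model_samples, total=100, n_rep=500, seed=0):
    """Standard deviation of the ensemble sample mean across repeated random selections."""
    rng = np.random.default_rng(seed)
    means = [draw(model_samples, total, rng).mean() for _ in range(n_rep)]
    return float(np.std(means, ddof=1))

# Synthetic stand-in for M = 4 models with 100 samples each (not real hub data).
toy_rng = np.random.default_rng(1)
toy_models = [toy_rng.normal(loc=mu, scale=1.0, size=100) for mu in (0.0, 10.0, 20.0, 30.0)]

sd_stratified = mc_sd_of_mean(stratified_draw, toy_models)
sd_pooled = mc_sd_of_mean(pooled_draw, toy_models)
print(f"stratified: {sd_stratified:.3f}  pooled: {sd_pooled:.3f}")
```

The same loop could wrap a scoring function instead of the mean, and resampling each model's samples with replacement before drawing would give the bootstrap variant mentioned above.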

Projects
Status: Todo