We need to determine whether our fit errors are comparable to Arman's reported results and to those of Bagheri et al. (2013). We also want to investigate whether we can significantly relax the criteria for data to be included in validation, and isolate a portion of that data as a holdout set. We don't strictly need to consider only permanent stations; any station with sufficient data across multiple years will do.
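The relaxed inclusion criterion and holdout split above could be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: the `stations` mapping (station ID to the set of years with count data) and the `split_holdout` helper are hypothetical stand-ins for however the real data is stored.

```python
import random


def split_holdout(stations, min_years=3, holdout_frac=0.2, seed=0):
    """Split stations with sufficient multi-year coverage into validation
    and holdout sets.

    `stations` is a hypothetical mapping of station ID -> set of years
    with count data; any station meeting `min_years` is eligible,
    permanent or not.
    """
    eligible = [sid for sid, years in stations.items()
                if len(years) >= min_years]
    rng = random.Random(seed)
    rng.shuffle(eligible)
    # Reserve a fraction of eligible stations as a holdout set.
    n_holdout = max(1, int(holdout_frac * len(eligible)))
    return eligible[n_holdout:], eligible[:n_holdout]
```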
Actual testing may take an extended amount of time (and involve comparing our results against a Gaussian process regression), so the goal of this issue is to merge the preliminary CountMatch into master, then create a new sandbox branch ecosystem for validation testing and hyperparameter tuning.
- Merge the `countmatch` branch into `master` (don't delete `countmatch`).
- Rebase `sandbox` onto `master`.
- Set up a new notebook for validation testing. Create a CountMatch model mini-pipeline that allows us to vary hyperparameters, such as the number of neighbours considered or the minimum data requirements for a permanent count station.
- Perform preliminary validation experiments.
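The mini-pipeline step above amounts to sweeping a small hyperparameter grid. A sketch of what that driver could look like, where the grid values and the `run_countmatch` callable are hypothetical placeholders for the real fitting function:

```python
import itertools

# Hypothetical hyperparameter grid; the names and values are
# illustrative, not the actual CountMatch settings.
GRID = {
    "n_neighbours": [3, 5, 10],
    "min_perm_years": [2, 3],
}


def experiments(grid):
    """Yield one settings dict per combination in the grid."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def tune(run_countmatch, grid=GRID):
    """Run the pipeline once per setting; collect (settings, error) pairs.

    `run_countmatch` is a stand-in for whatever the mini-pipeline
    exposes as its fit-and-score entry point.
    """
    return [(s, run_countmatch(**s)) for s in experiments(grid)]
```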
Old Charles was correct: this is now partly solved by the latest commits in #14, and even better, it's in function form rather than notebook form.
Preliminary results suggest that the annual growth factor is by far the most sensitive parameter governing the AADT predictions, so we'll have to think more about #26 before embarking on a full hyperparameter estimation journey.
(Hyperparameter estimation also takes many hours for a single experiment. Since the work is embarrassingly parallel, we should consider multiprocessing solutions to speed it up. The simple option is to parallelize the hyperparameter tuning experiments themselves; the more involved, and more lucrative, option is to parallelize the CountMatch estimator. We may also finally want to spin up a cluster for some of this work...)
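The simple option, running independent tuning experiments concurrently, could look something like the sketch below. `_run_one` is a hypothetical placeholder for a single CountMatch experiment; note that for CPU-bound pure-Python work a process pool (`ProcessPoolExecutor`) would be the better fit than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def _run_one(settings):
    # Placeholder for a single CountMatch experiment; a real version
    # would fit the model with `settings` and return its validation error.
    return {"settings": settings, "error": 0.0}


def tune_parallel(settings_list, max_workers=4):
    """Dispatch independent tuning experiments to a worker pool.

    Each experiment is independent of the others (embarrassingly
    parallel), so results can be collected with a simple map; `map`
    preserves the input order of `settings_list`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        return list(ex.map(_run_one, settings_list))
```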