Reproducibility Brainstorm #4
Comments
Also note that I am brainstorming this general idea in this specific repository, which relates to the STARR grant, because it is open source.
Thanks for leading this discussion! I like this idea because of
So the profile-generation repo would capture everything that we do in the profiling handbook, and nothing else. Ideally, at some point, this repo would only contain the WDL workflow (or equivalent) used to process the data. The automation question merits a separate discussion, out of scope right now. What's next? Do you want to try this out on this project? @gwaygenomics
Yeah definitely! Also, depending on the size of the profiles specifically, GitHub can handle data versioning. BBBC will store the raw images?
Depending on the size of the data, I think it could also store processed profiles. Data versioning FTW 🎉
Yes, let's try it out! Currently, I don't think the profile processing lives here (do we know where it lives?). So it will be natural to use this strategy here. Another thing to consider is if the analysis should live in the
Not sure yet; ideally IDR, but it isn't easy to directly access images
Indeed, I don't see any profile processing notes; you'd need to check with Beth.
Note that I transferred this issue over from https://github.com/broadinstitute/profiling-resistance-mechanisms. This repo currently has the closest workflow to what is described above.
@shntnu @jccaicedo @MarziehHaghighi
I have been thinking more about our recent discussion on GitHub reproducibility. I am wondering about different potential workflows and can think of some additional setups that might help.
First Setup
The first setup is as I described on Friday:
… (0.generate-profiles) that stores code, QC, and profile results.
… 0.generate-profiles.
Potential Alternative?
Perhaps a second setup could separate the processing code and downstream analysis into two distinct repositories. This setup could work well for a couple of reasons.
… profiling init (and then bash scripts would be auto-populated).
Of course, every project is different, and individual decisions are required. (The same goes for storing the profiles in the actual repo, and for the public/private repo debate.)
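To make the `profiling init` idea a bit more concrete, here is a minimal sketch of what such a command might do under the hood. No such tool is specified in this thread, so everything here is an assumption: the function name `init_profiling_module`, the placeholder script names, and the stub contents are all hypothetical. Only the `0.generate-profiles` module name comes from the discussion above.

```python
from pathlib import Path

# Hypothetical script names: the thread does not say which bash scripts
# would be auto-populated, so these are placeholders only.
PLACEHOLDER_SCRIPTS = ("1.process-profiles.sh", "2.run-qc.sh")

# Stub body written into each generated script (also an assumption).
STUB = "#!/usr/bin/env bash\nset -euo pipefail\n# TODO: fill in this step\n"


def init_profiling_module(repo_root, module_name="0.generate-profiles"):
    """Create the module directory and stub bash scripts.

    Returns the list of script paths that were newly created.
    Existing scripts are left untouched.
    """
    module_dir = Path(repo_root) / module_name
    module_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for name in PLACEHOLDER_SCRIPTS:
        script = module_dir / name
        if script.exists():
            continue  # never overwrite work that is already there
        script.write_text(STUB)
        script.chmod(0o755)  # mark the stub as executable
        created.append(script)
    return created
```

Calling `init_profiling_module(".")` from the root of a project repo would create the module folder with the stub scripts; calling it again would create nothing new, so it should be safe to re-run in a repo that already has processing code.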