Stage modularity, round 2 #418

Open
mwhamgenomics opened this issue Sep 16, 2019 · 0 comments
Collaborator

mwhamgenomics commented Sep 16, 2019

At the moment, pipelines are split into stages but handling of intermediate files is messy and it's hard to modify pipelines. It would be better if:

  • stages take generic parameters for input/output files, not predetermined file paths
  • file paths are defined by the Pipeline object, or whatever the stages are being used by
  • no wildcards - apart from date-stamped files in BCBio, we should know what every file will be called
    • this might make output_files.yaml redundant
  • the check for whether a stage should run uses the presence of a reporting app stage as well as the presence of files, not instead of it
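
As a rough illustration of the points above, here is a minimal sketch of what this could look like. All names here (`Stage`, `Fastqc`, `Pipeline`, `should_run`, the directory layout, the `reported_stages` set) are hypothetical and not taken from the current code base; the skip rule (skip only when the stage is recorded and its outputs exist) is one possible reading of the last bullet.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Stage:
    """Hypothetical lightweight stage: it only knows its named inputs and outputs."""
    name: str
    input_files: dict   # logical name -> Path
    output_files: dict  # logical name -> Path

    def command(self):
        raise NotImplementedError


class Fastqc(Stage):
    def command(self):
        # The stage never builds a path itself; it only uses what it was given.
        return 'fastqc --outdir {} {}'.format(
            self.output_files['report_dir'], self.input_files['fastq']
        )


class Pipeline:
    """Hypothetical pipeline that owns every file path, with no wildcards."""

    def __init__(self, job_dir, sample_id):
        self.job_dir = Path(job_dir)
        self.sample_id = sample_id

    def path(self, *parts):
        return self.job_dir / self.sample_id / Path(*parts)

    def build(self):
        fastq = self.path('fastq', self.sample_id + '_R1.fastq.gz')
        return [
            Fastqc('fastqc', {'fastq': fastq}, {'report_dir': self.path('fastqc')})
        ]


def should_run(stage, reported_stages):
    """Assumed rule: skip only if the stage is recorded AND all outputs exist."""
    recorded = stage.name in reported_stages
    outputs_present = all(p.exists() for p in stage.output_files.values())
    return not (recorded and outputs_present)
```

With this shape, changing where files live means editing `Pipeline.build` only, and a stage can be reused by anything that can hand it a dict of paths.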

We should also be able to mock a dataset, patch executor, run a pipeline and assert what all the bash commands were.
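
A sketch of what such a test might look like, using `unittest.mock` from the standard library. `Executor`, `run_pipeline` and the dataset attributes are hypothetical stand-ins invented for this example, not the real executor or Dataset API.

```python
from unittest.mock import MagicMock, patch


class Executor:
    """Hypothetical stand-in for the real job executor."""

    @staticmethod
    def execute(cmd):
        raise RuntimeError('real commands should not run in tests')


def run_pipeline(dataset):
    """Hypothetical two-stage pipeline that builds bash commands from a dataset."""
    fastq = '{}/{}.fastq.gz'.format(dataset.input_dir, dataset.name)
    bam = '{}/{}.bam'.format(dataset.output_dir, dataset.name)
    Executor.execute('fastqc ' + fastq)
    Executor.execute('bwa mem ref.fa {} > {}'.format(fastq, bam))


def test_pipeline_commands():
    # Mock the dataset; 'name' is set after construction because
    # MagicMock(name=...) names the mock itself rather than setting an attribute.
    dataset = MagicMock()
    dataset.name = 'sample_1'
    dataset.input_dir = '/in'
    dataset.output_dir = '/out'

    # Patch the executor so nothing is actually run, then collect the commands.
    with patch.object(Executor, 'execute') as mocked_execute:
        run_pipeline(dataset)

    commands = [c.args[0] for c in mocked_execute.call_args_list]
    assert commands == [
        'fastqc /in/sample_1.fastq.gz',
        'bwa mem ref.fa /in/sample_1.fastq.gz > /out/sample_1.bam',
    ]
    return commands
```

This only works cleanly if every file name is known up front, which is another argument for dropping wildcards.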

We should also make Stage objects as lightweight as possible, ideally removing their access to Dataset - see #395 for reasons why.

This might be a good opportunity to look at sciluigi
