Rename input file parameters #488
Great work!
However, I found some spots where you missed renaming `bed` to `target_bed`, along with some other minor things and inconsistencies.
I've got some more thoughts:
- If I understood correctly, `target_bed` is only used for SNV calling. In that case, it should probably be called `snv_calling_target_bed` (or `..._regions`, see one of the line comments).
- Maybe it makes more sense to change `skip_methylation_analysis` to `skip_methylation_pileup`, as that is all that subworkflow does. Or do we expect it to gain more functionality later on?
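Spots where an old name such as `bed` was missed can be hard to find by eye, because the string also occurs inside `target_bed` and in unrelated words. A minimal sketch of a check, assuming the old name only matters when it appears as `params.<name>`, `--<name>`, or as a backticked name in the docs (the helper `find_stale_param` is hypothetical and not part of the pipeline):

```python
import re


def find_stale_param(text: str, old_name: str) -> list[int]:
    """Return 1-based line numbers where old_name is still used as a parameter."""
    # Assumption: a parameter reference looks like params.NAME, --NAME or `NAME`.
    # The lookahead rejects longer identifiers such as params.bedtools_opts.
    pattern = re.compile(
        r"(?:params\.|--|`)" + re.escape(old_name) + r"(?![A-Za-z0-9_])"
    )
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


if __name__ == "__main__":
    sample = (
        "- Limit SNV calling to regions in BED file (`--bed`).\n"
        "| `target_bed` | A BED file with regions of interest |\n"
        "ch_bed = channel.fromPath(params.bed)\n"
    )
    print(find_stale_param(sample, "bed"))
```

Requiring one of the three prefixes keeps `target_bed` and plain prose mentions of "bed" out of the results, at the cost of missing references written in some other form.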
CHANGELOG.md
Outdated
| `bed` | `target_bed` | | ||
| `hificnv_xy` | `hificnv_expected_xy_cn` | | ||
| `hificnv_xx` | `hificnv_expected_xx_cn` | | ||
| `hificnv_exclude` | `hificnv_excluded_regions` | |
Maybe this should be `hificnv_excluded_bed` instead? The other BED param is called `target_bed` and not `target_regions`.
OTOH, another parameter is called `par_regions`. So maybe `target_bed` should be `target_regions` instead.
Third thought: "PAR regions" is redundant, as the "R" in "PAR" already means "region". So maybe it's better to use `_bed` for all of them.
I think both deepvariant and dipcall refer to them as PAR regions, so I'd prefer keeping it that way, even though it's technically redundant (the Swedish "CD-skiva" is another example).
But I think we could have `target_regions` instead of `target_bed`.
| `score_config_snv` | `genmod_score_config_snvs` | | ||
| `score_config_sv` | `genmod_score_config_svs` | | ||
| `parallel_alignments` | `alignement_processes` | | ||
| `svdb_dbs` | svdb_sv_databases`` | | ||
|
Maybe `variant_consequences_snv` should be `variant_consequences_snvs`. The corresponding param is `variant_consequences_svs` and the similar param for genmod is `genmod_score_config_snvs`.
Sounds good!
CHANGELOG.md
Outdated
@@ -119,6 +119,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |||
| `--validationSkipDuplicateCheck` | | | |||
| `--validationS3PathCheck` | | | |||
| `--monochromeLogs` | `--monochrome_logs` | | |||
| `skip_short_variant_calling` | `skip_snv_calling` | |
All parameters below are missing the `--` prefix.
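A minimal sketch of how the missing `--` could be added to such changelog rows in bulk, assuming every backticked identifier in a row is a parameter name (the helper `add_flag_prefix` is hypothetical and not part of the repository):

```python
import re


def add_flag_prefix(row: str) -> str:
    """Prefix each backticked name in a table row with `--` unless already present."""
    # Assumption: names are plain identifiers wrapped in single backticks.
    return re.sub(r"`(?!--)([A-Za-z_]\w*)`", r"`--\1`", row)


if __name__ == "__main__":
    print(add_flag_prefix("| `skip_short_variant_calling` | `skip_snv_calling` | |"))
    print(add_flag_prefix("| `--monochromeLogs` | `--monochrome_logs` | |"))
```

Rows that already carry the prefix pass through unchanged, so the function can be applied to the whole table.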
docs/parameters.md
Outdated
| `variant_catalog` | A variant catalog json-file for stranger | `string` | | | | | ||
| `echtvar_snv_databases` | A csv file with echtvar databases to annotate SNVs with | `string` | | | | | ||
| `svdb_sv_databases` | Databases used for structural variant annotation in vcf format. <details><summary>Help</summary><small>Path to comma-separated file containing information about the databases used for structural variant annotation.</small></details>| `string` | | | | | ||
| `stranger_repeat_catalog` | A variant catalog json-file for stranger | `string` | | | | | ||
| `variant_consequences_snv` | File containing list of SO terms listed in the order of severity from most severe to lease severe for annotating genomic SNVs. For more information check https://ensembl.org/info/genome/variation/prediction/predicted_data.html | `string` | | | | | ||
| `variant_consequences_svs` | File containing list of SO terms listed in the order of severity from most severe to lease severe for annotating genomic SVs. For more information check https://ensembl.org/info/genome/variation/prediction/predicted_data.html | `string` | | | | | ||
| `vep_cache` | A path to the VEP cache location | `string` | | | | | ||
| `bed` | A BED file with regions of interest, used to limit short variant calling. | `string` | | | | |
Suggested change:
from: | `bed` | A BED file with regions of interest, used to limit short variant calling. | `string` | | | | |
to: | `target_bed` | A BED file with regions of interest, used to limit short variant calling. | `string` | | | | |
docs/usage.md
Outdated
| Parameter | Description | | ||
| --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| `genmod_score_config_svs` | Used by GENMOD when ranking variants. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/rank_model_snv.ini). | | ||
| `genmod_reduced_penetrance` | A list of loci that show [reduced penetrance](https://medlineplus.gov/genetics/understanding/inheritance/penetranceexpressivity/) in people. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/reduced_penetrance.tsv) | | ||
|
||
`--skip_rank_variants`. | ||
|
||
## Other highlighted parameters | ||
|
||
- Limit SNV calling to regions in BED file (`--bed`). |
Suggested change:
from: - Limit SNV calling to regions in BED file (`--bed`).
to: - Limit SNV calling to regions in BED file (`--target_bed`).
One more thing: The workflow parameter for skipping short-variant calling is |
Co-authored-by: Daniel Schmitz <[email protected]>
Yeah. Here I wanted to rename |
Right now `target_bed` is used to limit SNV calling and methylation pileups, making both quicker for targeted analysis. It's also passed to mosdepth to calculate depth per region. Ideally this BED file would limit SV calling to the target regions, but for now it only filters the results. In Stockholm we will need to be able to output both variants from the whole genome and filtered variants at the same time. I believe this would not be the usual use case for others, so perhaps we could have for example both a
Yes, although it would be more correct currently, I hope we can at least add some visualisation track outputs to the workflow fairly soon. Even so, |
Regarding your first point: I think calling everything "SNV" is preferable. Even if it isn't technically correct, I think it's clear what it means, and I like conciseness.
Let's go with
I made a separate issue for finer-grained control over regions (#503). Renaming
Let's go with your suggestion of |
Would you mind fixing the alignment where I marked it? 😬
Co-authored-by: Daniel Schmitz <[email protected]>
Thanks!
Fixed!
Looks good to me! Anything else you'd like to add, @Schmytzi?
Looks great!
Closes #324.
Tries to make the pipeline's input file parameters more descriptive and uniform, e.g. by prefixing each parameter with the name of the tool it belongs to.
- `skip_short_variant_calling` to `skip_snv_calling`
- `skip_assembly_wf` to `skip_genome_assembly`
- `skip_mapping_wf` to `skip_alignment`
- `skip_methylation_wf` to `skip_methylation_analysis`
- `skip_phasing_wf` to `skip_phasing`
- `variant_caller` to `snv_caller`
- `parallel_snv` to `snv_calling_processes`
- `cadd_prescored` to `cadd_prescored_indels`
- `snp_db` to `echtvar_snv_databases`
- `variant_catalog` to `stranger_repeat_catalog`
- `bed` to `target_bed`
- `hificnv_xy` to `hificnv_expected_xy_cn`
- `hificnv_xx` to `hificnv_expected_xx_cn`
- `hificnv_exclude` to `hificnv_excluded_regions`
- `reduced_penetrance` to `genmod_reduced_penetrance`
- `score_config_snv` to `genmod_score_config_snvs`
- `score_config_sv` to `genmod_score_config_svs`
- `parallel_alignments` to `alignement_processes`
- `svdb_dbs` to `svdb_sv_databases`
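For anyone holding an existing set of parameters, the renames above could be applied mechanically. A minimal sketch, assuming the parameters have already been loaded into a plain dict (for example from a params file); the names `RENAMES` and `migrate_params` are hypothetical, and the mapping mirrors the list in this description as written, so it may not match the names finally merged after review:

```python
# Old name -> new name, copied from the PR description (spelling as in the PR).
RENAMES = {
    "skip_short_variant_calling": "skip_snv_calling",
    "skip_assembly_wf": "skip_genome_assembly",
    "skip_mapping_wf": "skip_alignment",
    "skip_methylation_wf": "skip_methylation_analysis",
    "skip_phasing_wf": "skip_phasing",
    "variant_caller": "snv_caller",
    "parallel_snv": "snv_calling_processes",
    "cadd_prescored": "cadd_prescored_indels",
    "snp_db": "echtvar_snv_databases",
    "variant_catalog": "stranger_repeat_catalog",
    "bed": "target_bed",
    "hificnv_xy": "hificnv_expected_xy_cn",
    "hificnv_xx": "hificnv_expected_xx_cn",
    "hificnv_exclude": "hificnv_excluded_regions",
    "reduced_penetrance": "genmod_reduced_penetrance",
    "score_config_snv": "genmod_score_config_snvs",
    "score_config_sv": "genmod_score_config_svs",
    "parallel_alignments": "alignement_processes",
    "svdb_dbs": "svdb_sv_databases",
}


def migrate_params(params: dict) -> dict:
    """Return a copy of params with old parameter names replaced by new ones."""
    migrated = {}
    for key, value in params.items():
        new_key = RENAMES.get(key, key)
        if new_key in migrated:
            # Both the old and the new name were supplied; refuse to guess.
            raise ValueError(f"parameter '{new_key}' is given more than once")
        migrated[new_key] = value
    return migrated


if __name__ == "__main__":
    print(migrate_params({"bed": "targets.bed", "vep_cache": "/refs/vep"}))
```

Unknown keys are passed through untouched, and supplying both an old and a new name raises an error instead of silently picking one.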
PR checklist
- `nf-core pipelines lint`
- `nextflow run . -profile test,docker --outdir <OUTDIR>`
- `nextflow run . -profile debug,test,docker --outdir <OUTDIR>`
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).