Sarek bcftools normalization (#1682)

## PR checklist

- [x] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [x] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/sarek/tree/master/.github/CONTRIBUTING.md)
- [ ] If necessary, also make a PR on the nf-core/sarek _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository.
- [ ] Make sure your code lints (`nf-core lint`).
- [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- [ ] Usage Documentation in `docs/usage.md` is updated.
- [x] Output Documentation in `docs/output.md` is updated.
- [x] `CHANGELOG.md` is updated.
- [x] `README.md` is updated (including new tool citations and authors/contributors).

---------

Co-authored-by: JC-Delmas <[email protected]>
Co-authored-by: Jean-Charles Delmas <[email protected]>
Co-authored-by: Maxime U Garcia <[email protected]>
Co-authored-by: Friederike Hanssen <[email protected]>
Co-authored-by: Maxime U Garcia <[email protected]>
6 people authored Jan 13, 2025
1 parent 73cacd5 commit 33b0ba6
Showing 30 changed files with 2,718 additions and 2,059 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -9,7 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

-- [1759](https://github.com/nf-core/sarek/pull/1759) - Back to dev
+- [1682](https://github.com/nf-core/sarek/pull/1682) - Add `bcftools_norm` in `POST_VARIANTCALLING` for normalization of all vcf files; edit vcf_concatenate_germline subworkflow
+- [1760](https://github.com/nf-core/sarek/pull/1760) - Back to dev

### Changed

2 changes: 2 additions & 0 deletions README.md
@@ -62,6 +62,7 @@ Depending on the options and samples provided, the pipeline can currently perfor
- `Strelka2`
- `TIDDIT`
- `Lofreq`
+- Post-variant calling options (`BCFtools concat` for germline vcfs, `BCFtools norm` for all vcfs)
- Variant filtering and annotation (`SnpEff`, `Ensembl VEP`, `BCFtools annotate`)
- Summarise and represent QC (`MultiQC`)

@@ -183,6 +184,7 @@ We thank the following people for their extensive assistance in the development
- [Szilveszter Juhos](https://github.com/szilvajuhos)
- [Tobias Koch](https://github.com/KochTobi)
- [Winni Kretzschmar](https://github.com/winni2k)
+- [Patricie Skaláková](https://github.com/Patricie34)

## Acknowledgements

45 changes: 40 additions & 5 deletions conf/modules/post_variant_calling.config
@@ -16,7 +16,7 @@

 process {

-    withName: 'GERMLINE_VCFS_CONCAT'{
+    withName: 'GERMLINE_VCFS_CONCAT' {
         ext.args = { "-a" }
         ext.when = { params.concatenate_vcfs }
         publishDir = [
@@ -25,26 +25,61 @@ process {
         ]
     }

-    withName: 'GERMLINE_VCFS_CONCAT_SORT'{
+    withName: 'GERMLINE_VCFS_CONCAT_SORT' {
         ext.prefix = { "${meta.id}.germline" }
         ext.when = { params.concatenate_vcfs }
         publishDir = [
             mode: params.publish_dir_mode,
-            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" }
+            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" },
+            pattern: "*vcf.gz"
         ]
     }
+
+    withName: 'VCFS_NORM_SORT' {
+        ext.prefix = { "${meta.id}.${meta.variantcaller}.norm" }
+        ext.when = { params.normalize_vcfs }
+        publishDir = [
+            mode: params.publish_dir_mode,
+            path: { "${params.outdir}/variant_calling/normalized/${meta.id}/" },
+            pattern: "*vcf.gz"
+        ]
+    }
+
+    withName: 'VCFS_NORM' {
+        ext.args = { [
+            '--multiallelics -both', //split multiallelic sites into biallelic records and both SNPs and indels should be merged separately into two records
+            '--rm-dup all' //output only the first instance of a record which is present multiple times
+        ].join(' ') }
+        ext.when = { params.normalize_vcfs }
+        publishDir = [
+            mode: params.publish_dir_mode,
+            path: { "${params.outdir}/variant_calling/normalized/${meta.id}/" }
+        ]
+    }
+
     withName: 'TABIX_EXT_VCF' {
         ext.prefix = { "${input.baseName}" }
-        ext.when = { params.concatenate_vcfs }
+        ext.when = { params.concatenate_vcfs || params.normalize_vcfs }
     }

     withName: 'TABIX_GERMLINE_VCFS_CONCAT_SORT'{
         ext.prefix = { "${meta.id}.germline" }
         ext.when = { params.concatenate_vcfs }
         publishDir = [
             mode: params.publish_dir_mode,
-            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" }
+            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" },
+            pattern: "*.tbi"
         ]
     }
+
+    withName: 'TABIX_VCFS_NORM_SORT'{
+        ext.prefix = { "${meta.id}.${meta.variantcaller}.norm" }
+        ext.when = { params.normalize_vcfs }
+        publishDir = [
+            mode: params.publish_dir_mode,
+            path: { "${params.outdir}/variant_calling/normalized/${meta.id}/" },
+            pattern: "*.tbi"
+        ]
+    }
 }
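
For readers who want to try the new step, here is a minimal sketch of a user-side Nextflow config. It is not part of this commit. It assumes a hypothetical file named `custom.config` passed to the run with `-c`. The `normalize_vcfs` parameter and the `VCFS_NORM` process name are taken from the config diff in this commit; the `--multiallelics -any` value is only an illustration of overriding `ext.args`, so check the `bcftools norm` documentation before relying on it.

```groovy
// custom.config (hypothetical file name; pass it with `-c custom.config`)

params {
    // Turns on the processes guarded by `ext.when = { params.normalize_vcfs }`
    normalize_vcfs = true
}

process {
    // Assumption: a selector in a user config given with `-c` takes
    // precedence over the pipeline's own `withName: 'VCFS_NORM'` block.
    withName: 'VCFS_NORM' {
        ext.args = { [
            '--multiallelics -any', // illustrative alternative to the pipeline's '-both'
            '--rm-dup all'          // same duplicate handling as in the commit
        ].join(' ') }
    }
}
```

With a file like this, the normalized VCFs and their `.tbi` indices should be published under `${params.outdir}/variant_calling/normalized/${meta.id}/`. File names would start with `${meta.id}.${meta.variantcaller}.norm`, following the `publishDir` and `ext.prefix` entries for `VCFS_NORM_SORT` and `TABIX_VCFS_NORM_SORT`.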

Binary file modified docs/images/sarek_subway.png