added indexcov : finding large INDEL using the BAI index #1613

Merged
merged 37 commits into from
Dec 10, 2024
Changes from 30 commits
6c70f3e
add indexcov
lindenb Aug 5, 2024
c8b922b
indexcov germline works
lindenb Aug 5, 2024
add7b5f
update README CHANGELOG
lindenb Aug 5, 2024
2b0ab52
merge dev
lindenb Aug 5, 2024
5db7e39
merge into dev
lindenb Aug 5, 2024
5b02760
fix modules
lindenb Aug 5, 2024
ada5543
fix indent
lindenb Aug 5, 2024
c8920bd
fix CHANGELOG
lindenb Aug 5, 2024
892da3d
[automated] Fix code linting
nf-core-bot Aug 6, 2024
09371b6
review for https://github.com/nf-core/sarek/pull/1613
plu1087 Sep 2, 2024
cf2c697
add somatic; add indexcov in usage
lindenb Sep 2, 2024
cdd529d
multiqc for indexcov
lindenb Sep 3, 2024
04086cd
Update docs/output.md
lindenb Sep 3, 2024
6bf0440
Update README.md
lindenb Sep 3, 2024
6f0db0b
Update README.md
lindenb Sep 3, 2024
8d69004
Update docs/usage.md
lindenb Sep 3, 2024
0aa03a4
Update nextflow_schema.json
lindenb Sep 3, 2024
1224f98
Update docs/usage.md
lindenb Sep 3, 2024
4740af8
Merge branch 'dev' into pl_goleft
maxulysse Sep 13, 2024
d1cfad7
Update modules/local/samtools/reindex_bam/environment.yml
lindenb Sep 13, 2024
264c948
apply https://github.com/nf-core/sarek/pull/1613#discussion_r1758400272
lindenb Sep 13, 2024
1c3cabd
Merge branch 'dev' into pl_goleft
maxulysse Oct 4, 2024
2e9581c
Merge branch 'dev' into pl_goleft
FriederikeHanssen Dec 10, 2024
ea11058
update metromap
FriederikeHanssen Dec 10, 2024
041fd87
add paired
FriederikeHanssen Dec 10, 2024
c50e779
Update docs/usage.md
FriederikeHanssen Dec 10, 2024
30d5520
Merge branch 'dev' into pl_goleft
FriederikeHanssen Dec 10, 2024
eb62596
[automated] Fix code linting
nf-core-bot Dec 10, 2024
2141330
fix tests, conversion was removed
FriederikeHanssen Dec 10, 2024
532902a
Merge remote-tracking branch 'origin/pl_goleft' into pl_goleft
FriederikeHanssen Dec 10, 2024
b18c60a
Update docs/usage.md
FriederikeHanssen Dec 10, 2024
b5b2ae2
Update modules/local/samtools/reindex_bam/environment.yml
FriederikeHanssen Dec 10, 2024
a172fbe
bump module version
FriederikeHanssen Dec 10, 2024
d28c92f
Merge remote-tracking branch 'origin/pl_goleft' into pl_goleft
FriederikeHanssen Dec 10, 2024
0b144ff
fix naming
FriederikeHanssen Dec 10, 2024
78cfeca
fix subway paths
FriederikeHanssen Dec 10, 2024
6dc5f99
fix channel name
FriederikeHanssen Dec 10, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- [1613](https://github.com/nf-core/sarek/pull/1613) - add indexcov
- [1638](https://github.com/nf-core/sarek/pull/1638) - Added additional documentation detailing ASCAT WES usage.
- [1640](https://github.com/nf-core/sarek/pull/1620) - Add `lofreq` as a tumor-only variant caller
- [1642](https://github.com/nf-core/sarek/pull/1642) - Back to dev
2 changes: 2 additions & 0 deletions README.md
@@ -54,6 +54,7 @@ Depending on the options and samples provided, the pipeline can currently perfor
- `freebayes`
- `GATK HaplotypeCaller`
- `Manta`
- `indexcov`
- `mpileup`
- `MSIsensor-pro`
- `Mutect2`
@@ -171,6 +172,7 @@ We thank the following people for their extensive assistance in the development
- [pallolason](https://github.com/pallolason)
- [Paul Cantalupo](https://github.com/pcantalupo)
- [Phil Ewels](https://github.com/ewels)
- [Pierre Lindenbaum](https://github.com/lindenb)
- [Sabrina Krakau](https://github.com/skrakau)
- [Sam Minot](https://github.com/sminot)
- [Sebastian-D](https://github.com/Sebastian-D)
21 changes: 21 additions & 0 deletions conf/modules/indexcov.config
@@ -0,0 +1,21 @@

// INDEXCOV

process {
    if (params.tools && params.tools.split(',').contains('indexcov')) {

        withName: 'SAMTOOLS_REINDEX_BAM' {
            // keep reads with MAPQ >= 30; exclude unmapped, secondary, QC-fail, duplicate and supplementary reads
            ext.args = { ' -F 3844 -q 30 ' }
        }

        withName: 'GOLEFT_INDEXCOV' {
            publishDir = [
                mode: params.publish_dir_mode,
                path: { "${params.outdir}/variant_calling/indexcov/" }
            ]
        }
    }
}
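
As an aside for reviewers, the `-F 3844 -q 30` arguments given to `SAMTOOLS_REINDEX_BAM` can be unpacked with a short Python sketch. The bit names come from the SAM specification. The helper names `decode_flag_mask` and `passes_filter` are hypothetical and are not part of the pipeline. Note that the mask excludes reads but does not require the proper-pair bit, which would need `-f 2`.

```python
# SAM FLAG bit values, as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag_mask(mask: int) -> list:
    """Return the names of the SAM flag bits set in `mask`, lowest bit first."""
    return [name for bit, name in SAM_FLAGS.items() if mask & bit]


def passes_filter(flag: int, mapq: int, exclude_mask: int = 3844, min_mapq: int = 30) -> bool:
    """Mimic `samtools view -F exclude_mask -q min_mapq` for one read.

    -F drops a read if ANY bit of the mask is set in its flag;
    -q drops a read whose MAPQ is smaller than the threshold.
    """
    return (flag & exclude_mask) == 0 and mapq >= min_mapq


if __name__ == "__main__":
    print(decode_flag_mask(3844))
    # ['unmapped', 'secondary', 'qc_fail', 'duplicate', 'supplementary']
```

For example, a read with flag 99 (paired, proper pair, mate reverse, first in pair) and MAPQ 60 passes, while the same read marked as a duplicate (flag 1123) does not.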
25 changes: 25 additions & 0 deletions docs/output.md
@@ -45,6 +45,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [Strelka](#strelka)
- [Lofreq](#lofreq)
- [Structural Variants](#structural-variants)
- [Indexcov](#indexcov)
- [Manta](#manta)
- [TIDDIT](#tiddit)
- [Sample heterogeneity, ploidy and CNVs](#sample-heterogeneity-ploidy-and-cnvs)
@@ -592,6 +593,30 @@ For further downstream analysis, take a look [here](https://github.com/Illumina/

### Structural Variants

#### indexcov

[indexcov](https://github.com/brentp/goleft/tree/master/indexcov) quickly estimates coverage from a whole-genome BAM or CRAM index.
A BAM index has 16KB resolution, and this is used as a coverage estimate.
The output is scaled to around 1, so a long stretch with values of 1.5 would be a heterozygous duplication. This is useful as a quick QC to get coverage values across the genome.
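
To make the scaling concrete, here is a minimal Python sketch of how a scaled value might be read as a copy-number state. It assumes a diploid baseline where a scaled depth of 1 corresponds to two copies. The function names and labels are illustrative only and are not produced by `indexcov` itself.

```python
def estimated_copy_number(scaled_depth: float, ploidy: int = 2) -> float:
    """Convert a scaled depth (1.0 = baseline) to an estimated copy number.

    Assumption: the baseline corresponds to `ploidy` copies (2 for autosomes).
    """
    return scaled_depth * ploidy


def call_segment(scaled_depth: float) -> str:
    """Label a long stretch of bins by its nearest integer copy number (diploid only)."""
    copies = round(estimated_copy_number(scaled_depth))
    labels = {
        0: "homozygous deletion",
        1: "heterozygous deletion",
        2: "normal",
        3: "heterozygous duplication",
    }
    return labels.get(copies, "higher-order gain")


if __name__ == "__main__":
    for depth in (0.0, 0.5, 1.0, 1.5):
        print(depth, estimated_copy_number(depth), call_segment(depth))
```

A single 16KB bin is noisy, so a label like this is only meaningful for a long run of consecutive bins with similar values.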

**Output directory: `{outdir}/variant_calling/indexcov/`**

In addition to the interactive HTML files, `indexcov` outputs a number of text files:

- `<sample>-indexcov.ped`: a .ped/.fam file with the inferred sex in the appropriate column if the sex chromosomes were found. It also contains the following columns:
  - `CNX` and `CNY`: floating-point estimates of copy number for the sex chromosomes.
  - `bins.out`: number of bins with a coverage value outside of (0.85, 1.15). High values can indicate high-bias samples.
  - `bins.lo`: number of bins with value < 0.15. High values indicate missing data.
  - `bins.hi`: number of bins with value > 1.15.
  - `bins.in`: number of bins with value inside of (0.85, 1.15).
  - `p.out`: `bins.out/bins.in`.
  - `PC1...PC5`: PCA projections calculated with depth of autosomes.
- `<sample>-indexcov.roc`: tab-delimited columns of chrom, scaled coverage cutoff, and `$n_samples` columns, each of which indicates the proportion of 16KB blocks at or above that scaled coverage value.
- `<sample>-indexcov.bed.gz`: a BED file with columns of chrom, start, end, and a column per sample where the values indicate the scaled coverage for that sample in that 16KB chunk.
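
The bin counters described for the `.ped` file can be illustrated with a small Python sketch that recomputes them from one sample's scaled coverage values. This follows the thresholds quoted above and is not `indexcov`'s own code. Treating values exactly on 0.85 or 1.15 as "outside", and returning NaN when no bin is inside, are assumptions made here. `BinStats` and `bin_stats` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BinStats:
    bins_out: int   # bins outside (0.85, 1.15)
    bins_lo: int    # bins with value < 0.15
    bins_hi: int    # bins with value > 1.15
    bins_in: int    # bins inside (0.85, 1.15)
    p_out: float    # bins_out / bins_in


def bin_stats(scaled: Iterable[float]) -> BinStats:
    """Summarise per-bin scaled coverage with the thresholds from the docs."""
    values = list(scaled)
    bins_in = sum(1 for v in values if 0.85 < v < 1.15)
    bins_out = len(values) - bins_in
    bins_lo = sum(1 for v in values if v < 0.15)
    bins_hi = sum(1 for v in values if v > 1.15)
    # Assumption: report NaN rather than dividing by zero when no bin is inside.
    p_out = bins_out / bins_in if bins_in else math.nan
    return BinStats(bins_out, bins_lo, bins_hi, bins_in, p_out)


if __name__ == "__main__":
    print(bin_stats([1.0, 0.9, 1.1, 0.05, 1.5, 0.5]))
```

In this toy input, three of six bins fall inside the interval, one is below 0.15 and one is above 1.15, so `p_out` is 1.0; a well-behaved whole-genome sample would be expected to have a much smaller ratio.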

#### Manta

[Manta](https://github.com/Illumina/manta) calls structural variants (SVs) and indels from mapped paired-end sequencing reads.
Binary file added docs/sarek_subway.png
Review comment (Member):

this file should be in the image folder