Add msisensor (#95) #163

Merged
merged 17 commits into from
Mar 20, 2020
Changes from 15 commits
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -131,7 +131,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
- tool: [Haplotypecaller, Freebayes, Manta, mpileup, Strelka, TIDDIT]
+ tool: [Haplotypecaller, Freebayes, Manta, mpileup, Strelka, TIDDIT, msisensor]
steps:
- uses: actions/checkout@v2
- name: Install Nextflow
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -15,6 +15,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a
- [#141](https://github.com/nf-core/sarek/pull/141) - Add containers for `WBcel235`
- [#150](https://github.com/nf-core/sarek/pull/150), [#151](https://github.com/nf-core/sarek/pull/151), [#154](https://github.com/nf-core/sarek/pull/154) - Add AWS mega test GitHub Actions
- [#158](https://github.com/nf-core/sarek/pull/158) - Added `ggplot2` v `3.3.0`
- [#163](https://github.com/nf-core/sarek/pull/163) - Add [msisensor](https://github.com/ding-lab/msisensor) in tools and container

### `Changed`

1 change: 1 addition & 0 deletions README.md
@@ -105,6 +105,7 @@ Helpful contributors:
* [gulfshores](https://github.com/gulfshores)
* [pallolason](https://github.com/pallolason)
* [silviamorins](https://github.com/silviamorins)
* [David Mas-Ponte](https://github.com/davidmasp)

## Contributions & Support

2 changes: 2 additions & 0 deletions bin/scrape_software_versions.py
@@ -13,6 +13,7 @@
'GATK': ['v_gatk.txt', r"Version:(\S+)"],
'htslib': ['v_samtools.txt', r"htslib (\S+)"],
'Manta': ['v_manta.txt', r"([0-9.]+)"],
'msisensor': ["v_msisensor.txt", r"Version: v(\S+)"],
'MultiQC': ['v_multiqc.txt', r"multiqc, version (\S+)"],
'Nextflow': ['v_nextflow.txt', r"(\S+)"],
'nf-core/sarek': ['v_pipeline.txt', r"(\S+)"],
@@ -38,6 +39,7 @@
results['GATK'] = '<span style="color:#999999;\">N/A</span>'
results['htslib'] = '<span style="color:#999999;\">N/A</span>'
results['Manta'] = '<span style="color:#999999;\">N/A</span>'
results['msisensor'] = '<span style="color:#999999;\">N/A</span>'
results['MultiQC'] = '<span style="color:#999999;\">N/A</span>'
results['Qualimap'] = '<span style="color:#999999;\">N/A</span>'
results['R'] = '<span style="color:#999999;\">N/A</span>'
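
To check the new pattern in isolation, here is a minimal sketch of how the `Version: v(\S+)` regex could pull a version string out of `v_msisensor.txt`. The sample banner text is an assumption written to match the pattern (the real msisensor banner may look different), and `extract_version` is a hypothetical helper, not a function from `scrape_software_versions.py`. The fallback mirrors the `N/A` placeholder added above.

```python
import re

# Pattern copied from the new 'msisensor' entry in scrape_software_versions.py.
MSISENSOR_PATTERN = r"Version: v(\S+)"

# Hypothetical contents of v_msisensor.txt; the real banner may differ.
SAMPLE_BANNER = """
Program: msisensor
Version: v0.5

Usage:   msisensor <command> [options]
"""

# Same placeholder value the script assigns before scraping.
NOT_AVAILABLE = '<span style="color:#999999;">N/A</span>'


def extract_version(text, pattern=MSISENSOR_PATTERN):
    """Return the first captured version string, or the N/A placeholder."""
    match = re.search(pattern, text)
    return match.group(1) if match else NOT_AVAILABLE


version = extract_version(SAMPLE_BANNER)
missing = extract_version("command not found")
```

Because the pattern anchors on the literal `Version: v`, a banner that prints the version in another form would fall through to the placeholder rather than raise.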
1 change: 1 addition & 0 deletions docs/containers.md
@@ -26,6 +26,7 @@ For annotation, the main container can be used, but the cache has to be download
- Contain **[ggplot2](https://github.com/tidyverse/ggplot2)** 3.3.0
- Contain **[HTSlib](https://github.com/samtools/htslib)** 1.9
- Contain **[Manta](https://github.com/Illumina/manta)** 1.6.0
- Contain **[msisensor](https://github.com/ding-lab/msisensor)** 0.5
- Contain **[MultiQC](https://github.com/ewels/MultiQC/)** 1.8
- Contain **[Qualimap](http://qualimap.bioinfo.cipf.es)** 2.2.2d
- Contain **[samtools](https://github.com/samtools/samtools)** 1.9
32 changes: 32 additions & 0 deletions docs/output.md
@@ -34,6 +34,8 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [ConvertAlleleCounts](#convertallelecounts)
- [ASCAT](#ascat)
- [Control-FREEC](#control-freec)
- [MSI status](#msi-status)
- [msisensor](#msisensor)
- [Variant annotation](#variant-annotation)
- [snpEff](#snpeff)
- [VEP](#vep)
@@ -424,6 +426,36 @@ For a Tumor/Normal pair only:
- `[TUMORSAMPLE].pileup.gz_BAF.txt` and `[NORMALSAMPLE].pileup.gz_BAF.txt`
- file with beta allele frequencies for each possibly heterozygous SNP position

### MSI status

[Microsatellite instability](https://en.wikipedia.org/wiki/Microsatellite_instability)
is a genetic condition associated with deficiencies in the
mismatch repair (MMR) system, which causes a tendency to accumulate a high
number of mutations (SNVs and indels).

#### msisensor

[msisensor](https://github.com/ding-lab/msisensor) is a tool to detect the MSI
status of a tumor by scanning the length of its microsatellite regions. An altered
distribution of microsatellite lengths is associated with unrepaired replication
slippage, which would be corrected under normal MMR conditions. It requires
a normal sample for each tumor to differentiate the somatic and germline
cases.

For further reading see the [msisensor paper](https://www.ncbi.nlm.nih.gov/pubmed/24371154).
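
To make the comparison of length distributions concrete, here is a toy sketch in Python. It is not msisensor's implementation: the read-count histograms are invented, and the Pearson chi-square statistic is used only to illustrate one way of quantifying how far a tumor's repeat-length distribution at a single site departs from the matched normal.

```python
def chi_square_statistic(normal_counts, tumor_counts):
    """Pearson chi-square statistic for a 2 x k table of read counts.

    Each argument maps a repeat length to the number of reads supporting it.
    """
    n_total = sum(normal_counts.values())
    t_total = sum(tumor_counts.values())
    if n_total == 0 or t_total == 0:
        raise ValueError("both samples need at least one read")
    grand_total = n_total + t_total

    statistic = 0.0
    for length in sorted(set(normal_counts) | set(tumor_counts)):
        n_obs = normal_counts.get(length, 0)
        t_obs = tumor_counts.get(length, 0)
        column_total = n_obs + t_obs
        if column_total == 0:
            continue  # length listed with zero reads in both samples
        for observed, row_total in ((n_obs, n_total), (t_obs, t_total)):
            expected = row_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Invented example: a site where tumor and normal agree, and one where the
# tumor has gained shorter alleles.
stable = chi_square_statistic({12: 50}, {12: 50})
shifted = chi_square_statistic({12: 48, 11: 2}, {12: 25, 11: 15, 10: 10})
```

A larger statistic means the two histograms differ more; identical histograms give zero.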

For a Tumor/Normal pair only:
**Output directory: `results/MSI/[TUMORSAMPLE]_vs_[NORMALSAMPLE]/msisensor`**

- `[TUMORSAMPLE]_vs_[NORMALSAMPLE]_msisensor`
- MSI score output, contains information about the number of somatic sites.
- `[TUMORSAMPLE]_vs_[NORMALSAMPLE]_msisensor_dis`
- The normal and tumor length distribution for each microsatellite position.
- `[TUMORSAMPLE]_vs_[NORMALSAMPLE]_msisensor_germline`
- Germline sites detected.
- `[TUMORSAMPLE]_vs_[NORMALSAMPLE]_msisensor_somatic`
- Somatic sites detected.
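
As a minimal sketch of reading the score file downstream, the snippet below assumes it is a tab-separated table with one header row and one data row, with columns named `Total_Number_of_Sites`, `Number_of_Somatic_Sites` and `%`. That layout is an assumption taken from msisensor's own documentation and is not stated in this PR; the values are invented and `read_msi_score` is a hypothetical helper.

```python
import csv
import io


def read_msi_score(handle):
    """Parse a msisensor score table into a dict of numbers.

    Assumes one header row and one data row, tab separated.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    row = next(reader)
    return {
        "total_sites": int(row["Total_Number_of_Sites"]),
        "somatic_sites": int(row["Number_of_Somatic_Sites"]),
        "percent_somatic": float(row["%"]),
    }


# Invented content standing in for
# results/MSI/[TUMORSAMPLE]_vs_[NORMALSAMPLE]/msisensor/[TUMORSAMPLE]_vs_[NORMALSAMPLE]_msisensor
example = io.StringIO(
    "Total_Number_of_Sites\tNumber_of_Somatic_Sites\t%\n"
    "640\t75\t11.72\n"
)
score = read_msi_score(example)
```

With a real file, the `io.StringIO` object would be replaced by `open(path, newline="")`.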

## Variant annotation

This directory contains results from the final annotation steps: two tools are used for annotation, [snpEff](http://snpeff.sourceforge.net/) and [VEP](https://www.ensembl.org/info/docs/tools/vep/index.html).
1 change: 1 addition & 0 deletions environment.yml
@@ -21,6 +21,7 @@ dependencies:
- bioconda::genesplicer=1.0
- bioconda::htslib=1.9
- bioconda::manta=1.6.0
- bioconda::msisensor=0.5
- bioconda::multiqc=1.8
- bioconda::qualimap=2.2.2d
- bioconda::samtools=1.9
66 changes: 63 additions & 3 deletions main.nf
@@ -618,6 +618,7 @@ process Get_software_versions {
trim_galore -v &> v_trim_galore.txt 2>&1 || true
vcftools --version &> v_vcftools.txt 2>&1 || true
vep --help &> v_vep.txt 2>&1 || true
msisensor &> v_msisensor.txt 2>&1 || true

scrape_software_versions.py &> software_versions_mqc.yaml
"""
@@ -2069,8 +2070,8 @@ pairBam = bamNormal.cross(bamTumor).map {

pairBam = pairBam.dump(tag:'BAM Somatic Pair')

- // Manta, Strelka, Mutect2
- (pairBamManta, pairBamStrelka, pairBamStrelkaBP, pairBamCalculateContamination, pairBamFilterMutect2, pairBamTNscope, pairBam) = pairBam.into(7)
+ // Manta, Strelka, Mutect2, MSIsensor
+ (pairBamManta, pairBamStrelka, pairBamStrelkaBP, pairBamCalculateContamination, pairBamFilterMutect2, pairBamTNscope, pairBamMsisensor, pairBam) = pairBam.into(8)

intervalPairBam = pairBam.spread(bedIntervals)

@@ -2605,6 +2606,64 @@ process StrelkaBP {

vcfStrelkaBP = vcfStrelkaBP.dump(tag:'Strelka BP')

// STEP MSISENSOR.1 - SCAN

// Scan reference genome for microsatellites
process msisensorScan {
label 'cpus_1'
label 'memory_max'
// memory '20 GB'

tag {fasta}

input:
file(fasta) from ch_fasta
file(fastaFai) from ch_fai

output:
file "microsatellites.list" into msi_scan_ch

when: 'msisensor' in tools

script:
"""
msisensor scan -d ${fasta} -o microsatellites.list
"""
}

// STEP MSISENSOR.2 - SCORE

// Score the normal vs tumor pair of BAMs

process msisensor {
label 'cpus_4'
label 'memory_max'
// memory '10 GB'

tag {idSampleTumor + "_vs_" + idSampleNormal}

publishDir "${params.outdir}/MSI/${idSampleTumor}_vs_${idSampleNormal}/msisensor", mode: params.publishDirMode

input:
set idPatient, idSampleNormal, file(bamNormal), file(baiNormal), idSampleTumor, file(bamTumor), file(baiTumor) from pairBamMsisensor
file msiSites from msi_scan_ch

output:
set val("Msisensor"), idPatient, file("${idSampleTumor}_vs_${idSampleNormal}_msisensor"), file("${idSampleTumor}_vs_${idSampleNormal}_msisensor_dis"), file("${idSampleTumor}_vs_${idSampleNormal}_msisensor_germline"), file("${idSampleTumor}_vs_${idSampleNormal}_msisensor_somatic") into msisensor_out_ch

when: 'msisensor' in tools

script:
"""
msisensor msi -d ${msiSites} \
-b ${task.cpus} \
-n ${bamNormal} \
-t ${bamTumor} \
-o ${idSampleTumor}_vs_${idSampleNormal}_msisensor
"""
}
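
The two processes above amount to one `msisensor scan` call followed by one `msisensor msi` call per tumor/normal pair. A minimal Python sketch of the same command lines follows, using only the flags that appear in the diff. The helper name `msisensor_commands`, the sample identifiers and the file names are hypothetical, and nothing is executed.

```python
def msisensor_commands(fasta, bam_normal, bam_tumor, id_normal, id_tumor, threads=4):
    """Return the scan command, the msi command and the expected output files."""
    sites = "microsatellites.list"
    prefix = "{}_vs_{}_msisensor".format(id_tumor, id_normal)

    # Step 1: list the microsatellite sites in the reference genome.
    scan = ["msisensor", "scan", "-d", fasta, "-o", sites]

    # Step 2: score the tumor against its matched normal.
    msi = [
        "msisensor", "msi",
        "-d", sites,
        "-b", str(threads),
        "-n", bam_normal,
        "-t", bam_tumor,
        "-o", prefix,
    ]

    # The four files declared in the process output block.
    outputs = [prefix + suffix for suffix in ("", "_dis", "_germline", "_somatic")]
    return scan, msi, outputs


# Hypothetical inputs, for illustration only.
scan_cmd, msi_cmd, expected = msisensor_commands(
    fasta="genome.fasta",
    bam_normal="normal.bam",
    bam_tumor="tumor.bam",
    id_normal="N1",
    id_tumor="T1",
)
```

Returning argument lists rather than a single string keeps the commands ready for `subprocess.run` without shell quoting, should anyone want to reproduce a run outside Nextflow.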

// STEP ASCAT.1 - ALLELECOUNTER

// Run commands and code from Malin Larsson
@@ -3595,7 +3654,8 @@ def defineToolList() {
'strelka',
'tiddit',
'tnscope',
- 'vep'
+ 'vep',
+ 'msisensor'
]
}
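
The `when: 'msisensor' in tools` guards only fire if the requested tool name matches an entry in this list. The sketch below is a rough Python analogue of that check. How the pipeline actually parses its tool selection is not shown in this diff, so the comma-separated, case-insensitive input format is an assumption, `parse_tools` and `should_run` are hypothetical helpers, and `KNOWN_TOOLS` holds only the entries visible in this hunk.

```python
# Partial list: only the entries visible in this hunk of defineToolList().
KNOWN_TOOLS = ["strelka", "tiddit", "tnscope", "vep", "msisensor"]


def parse_tools(raw):
    """Split a comma-separated string into lowercase tool names (assumed format)."""
    return [name.strip().lower() for name in raw.split(",") if name.strip()]


def should_run(tool, tools):
    """Python analogue of the `when: 'msisensor' in tools` guard."""
    return tool in tools


requested = parse_tools("Strelka, msisensor")
unknown = [name for name in requested if name not in KNOWN_TOOLS]
run_msisensor = should_run("msisensor", requested)
```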
