Merge pull request #489 from nf-core/doc-updates
Doc updates
sateeshperi authored Dec 13, 2024
2 parents be9f86a + 689f205 commit 46bacc0
Showing 14 changed files with 187 additions and 56 deletions.
5 changes: 3 additions & 2 deletions CHANGELOG.md
@@ -16,9 +16,10 @@
- 🔧 reorg individual configs to `conf/modules/` named configs [#459](https://github.com/nf-core/methylseq/pull/469)
- 🔧 `run_preseq` param + skip preseq/lcextrap module by default [#458](https://github.com/nf-core/methylseq/pull/470)
- 🔧 `run_qualimap` param + skip qualimap module by default [#367](https://github.com/nf-core/methylseq/pull/471)
- 🔧 Raised Nextflow version requirement to `24.10.2` in CI
- 🔧 Raised Nextflow version requirement to `24.10.2`
- 🔧 Update the CI for pipeline-level bwameth GPU Tests [#481](https://github.com/nf-core/methylseq/pull/478)
- 🔧 Update the CI for pipeline-level bwameth GPU Tests [#477](https://github.com/nf-core/methylseq/pull/486)
- 🔧 create a test for samplesheet with technical replicates [#477](https://github.com/nf-core/methylseq/pull/486)
- 🔧 Update usage and output docs [#487](https://github.com/nf-core/methylseq/pull/489)

## [v2.7.1](https://github.com/nf-core/methylseq/releases/tag/2.7.1) - [2024-10-27]

13 changes: 6 additions & 7 deletions README.md
@@ -31,7 +31,7 @@ On release, automated continuous integration tests run the pipeline on a full-si

The pipeline allows you to choose between running either [Bismark](https://github.com/FelixKrueger/Bismark) or [bwa-meth](https://github.com/brentp/bwa-meth) / [MethylDackel](https://github.com/dpryan79/methyldackel).

Choose between workflows by using `--aligner bismark` (default, uses bowtie2 for alignment), `--aligner bismark_hisat` or `--aligner bwameth`.
Choose between workflows by using `--aligner bismark` (default, uses bowtie2 for alignment), `--aligner bismark_hisat` or `--aligner bwameth`. For higher performance, the pipeline can leverage the [Parabricks implementation of bwa-meth (fq2bammeth)](https://docs.nvidia.com/clara/parabricks/latest/documentation/tooldocs/man_fq2bam_meth.html), which implements the baseline tool `bwa-meth` in a performant method using fq2bam (BWA-MEM + GATK) as a backend for processing on GPU. To use this option, include the `--use_gpu` flag along with `--aligner bwameth`.
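
A minimal sketch of a GPU-enabled invocation, assuming the same `--input`, `--outdir` and `--genome` arguments as the default command in the Usage section. The samplesheet name, the `results` directory and the `docker` profile are placeholders, and a working GPU and Parabricks setup is assumed. The command is assembled and printed rather than executed, so it can be reviewed first.

```bash
#!/usr/bin/env bash
# Placeholder inputs (assumptions; adjust to your own setup).
INPUT="samplesheet.csv"
OUTDIR="results"
PROFILE="docker"

# --aligner bwameth combined with --use_gpu selects the Parabricks fq2bammeth route
# described above; both flags come from the README text.
CMD="nextflow run nf-core/methylseq --input ${INPUT} --outdir ${OUTDIR} --genome GRCh37 --aligner bwameth --use_gpu -profile ${PROFILE}"

# Print the command for review instead of launching the pipeline.
echo "$CMD"
```

Once the printed command looks right, it can be pasted into a terminal on a machine where Nextflow is installed.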

| Step | Bismark workflow | bwa-meth workflow |
| -------------------------------------------- | ------------------------ | --------------------- |
@@ -44,8 +44,8 @@ Choose between workflows by using `--aligner bismark` (default, uses bowtie2 for
| Extract methylation calls | Bismark | MethylDackel |
| Sample report | Bismark | - |
| Summary Report | Bismark | - |
| Alignment QC | Qualimap | Qualimap |
| Sample complexity | Preseq | Preseq |
| Alignment QC | Qualimap _(optional)_ | Qualimap _(optional)_ |
| Sample complexity | Preseq _(optional)_ | Preseq _(optional)_ |
| Project Report | MultiQC | MultiQC |

## Usage
@@ -65,9 +65,9 @@ SRR389222_sub3,https://github.com/nf-core/test-datasets/raw/methylseq/testdata/S
Ecoli_10K_methylated,https://github.com/nf-core/test-datasets/raw/methylseq/testdata/Ecoli_10K_methylated_R1.fastq.gz,https://github.com/nf-core/test-datasets/raw/methylseq/testdata/Ecoli_10K_methylated_R2.fastq.gz,
```

Each row represents a fastq file (single-end) or a pair of fastq files (paired end).
> Each row represents a fastq file (single-end) or a pair of fastq files (paired end).
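
As a rough illustration of the row layout, the sketch below writes a one-sample samplesheet and labels each row as single-end or paired-end. The header row (`sample,fastq_1,fastq_2,genome`) is an assumption, since it is not visible in this diff; the data row is copied from the README example, and the temporary file stands in for `samplesheet.csv`.

```bash
#!/usr/bin/env bash
# Write a one-sample samplesheet to a temporary file.
# ASSUMPTION: the header row below is not shown in this diff.
SHEET="$(mktemp)"
cat > "$SHEET" <<'EOF'
sample,fastq_1,fastq_2,genome
Ecoli_10K_methylated,https://github.com/nf-core/test-datasets/raw/methylseq/testdata/Ecoli_10K_methylated_R1.fastq.gz,https://github.com/nf-core/test-datasets/raw/methylseq/testdata/Ecoli_10K_methylated_R2.fastq.gz,
EOF

# Skip the header, then classify each row: an empty third field means
# single-end, anything else means paired-end.
LAYOUTS="$(awk -F',' 'NR > 1 { print $1 ": " ($3 == "" ? "single-end" : "paired-end") }' "$SHEET")"
echo "$LAYOUTS"
```

A single-end sample would leave the third field empty, so the same check would label it `single-end`.
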
Now, you can run the pipeline using:
Now, you can run the pipeline using default parameters as:

```bash
nextflow run nf-core/methylseq --input samplesheet.csv --outdir <OUTDIR> --genome GRCh37 -profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
@@ -81,8 +81,7 @@ For more details and further functionality, please refer to the [usage documenta
## Pipeline output

To see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/methylseq/results) tab on the nf-core website pipeline page.
For more details about the output files and reports, please refer to the
[output documentation](https://nf-co.re/methylseq/output).
For more details about the output files and reports, please refer to the [output documentation](https://nf-co.re/methylseq/output).

## Credits

4 changes: 0 additions & 4 deletions conf/test.config
@@ -28,9 +28,5 @@ process {
memory: '15.GB',
time: '1.h'
]

withName: PRESEQ_LCEXTRAP {
errorStrategy = 'ignore'
}
}

Binary file added docs/images/2.8.0_dag_methylseq.png
Binary file added docs/images/mqc_fastqc_adapter.png
Binary file added docs/images/mqc_fastqc_counts.png
Binary file added docs/images/mqc_fastqc_quality.png
88 changes: 74 additions & 14 deletions docs/output.md
@@ -2,9 +2,11 @@

## Introduction

This document describes the output produced by the pipeline. Most of the plots are taken from the MultiQC report, which summarises results at the end of the pipeline.
This document describes the output produced by the methylseq pipeline.

Note that nf-core/methylseq contains two workflows - one for Bismark, one for bwa-meth. The results files produced will vary depending on which variant is run.
Most of the plots are taken from the MultiQC report, which summarizes results at the end of the pipeline.

> NOTE: nf-core/methylseq contains two workflows - one for Bismark, one for bwa-meth. The results files produced will vary depending on which variant is run.
The output directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.

@@ -18,11 +20,69 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [Deduplication](#deduplication) - Deduplicating reads
- [Methylation Extraction](#methylation-extraction) - Calling cytosine methylation steps
- [Bismark Reports](#bismark-reports) - Single-sample and summary analysis reports
- [Qualimap](#qualimap) - Tool for genome alignments QC
- [Preseq](#preseq) - Tool for estimating sample complexity
- [Qualimap](#qualimap) - Tool for genome alignments QC [OPTIONAL]
- [Preseq](#preseq) - Tool for estimating sample complexity [OPTIONAL]
- [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline
- [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution

### Output Directories

#### Bismark

```
bismark/
├── bismark
│ ├── alignments
│ ├── deduplicated
│ ├── methylation_calls
│ ├── reports
│ └── summary
├── fastqc
│ ├── Ecoli_10K_methylated_1_fastqc.html
│ ├── Ecoli_10K_methylated_2_fastqc.html
│ └── zips
├── multiqc
│ └── bismark
├── pipeline_info
│ ├── execution_report_2024-12-13_05-38-05.html
│ ├── execution_timeline_2024-12-13_05-38-05.html
│ ├── execution_trace_2024-12-13_05-38-05.txt
│ ├── nf_core_pipeline_software_mqc_versions.yml
│ ├── params_2024-12-13_05-38-14.json
│ └── pipeline_dag_2024-12-13_05-38-05.html
└── trimgalore
├── fastqc
└── logs
```

#### bwa-meth

```
bwameth/
├── bwameth
│ ├── alignments
│ └── deduplicated
├── fastqc
│ ├── Ecoli_10K_methylated_1_fastqc.html
│ ├── Ecoli_10K_methylated_2_fastqc.html
│ └── zips
├── methyldackel
│ ├── Ecoli_10K_methylated.markdup.sorted_CpG.bedGraph
│ └── mbias
├── multiqc
│ └── bwameth
├── pipeline_info
│ ├── execution_report_2024-12-13_05-36-34.html
│ ├── execution_timeline_2024-12-13_05-36-34.html
│ ├── execution_trace_2024-12-13_05-36-34.txt
│ ├── nf_core_pipeline_software_mqc_versions.yml
│ ├── params_2024-12-13_05-36-43.json
│ └── pipeline_dag_2024-12-13_05-36-34.html
└── trimgalore
├── fastqc
└── logs
```
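
The trees above can be turned into a quick sanity check after a run. The sketch below reports which of the expected top-level directories exist under a results directory; `check_layout` is a hypothetical helper, not part of the pipeline, and the directory names are taken from the two listings above. Optional outputs such as Qualimap or Preseq are not covered.

```bash
#!/usr/bin/env bash
# check_layout RESULTS_DIR ALIGNER
# Hypothetical helper: prints ok/missing for each expected directory and
# stores the number of missing directories in MISSING.
check_layout() {
  results_dir="$1"
  aligner="$2"   # "bismark" or "bwameth"
  expected="$aligner fastqc multiqc pipeline_info trimgalore"
  # The bwa-meth workflow additionally writes MethylDackel output.
  if [ "$aligner" = "bwameth" ]; then
    expected="$expected methyldackel"
  fi
  MISSING=0
  for d in $expected; do
    if [ -d "$results_dir/$d" ]; then
      echo "ok       $d"
    else
      echo "missing  $d"
      MISSING=$((MISSING + 1))
    fi
  done
}

# Demo on a throwaway directory that mimics a bwa-meth run lacking methyldackel/.
DEMO="$(mktemp -d)"
mkdir -p "$DEMO/bwameth" "$DEMO/fastqc" "$DEMO/multiqc" "$DEMO/pipeline_info" "$DEMO/trimgalore"
check_layout "$DEMO" bwameth
echo "missing directories: $MISSING"
```

For a real run, the demo lines would be replaced by a call such as `check_layout <OUTDIR> bismark`.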

### FastQC

<details markdown="1">
@@ -54,7 +114,7 @@ The nf-core/methylseq pipeline uses [TrimGalore!](http://www.bioinformatics.babr

MultiQC reports the percentage of bases removed by Cutadapt in the _General Statistics_ table, along with a line plot showing where reads were trimmed.

**Output directory: `results/trim_galore`**
**Output directory: `results/trimgalore`**

Contains FastQ files with quality and adapter trimmed reads for each sample, along with a log file describing the trimming.

@@ -63,7 +123,7 @@ Contains FastQ files with quality and adapter trimmed reads for each sample, alo
- **NB:** Only saved if `--save_trimmed` has been specified.
- `logs/sample_val_1.fq.gz_trimming_report.txt`
- Trimming report (describes which parameters were used)
- `FastQC/sample_val_1_fastqc.zip`
- `fastQC/sample_val_1_fastqc.zip`
- FastQC report for trimmed reads

Single-end data will have slightly different file names and only one FastQ file per sample.
@@ -72,7 +132,7 @@ Single-end data will have slightly different file names and only one FastQ file

Bismark and bwa-meth convert all Cytosines contained within the sequenced reads to Thymine _in-silico_ and then align against a three-letter reference genome. This method avoids methylation-specific alignment bias. The alignment produces a BAM file of genomic alignments.

**Bismark output directory: `results/bismark_alignments/`**
**Bismark output directory: `results/bismark/alignments/`**
_Note that Bismark can use either Bowtie2 (default) or HISAT2 as the alignment tool, and the output file names will not differ between the options._

- `sample.bam`
@@ -84,7 +144,7 @@ _Note that bismark can use either use Bowtie2 (default) or HISAT2 as alignment t
- Unmapped reads in FastQ format.
- Only saved if `--unmapped` is specified when running the pipeline.

**bwa-meth output directory: `results/bwa-mem_alignments/`**
**bwa-meth output directory: `results/bwameth/alignments/`**

- `sample.bam`
- Aligned reads in BAM format.
@@ -95,23 +155,23 @@ _Note that bismark can use either use Bowtie2 (default) or HISAT2 as alignment t
- `sample.sorted.bam.bai`
- Index of sorted BAM file
- **NB:** Only saved if `--save_align_intermeds`, `--skip_deduplication` or `--rrbs` is specified when running the pipeline.
- `logs/sample_flagstat.txt`
- `logs/samtools_stats/sample_flagstat.txt`
- Summary file describing the number of reads which aligned in different ways.
- `logs/sample_stats.txt`
- `logs/samtools_stats/sample_stats.txt`
- Summary file giving lots of metrics about the aligned BAM file.

### Deduplication

This step removes alignments with identical mapping position to avoid technical duplication in the results. Note that it is skipped if `--save_align_intermeds`, `--skip_deduplication` or `--rrbs` is specified when running the pipeline.

**Bismark output directory: `results/bismark_deduplicated/`**
**Bismark output directory: `results/bismark/deduplicated/`**

- `deduplicated.bam`
- BAM file with only unique alignments.
- `logs/deduplication_report.txt`
- Log file giving summary statistics about deduplication.

**bwa-meth output directory: `results/bwa-mem_markDuplicates/`**
**bwa-meth output directory: `results/bwameth/deduplicated/`**

> **NB:** The bwa-meth step doesn't remove duplicate reads from the BAM file, it just labels them.
@@ -135,7 +195,7 @@ Filename abbreviations stand for the following reference alignment strands:
- `CTOT` - complementary to original top strand
- `CTOB` - complementary to original bottom strand

**Bismark output directory: `results/bismark_methylation_calls/`**
**Bismark output directory: `results/bismark/methylation_calls/`**

> **NB:** `CTOT` and `CTOB` are not aligned unless `--non_directional` is specified.
@@ -150,7 +210,7 @@ Filename abbreviations stand for the following reference alignment strands:
- `logs/sample_splitting_report.txt`
- Log file giving summary statistics about methylation extraction.

**bwa-meth workflow output directory: `results/MethylDackel/`**
**bwa-meth workflow output directory: `results/methyldackel/`**

- `sample.bedGraph`
- Methylation statuses in [bedGraph](http://genome.ucsc.edu/goldenPath/help/bedgraph.html) format.
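
The output directory renames documented in this diff of `docs/output.md` can be collected into a small lookup, for example to update downstream scripts that still point at the previously documented names. This is only a sketch: `new_output_dir` is a hypothetical helper, and the pairs are limited to the old and new names visible in the hunks above.

```bash
#!/usr/bin/env bash
# new_output_dir OLD_PATH
# Hypothetical helper: maps a previously documented output directory to the
# name documented in this commit. Unknown paths are returned unchanged.
new_output_dir() {
  case "$1" in
    results/trim_galore)               echo "results/trimgalore" ;;
    results/bismark_alignments)        echo "results/bismark/alignments" ;;
    results/bismark_deduplicated)      echo "results/bismark/deduplicated" ;;
    results/bismark_methylation_calls) echo "results/bismark/methylation_calls" ;;
    results/bwa-mem_alignments)        echo "results/bwameth/alignments" ;;
    results/bwa-mem_markDuplicates)    echo "results/bwameth/deduplicated" ;;
    results/MethylDackel)              echo "results/methyldackel" ;;
    *)                                 echo "$1" ;;
  esac
}

NEW_DIR="$(new_output_dir "results/bwa-mem_markDuplicates")"
echo "$NEW_DIR"
```

The fall-through branch keeps paths that were not renamed, such as `results/multiqc`, untouched.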