nf-core · LorenzoS96 · Jan 24, 2025 · Jan 23, 2025 · Jan 23, 2025 · Jan 23, 2025
diff --git a/docs/usage/DEanalysis/de_rstudio.md → ...rential_expression_analysis/de_rstudio.md b/docs/usage/DEanalysis/de_rstudio.md → ...rential_expression_analysis/de_rstudio.md
@@ -1,5 +1,6 @@
 ---
 order: 4
+shortTitle: RStudio
 ---
 
 # Differential Analysis with DESeq2
@@ -33,9 +34,7 @@ As in all analysis, firstly we need to create a new project:
 
 2. Select **New Directory**, **New Project**, name the project as shown below and click on **Create Project**;
 
-<figure markdown="span">
-  ![r_project](./img/project_R.png){ width="400" }
-</figure>
+![r_project](../differential_expression_analysis/img/project_R.png)
 
 3. The new project will be automatically opened in RStudio.
 
@@ -48,9 +47,7 @@ To store our results in an organized way, we will create a folder named **de_res
 
 and save the file as **de_script.R**. From now on, each command described in the tutorial can be added to your script. The resulting working directory should look like this:
 
-<figure markdown="span">
-  ![work_dir](./img/workdir_RStudio.png){ width="600" }
-</figure>
+![work_dir](../differential_expression_analysis/img/workdir_RStudio.png)
 
 The analysis requires several R packages. To utilise them, we need to load the following libraries:
 
@@ -162,9 +159,7 @@ design(dds_new) # to check the design formula
 
 Comparing the structure of the newly created dds (`dds_new`) with the one automatically produced by the pipeline (`dds`), we can observe the differences:
 
-<figure markdown="span">
-  ![comparison_dds](./img/dds_comparison.png){ width="400" }
-</figure>
+![comparison_dds](../differential_expression_analysis/img/dds_comparison.png)
 
 Before running the different steps of the analysis, it is good practice to pre-filter the genes to remove those with very low counts. This improves computational efficiency and enhances interpretability. In general, it is reasonable to keep only genes with a count of at least 10 in a minimum of 3 samples:
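
As a minimal sketch of this pre-filtering rule, the base R example below applies it to a small made-up count matrix. It assumes the rule means "a count of at least 10 in at least 3 samples"; the matrix, the gene names and the variables `min_count` and `min_samples` are hypothetical. With a `DESeqDataSet` such as `dds_new`, the matrix would instead come from `counts(dds_new)`.

```r
# Toy count matrix: 4 genes (rows) x 6 samples (columns); values are made up
counts <- matrix(
  c(  0,   2,   1,   0,   3,   1,   # gene_a: low everywhere
     15,  22,  18,   4,   6,   2,   # gene_b: >= 10 in exactly 3 samples
    120, 130,  95, 210, 180, 205,   # gene_c: high everywhere
     12,   0,  11,   0,   0,   3),  # gene_d: >= 10 in only 2 samples
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene_", letters[1:4]), paste0("sample_", 1:6))
)

min_count   <- 10  # assumed minimum count per sample
min_samples <- 3   # assumed minimum number of samples reaching min_count

# TRUE for genes with at least min_count reads in at least min_samples samples
keep <- rowSums(counts >= min_count) >= min_samples

# Keep only the genes that pass the filter
filtered <- counts[keep, , drop = FALSE]
rownames(filtered)
```

Here only `gene_b` and `gene_c` pass the filter. Setting `min_samples` to the size of the smallest experimental group is a common choice, since a gene expressed in only one condition can still pass.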
 
@@ -438,7 +433,7 @@ plotCounts(dds_final, gene = "ENSG00000142192")
 dev.off()
 ```
 
-**heatmap**: plot of the normalised counts for all the significant genes obtained with the `pheatmap()` function. The heatmap provides insights into genes and sample relationships that may not be apparent from individual gene plots alone.
+- **heatmap**: plot of the normalised counts for all the significant genes obtained with the `pheatmap()` function. The heatmap provides insights into genes and sample relationships that may not be apparent from individual gene plots alone.
 
 ```r
 #### Heatmap ####

diff --git a/docs/usage/DEanalysis/img/DESeq_function.png → ...xpression_analysis/img/DESeq_function.png b/docs/usage/DEanalysis/img/DESeq_function.png → ...xpression_analysis/img/DESeq_function.png
diff --git a/...sage/DEanalysis/img/Excalidraw_RNAseq.png → ...ession_analysis/img/Excalidraw_RNAseq.png b/...sage/DEanalysis/img/Excalidraw_RNAseq.png → ...ession_analysis/img/Excalidraw_RNAseq.png
diff --git a/docs/usage/DEanalysis/img/MA_plot.png → ...ntial_expression_analysis/img/MA_plot.png b/docs/usage/DEanalysis/img/MA_plot.png → ...ntial_expression_analysis/img/MA_plot.png
diff --git a/...Eanalysis/img/RNA_seq_scheme_tutorial.png → ..._analysis/img/RNA_seq_scheme_tutorial.png b/...Eanalysis/img/RNA_seq_scheme_tutorial.png → ..._analysis/img/RNA_seq_scheme_tutorial.png
diff --git a/...age/DEanalysis/img/count_distribution.png → ...ssion_analysis/img/count_distribution.png b/...age/DEanalysis/img/count_distribution.png → ...ssion_analysis/img/count_distribution.png
diff --git a/docs/usage/DEanalysis/img/dds_comparison.png → ...xpression_analysis/img/dds_comparison.png b/docs/usage/DEanalysis/img/dds_comparison.png → ...xpression_analysis/img/dds_comparison.png
diff --git a/...e/DEanalysis/img/dispersion_estimates.png → ...ion_analysis/img/dispersion_estimates.png b/...e/DEanalysis/img/dispersion_estimates.png → ...ion_analysis/img/dispersion_estimates.png
diff --git a/.../usage/DEanalysis/img/enrichment_plot.png → ...pression_analysis/img/enrichment_plot.png b/.../usage/DEanalysis/img/enrichment_plot.png → ...pression_analysis/img/enrichment_plot.png
diff --git a/...usage/DEanalysis/img/heatmap_de_genes.png → ...ression_analysis/img/heatmap_de_genes.png b/...usage/DEanalysis/img/heatmap_de_genes.png → ...ression_analysis/img/heatmap_de_genes.png
diff --git a/...Eanalysis/img/hierarchical_clustering.png → ..._analysis/img/hierarchical_clustering.png b/...Eanalysis/img/hierarchical_clustering.png → ..._analysis/img/hierarchical_clustering.png
diff --git a/...sis/img/nf-core-rnaseq_metro_map_grey.png → ...sis/img/nf-core-rnaseq_metro_map_grey.png b/...sis/img/nf-core-rnaseq_metro_map_grey.png → ...sis/img/nf-core-rnaseq_metro_map_grey.png
diff --git a/docs/usage/DEanalysis/img/overdispersion.png → ...xpression_analysis/img/overdispersion.png b/docs/usage/DEanalysis/img/overdispersion.png → ...xpression_analysis/img/overdispersion.png
diff --git a/docs/usage/DEanalysis/img/pca_plot.png → ...tial_expression_analysis/img/pca_plot.png b/docs/usage/DEanalysis/img/pca_plot.png → ...tial_expression_analysis/img/pca_plot.png
diff --git a/docs/usage/DEanalysis/img/plotCounts.png → ...al_expression_analysis/img/plotCounts.png b/docs/usage/DEanalysis/img/plotCounts.png → ...al_expression_analysis/img/plotCounts.png
diff --git a/docs/usage/DEanalysis/img/project_R.png → ...ial_expression_analysis/img/project_R.png b/docs/usage/DEanalysis/img/project_R.png → ...ial_expression_analysis/img/project_R.png
diff --git a/docs/usage/DEanalysis/img/volcanoplot.png → ...l_expression_analysis/img/volcanoplot.png b/docs/usage/DEanalysis/img/volcanoplot.png → ...l_expression_analysis/img/volcanoplot.png
diff --git a/.../usage/DEanalysis/img/workdir_RStudio.png → ...pression_analysis/img/workdir_RStudio.png b/.../usage/DEanalysis/img/workdir_RStudio.png → ...pression_analysis/img/workdir_RStudio.png
diff --git a/docs/usage/DEanalysis/interpretation.md → ...ial_expression_analysis/interpretation.md b/docs/usage/DEanalysis/interpretation.md → ...ial_expression_analysis/interpretation.md
@@ -14,17 +14,13 @@ The results illustrated in this section might show slight variations compared to
 
 The first plot we will examine is the Principal Component Analysis (PCA) plot. Since we're working with simulated data, our metadata is relatively simple, consisting of just three variables: `sample`, `condition`, and `replica`. In a typical RNA-seq experiment, however, metadata can be complex and encompass a wide range of variables that could contribute to sample variation, such as sex, age, and developmental stage.
 
-<figure markdown="span">
-  ![pca](./img/pca_plot.png){ width="400" }
-</figure>
+![pca](../differential_expression_analysis/img/pca_plot.png)
 
 By plotting the PCA on the PC1 and PC2 axes, using `condition` as the main variable of interest, we can quickly identify the primary source of variation in our data. By accounting for this variation in our design model, we should be able to detect more differentially expressed genes related to `condition`. When working with real data, it's often useful to plot the data using different variables to explore how much variation is explained by the first two PCs. Depending on the results, it may be informative to examine variation on additional PC axes, such as PC3 and PC4, to gain a more comprehensive understanding of the data.
 
 Next, we will examine the hierarchical clustering plot to explore the relationships between samples based on their gene expression profiles. The heatmap is organized such that samples with similar expression profiles are close to each other, allowing us to identify patterns in the data.
 
-<figure markdown="span">
-  ![cluster](./img/hierarchical_clustering.png){ width="400" }
-</figure>
+![cluster](../differential_expression_analysis/img/hierarchical_clustering.png)
 
 Remember that to create this plot, we utilized the `dist()` function, so the legend on the right shows distances between samples: a value of 0 corresponds to high correlation, while a value of 5 corresponds to very low correlation. As in the PCA, samples cluster together according to `condition`: we can observe a high degree of correlation between the three control samples and between the three treated samples.
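
To make the role of `dist()` concrete, here is a small base R sketch on simulated values. The matrix `expr`, the sample names and the size of the shift between groups are all assumptions for illustration, not the tutorial data.

```r
set.seed(1)

# Simulated log-scale expression: 50 genes x 6 samples (hypothetical data)
expr <- matrix(
  rnorm(50 * 6), nrow = 50,
  dimnames = list(NULL, c(paste0("control_", 1:3), paste0("treated_", 1:3)))
)

# Shift the treated samples so the two conditions differ
expr[, 4:6] <- expr[, 4:6] + 3

# dist() computes distances between rows, so transpose to compare samples
sample_dist <- dist(t(expr))
dist_mat <- as.matrix(sample_dist)

# Hierarchical clustering on the sample distances, cut into two groups
clusters <- cutree(hclust(sample_dist), k = 2)
clusters
```

In `dist_mat` the diagonal is 0 (each sample compared with itself) and smaller values mean more similar samples, which is why samples from the same condition end up in the same cluster.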
 
@@ -35,11 +31,9 @@ Overall, the integration of these plots suggests that we are working with high-q
 In this part of the tutorial, we will examine plots that are generated after the differential expression analysis. These plots are not quality control plots, but rather plots that help us to interpret the results.
 After running the `results()` function, a good first step to get an idea of the results is to look at the MA plot.
 
-<figure markdown="span">
-  ![ma_plot](./img/MA_plot.png){ width="500" }
-</figure>
+![ma_plot](../differential_expression_analysis/img/MA_plot.png)
 
-By default, genes are coloured in blue if the padj is less than 0.1 and the log2 fold change greater than or less than 0. Genes that fall outside the plotting region are represented as open triangles. At this stage, we have not yet applied a filter to select only significant DE genes, which we define as those with a padj value less than 0.5 and a log2 fold change of at least 1 or -1.
+By default, genes are coloured in blue if the padj is less than 0.1, whether the log2 fold change is greater than or less than 0. Genes that fall outside the plotting region are represented as open triangles. At this stage, we have not yet applied a filter to select only significant DE genes, which we define as those with a padj value less than 0.05 and a log2 fold change of at least 1 or at most -1.
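
The significance thresholds described above can be sketched in base R as follows. The data frame `res_df` and its values are made up; only the column names `log2FoldChange` and `padj` mirror those of a DESeq2 results table.

```r
# Toy results table (hypothetical values)
res_df <- data.frame(
  gene = paste0("gene_", 1:6),
  log2FoldChange = c(1.8, -1.2, 0.4, 2.5, -0.9, 1.1),
  padj = c(0.001, 0.03, 0.0002, 0.20, 0.01, NA),
  stringsAsFactors = FALSE
)

padj_cutoff <- 0.05  # adjusted p-value threshold
lfc_cutoff  <- 1     # absolute log2 fold change threshold

# Keep genes passing both thresholds; genes with a missing padj are excluded
sig <- subset(
  res_df,
  !is.na(padj) & padj < padj_cutoff & abs(log2FoldChange) >= lfc_cutoff
)
sig$gene
```

Only `gene_1` and `gene_2` pass both thresholds here. Using `abs()` handles upregulated and downregulated genes in a single condition.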
 
 After filtering our genes of interest according to our threshold, let's have a look at our significant genes:
 
@@ -54,35 +48,27 @@ ENSG00000156282     481.7624        1.095272           0.2969594      3.688289
 
 To gain a comprehensive overview of the transcriptional profile, the volcano plot is a highly informative tool.
 
-<figure markdown="span">
-  ![volcano_plot](./img/volcanoplot.png){ width="400"}
-</figure>
+![volcano_plot](../differential_expression_analysis/img/volcanoplot.png)
 
 The treatment induced differential expression in five genes: one downregulated and four upregulated. This plot visually represents the numerical results reported in the table above.
 
 After the identification of DE genes, it's informative to visualise the expression of specific genes of interest. The `plotCounts()` function applied directly on the `dds` object allows us to examine individual gene expression profiles without accessing the full `res` object.
 
-<figure markdown="span">
-  ![counts](./img/plotCounts.png){ width="400" }
-</figure>
+![counts](../differential_expression_analysis/img/plotCounts.png)
 
 In our example, post-treatment, we observe a significant increase in the expression of the _ENSG00000142192_ gene, highlighting its responsiveness to the experimental conditions.
 
 Finally, we can create a heatmap using the normalised expression counts of DE genes. The resulting heatmap visualises how the expression of significant genes varies across samples. Each row represents a gene, and each column represents a sample. The color intensity in the heatmap reflects the normalised expression levels: red colors indicate higher expression, while blue colors indicate lower expression.
 
-<figure markdown="span">
-  ![heatmap](./img/heatmap_de_genes.png){ width="400" }
-</figure>
+![heatmap](../differential_expression_analysis/img/heatmap_de_genes.png)
 
 By examining the heatmap, we can visually identify the expression patterns of our five significant differentially expressed genes. This visualisation allows us to identify how these genes respond to the treatment. The heatmap provides a clear and intuitive way to explore gene expression dynamics.
 
 ## Over Representation Analysis (ORA)
 
 Finally, we can attempt to assign biological significance to our differentially expressed genes through **Over Representation Analysis (ORA)**. ORA identifies specific biological pathways, molecular functions and cellular processes, according to the **Gene Ontology (GO)** database, that are enriched within our differentially expressed genes.
 
-<figure markdown="span">
-  ![enrichment](./img/enrichment_plot.png){ width="400" }
-</figure>
+![enrichment](../differential_expression_analysis/img/enrichment_plot.png)
 
 The enrichment analysis reveals a possible involvement of cellular structures and processes, including "clathrin-coated pit", "dendritic spine", "neuron spine" and "endoplasmic reticulum lumen". These terms suggest a focus on cellular transport, structural integrity and protein processing, especially in neural contexts. This pattern points to pathways related to cellular organization and maintenance, possibly playing an important role in the biological condition under study.
 

diff --git a/docs/usage/DEanalysis/index.md → ...ntial_expression_analysis/introduction.md b/docs/usage/DEanalysis/index.md → ...ntial_expression_analysis/introduction.md
@@ -6,7 +6,9 @@ order: 1
 
 These pages are a tutorial workshop for the [Nextflow](https://www.nextflow.io) pipeline [nf-core/rnaseq](https://nf-co.re/rnaseq).
 
-In this workshop, we will recap the application of next generation sequencing to identify differentially expressed genes. You will learn how to use the rnaseq pipeline to carry out this data-intensive workflow efficiently. We will cover topics such as configuration of the pipeline, code execution and data interpretation.
+In this workshop, we will recap the application of next generation sequencing to identify differentially expressed genes.
+You will learn how to use the rnaseq pipeline to carry out this data-intensive workflow efficiently.
+We will cover topics such as configuration of the pipeline, code execution and data interpretation.
 
 Please note that this is not an introductory workshop, and we will assume some basic familiarity with Nextflow.
 
@@ -37,7 +39,9 @@ Now you're all set and can use the following button to launch the service:
 
 ## Credits & Copyright
 
-This training material has been written and completed by [Lorenzo Sola](https://github.com/LorenzoS96), [Francesco Lescai](https://github.com/lescai), and [Mariangela Santorsola](https://github.com/msantorsola) during the [nf-core](https://nf-co.re) Hackathon in Barcellona, 2024. Thank you to [Victoria Cepeda](https://github.com/vcepeda) for her contributions to the tutorial's revision. The tutorial is aimed at anyone who is interested in using nf-core pipelines for their studies or research activities.
+This training material has been written and completed by [Lorenzo Sola](https://github.com/LorenzoS96), [Francesco Lescai](https://github.com/lescai), and [Mariangela Santorsola](https://github.com/msantorsola) during the [nf-core](https://nf-co.re) Hackathon in Barcelona, 2024.
+Thank you to [Victoria Cepeda](https://github.com/vcepeda) for her contributions to the tutorial's revision.
+The tutorial is aimed at anyone who is interested in using nf-core pipelines for their studies or research activities.
 
 The Docker image and Gitpod environment used in this repository have been created by [Seqera](https://seqera.io) but have been made open-source ([CC BY-NC-ND](https://creativecommons.org/licenses/by-nc-nd/4.0/)) for the community.
 

diff --git a/docs/usage/DEanalysis/rnaseq.md → ...ifferential_expression_analysis/rnaseq.md b/docs/usage/DEanalysis/rnaseq.md → ...ifferential_expression_analysis/rnaseq.md
@@ -1,5 +1,6 @@
 ---
 order: 3
+shortTitle: rnaseq pipeline
 ---
 
 # The nf-core/rnaseq pipeline
@@ -10,9 +11,7 @@ In order to carry out a RNA-Seq analysis we will use the nf-core pipeline [rnase
 
 The pipeline is organised into the different blocks shown below: pre-processing, traditional alignment (or lightweight alignment) and quantification, post-processing and final QC.
 
-<figure markdown="span">
-  ![metromap](./img/nf-core-rnaseq_metro_map_grey.png){ width="1000"}
-</figure>
+![metromap](../differential_expression_analysis/img/nf-core-rnaseq_metro_map_grey.png)
 
 In each process, the users can choose among a range of different options. Importantly, the users can decide to follow one of the two different routes in the alignment and quantification step:
 
@@ -25,7 +24,7 @@ In each process, the users can choose among a range of different options. Import
 The number of reads and the number of biological replicates are two critical factors that researchers need to carefully consider when designing an RNA-seq experiment. While it may seem intuitive that having a large number of reads is always desirable, an excessive number can lead to unnecessary costs and computational burdens, without providing significant improvements. Instead, it is often more beneficial to prioritise the number of biological replicates, as this captures the natural biological variation of the data. Biological replicates involve collecting and sequencing RNA from distinct biological samples (e.g., different individuals, tissues, or time points), helping to detect genuine changes in gene expression.
 
 :::warning
-This concept must not be confused with technical replicates that asses the technical variability of the sequencing platform by sequencing the same RNA sample multiple time.
+This concept must not be confused with technical replicates, which assess the technical variability of the sequencing platform by sequencing the same RNA sample multiple times.
 :::
 
 To obtain optimal results, it is crucial to balance the number of biological replicates and the sequencing depth. While increasing the depth of sequencing enhances the ability to detect genes with low expression levels, there is a plateau beyond which no further benefits are gained. Statistical power calculations can inform experimental design by estimating the optimal number of reads and replicates required. For instance, this approach helps to establish a suitable log2 fold change threshold for the DE analysis. By incorporating multiple biological replicates into the design and optimizing sequencing depth, researchers can enhance the statistical power of the analysis, reducing the number of false positive results, and increasing the reliability of the findings.
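
As a rough illustration of a power calculation, the sketch below uses base R's `power.t.test()` to estimate the number of replicates per group needed to detect a given log2 fold change for a single gene. This is a deliberate simplification: it assumes a two-sample t-test on log2 expression rather than a count-based model, it ignores multiple testing, and the values of `effect_log2fc` and `within_group_sd` are hypothetical.

```r
# Hypothetical inputs, not taken from the tutorial data
effect_log2fc   <- 1    # log2 fold change we want to be able to detect
within_group_sd <- 0.5  # assumed SD of log2 expression within a condition

# Solve for the number of samples per group at 80% power and alpha = 0.05
pw <- power.t.test(
  delta = effect_log2fc,
  sd = within_group_sd,
  sig.level = 0.05,
  power = 0.8
)

# Round up to a whole number of biological replicates per condition
replicates_per_group <- ceiling(pw$n)
replicates_per_group
```

Re-running the call with a larger `within_group_sd` or a smaller `effect_log2fc` shows how quickly the required number of replicates grows, which is the trade-off described above.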