DOLPHIN: Advances Single-cell RNA-seq Analysis Beyond Gene-Level by Integrating Exon-Level Quantification and Junction Reads with Deep Neural Networks

mcgilldinglab/DOLPHIN

Overview

The advent of single-cell sequencing has revolutionized the study of cellular dynamics, providing unprecedented resolution into the molecular states and heterogeneity of individual cells. However, the rich potential of exon-level information and junction reads within single cells remains underutilized. Conventional gene-count methods overlook critical exon and junction data, limiting the quality of cell representation and downstream analyses such as subpopulation identification and alternative splicing detection. To address this, we introduce DOLPHIN, a deep learning method that integrates exon-level and junction read data, representing genes as graph structures. These graphs are processed by a variational autoencoder to improve cell embeddings. Compared to conventional gene-based methods, DOLPHIN shows superior performance in cell clustering, biomarker discovery, and alternative splicing detection, providing deeper insights into cellular processes. By examining cellular dynamics with enhanced resolution, DOLPHIN detects subtle differences often missed at the gene level, offering new insights into disease mechanisms and potential therapeutic targets.

Key Capabilities of DOLPHIN:

  • Exon-Level Quantification: DOLPHIN represents each gene as a graph in which nodes are exons and edges are junction reads, capturing transcriptomic information at the exon level.
  • Better Cell Embedding: DOLPHIN uses exon and junction read data to improve cell embeddings, yielding more precise, biologically meaningful cell clusters than conventional gene-count-based approaches.
  • Enhanced Alternative Splicing Detection: By aggregating exon and junction reads from neighboring cells, DOLPHIN significantly enhances the detection of alternative splicing events, providing deeper insights into cell-specific splicing patterns.
  • Superior Performance in Downstream Analysis: DOLPHIN consistently outperforms conventional gene-count methods in multiple downstream tasks, including the identification of differential exon markers and alternative splicing events. This high-resolution approach allows DOLPHIN to uncover biologically significant exon markers that are often missed by traditional methods.
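
To make the graph representation concrete, here is a minimal sketch of how one gene in one cell could be encoded as an exon feature vector plus a junction adjacency matrix. The function name `build_gene_graph`, the input layout, and the choice to store junctions as directed upstream-to-downstream edges are assumptions for illustration. They are not taken from the DOLPHIN package, whose own preprocessing is covered in the tutorials.

```python
def build_gene_graph(exon_counts, junction_counts):
    """Encode one gene in one cell as (node features, adjacency matrix).

    exon_counts: read counts per exon, in genomic order (hypothetical layout).
    junction_counts: dict mapping (upstream_exon, downstream_exon) index
        pairs to the number of junction reads spanning that pair.
    """
    n_exons = len(exon_counts)
    # Node features: one read count per exon.
    features = [float(count) for count in exon_counts]
    # Edge weights: junction reads connecting two exons.
    adjacency = [[0.0] * n_exons for _ in range(n_exons)]
    for (src, dst), reads in junction_counts.items():
        if not (0 <= src < n_exons and 0 <= dst < n_exons):
            raise ValueError(f"junction ({src}, {dst}) refers to an unknown exon")
        adjacency[src][dst] = float(reads)
    return features, adjacency


# A three-exon gene where the middle exon is skipped: reads support the
# junction from exon 0 directly to exon 2, and none support exon 1.
features, adjacency = build_gene_graph([12, 0, 7], {(0, 2): 5})
```

A gene-level count would reduce this example to a single number, whereas the graph keeps the evidence that exon 1 was skipped.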

Installation

Install DOLPHIN directly from GitHub to get the latest version together with the Jupyter notebooks used in the tutorials.

git clone https://github.com/mcgilldinglab/DOLPHIN.git
cd DOLPHIN

Creating and Activating the Conda Environment

conda env create -f environment.yaml
conda activate DOLPHIN

Installing the DOLPHIN Package

  1. Standard Installation
pip install .
  2. Developer Mode Installation
pip install -e .

Validate That DOLPHIN Is Successfully Installed

import DOLPHIN
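
If the import fails, it can help to check whether the package is visible from the active environment before debugging further. The sketch below uses only the Python standard library. The helper name `is_installed` is made up for this example and is not part of DOLPHIN.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located on the Python path."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("DOLPHIN"):
    print("DOLPHIN is importable from this environment.")
else:
    print("DOLPHIN was not found; check that the conda environment is active.")
```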

Tutorials:

Dataset Preparation

  1. First, generate the exon-level reference GTF file by following the instructions in the exon_gtf_generation tutorial.

  2. Then, use the following tutorials to align the raw RNA-seq data and generate exon read counts and junction read counts:

  3. After aligning the RNA-seq data, generate the feature matrix and adjacency matrix using the provided methods in the tutorial.

Model Training and Cell Embedding Visualization

DOLPHIN Training and Cell Embedding

Run on example dataset:

You can download the processed dataset from here and follow the example to run the model.

Cell Aggregation

For a detailed tutorial on cell aggregation, please refer to the Cell Aggregation Tutorial.
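
As a rough illustration of the idea, the sketch below adds each cell's exon or junction counts to those of its nearest neighbors in embedding space. The neighbor search (Euclidean distance, fixed `k`), the plain summation, and both function names are assumptions for this example. The tutorial describes the procedure DOLPHIN actually uses.

```python
import math


def nearest_neighbors(embeddings, cell, k):
    """Indices of the k cells closest to `cell` by Euclidean distance."""
    distances = [
        (math.dist(embeddings[cell], other), idx)
        for idx, other in enumerate(embeddings)
        if idx != cell
    ]
    return [idx for _, idx in sorted(distances)[:k]]


def aggregate_counts(counts, embeddings, k):
    """Sum each cell's count vector with those of its k nearest neighbors."""
    aggregated = []
    for cell in range(len(counts)):
        group = [cell] + nearest_neighbors(embeddings, cell, k)
        n_features = len(counts[cell])
        aggregated.append(
            [sum(counts[i][f] for i in group) for f in range(n_features)]
        )
    return aggregated


# Four cells forming two tight pairs in a toy 2-D embedding.
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
counts = [[1, 0], [2, 1], [0, 4], [0, 3]]
aggregated = aggregate_counts(counts, embeddings, k=1)
```

Pooling reads this way trades single-cell resolution for deeper coverage per junction, which is why aggregation helps when individual cells are sparsely sequenced.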

Alternative Splicing Analysis

  1. Detecting Alternative Splicing using Outrigger: To detect alternative splicing events, please follow the Alternative Splicing Detection Tutorial.

  2. Alternative Splicing Analysis: This step reproduces the alternative splicing analysis described in the manuscript. For a detailed walkthrough, please refer to the Alternative Splicing Analysis tutorial.
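
For readers new to splicing quantification, the following sketch computes a percent-spliced-in (PSI) value for a skipped-exon event from junction read counts. It uses a common textbook definition in which the two inclusion junctions are averaged. This is an assumption for illustration and is not necessarily the exact formula used by Outrigger or in the DOLPHIN manuscript.

```python
def percent_spliced_in(upstream_inclusion, downstream_inclusion, exclusion):
    """PSI for a skipped-exon event, from junction read counts.

    upstream_inclusion / downstream_inclusion: reads on the two junctions
        that flank the alternative exon when it is included.
    exclusion: reads on the single junction that skips the exon.
    Returns None when no reads support either isoform.
    """
    # Inclusion is supported by two junctions, exclusion by one, so the
    # inclusion reads are averaged to keep the two isoforms comparable.
    inclusion = (upstream_inclusion + downstream_inclusion) / 2
    total = inclusion + exclusion
    if total == 0:
        return None
    return inclusion / total


psi = percent_spliced_in(8, 12, 10)
```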

Exon-Level Differential Gene Analysis

For a detailed walkthrough of the exon-level differential gene analysis, please follow this tutorial.

If you find this tool useful in your study, please consider citing the DOLPHIN manuscript.
