A self-adaptive and versatile tool for eliminating multiple undesirable variations from transcriptome
Code and tutorials for "A self-adaptive and versatile tool for eliminating multiple undesirable variations from transcriptome".
We have added scripts for fine-tuning. Please install the package with
$ pip install deepadapter==1.1.0
Note: only deepadapter v1.1.0 supports fine-tuning.
Step 1: create a new conda environment
$ # Create a new conda environment
$ conda create -n DA python=3.9
$ # Activate environment
$ conda activate DA
Step 2: install the package with pip
$ # Install our package
$ pip install deepadapter==1.1.0
Step 3: confirm that torch+cuda is installed
$ python
>>> import torch
>>> torch.cuda.is_available()
If the output of torch.cuda.is_available() is True, then torch+cuda was installed successfully.
If not, please install torch+cuda with
$ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
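The interactive check in Step 3 can also be run as a small script. The following is a minimal sketch and is not part of the deepadapter package; the function name `cuda_status` is hypothetical. It relies only on `importlib.util.find_spec` and `torch.cuda.is_available()`, and it reports a missing torch installation instead of raising an error.

```python
import importlib.util


def cuda_status() -> str:
    """Return a short message describing whether torch can use CUDA."""
    # find_spec returns None when the package is not installed.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return "torch+cuda is available"
    return "torch is installed, but CUDA is not available"


if __name__ == "__main__":
    print(cuda_status())
```

If the message is anything other than "torch+cuda is available", reinstall torch with the pip3 command above.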
Step 4: launch jupyter notebook and double-click to open tutorials
$ # Launch jupyter notebook
$ jupyter notebook
After opening the tutorials, please press Shift-Enter to execute a "cell" in .ipynb.
Before running the code, download our tutorials.
- DA-Example-Tutorial.ipynb: the tutorial of re-training DeepAdapter using the example dataset (click here to download);
- DA-YourOwnData-Tutorial.ipynb: the tutorial of training DeepAdapter using your own dataset (click here to download).
Before fine-tuning, make sure that the gene set in the small dataset is the same as the gene set used in the pretrained models.
The gene set used in the pretrained models can be found in trained_models/[model]/genes.csv. The order of the genes does not matter.
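The gene-set requirement can be checked before fine-tuning. The sketch below is an illustration only and is not part of the deepadapter API: `compare_gene_sets` is a hypothetical helper, and the gene names are chosen purely as an example. It compares the two gene lists as sets, so the order does not matter, and reports which genes are missing or extra. Loading the names from genes.csv and from your own dataset is left out because the file layout is not described here.

```python
from typing import Iterable, Set, Tuple


def compare_gene_sets(
    dataset_genes: Iterable[str], model_genes: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (missing, extra) genes of the dataset relative to the model."""
    dataset, model = set(dataset_genes), set(model_genes)
    missing = model - dataset  # required by the model, absent from the dataset
    extra = dataset - model    # present in the dataset, unknown to the model
    return missing, extra


# Example gene names, for illustration only.
model_genes = ["TP53", "BRCA1", "EGFR"]
dataset_genes = ["EGFR", "TP53", "MYC"]

missing, extra = compare_gene_sets(dataset_genes, model_genes)
print("missing from dataset:", sorted(missing))
print("not used by the model:", sorted(extra))
```

Both sets being empty means the dataset and the pretrained model use the same genes.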
Double-click to open tutorials after launching jupyter notebook
- DA-Example-finetune.ipynb: the tutorial of fine-tuning DeepAdapter using the example dataset (click here to download);
- DA-YourOwnData-finetune.ipynb: the tutorial of fine-tuning DeepAdapter using your own dataset (click here to download).
After opening the tutorials, please press Shift-Enter to execute a "cell" in .ipynb.
The benchmarking methods can be found in Benchmarking-methods.ipynb (click here to download) and Benchmarking-MNN.py (click here to download).
Step 1: run methods except MNN
- installation instructions: find the installation commands in Benchmarking-methods.ipynb
- run benchmarking methods: choose the benchmarking method and run it
Step 2: run MNN with the following commands in a new shell
Before running, ensure that the code in mnn_utils/ (click here to download), which is used for loading the dataset, is at the same directory level as this tutorial. To run MNN, please create the environment with the following commands:
$ conda create -n py3.8 python=3.8
$ conda activate py3.8
$ pip install mnnpy==0.1.9.5 matplotlib tqdm umap-learn openpyxl scipy==1.5.4
$ python Benchmarking-MNN.py
Please download the open datasets from Zenodo. These datasets were collected from the literature to demonstrate multiple unwanted variations, including:
- batch datasets: LINCS-DToxS (van Hasselt et al. Nature Communications, 2020) and Quartet project (Yu, Y. et al. Nature Biotechnology, 2023).
- platform datasets: profiles from microarray (Iorio, F. et al. Cell, 2016) and RNA-seq (Ghandi, M. et al. Nature, 2019).
- purity datasets: profiles from cancer cell lines (Ghandi, M. et al. Nature, 2019) and tissues (Weinstein, J.N. et al. Nature Genetics, 2013).
After downloading, place the datasets in the data/ directory, located at the same directory level as this tutorial.
- batch datasets: data/batch_data/
- platform datasets: data/platform_data/
- purity datasets: data/purity_data/
Please find the pretrained models in the models folder (click here to download).
- batch integration: models/batch_LINCS and models/batch_Quartet
- platform integration: models/platform
- purity integration: models/purity
After downloading, place the models in the models/ directory, located at the same directory level as this tutorial.
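Before opening the tutorials, it can help to confirm that the downloaded files are where the tutorials expect them. The following sketch assumes that it is run from the directory containing data/ and models/; `missing_dirs` is a hypothetical helper, and the directory names are the ones listed above.

```python
from pathlib import Path
from typing import Iterable, List

# Directory names taken from the dataset and model lists above.
EXPECTED_DIRS = [
    "data/batch_data",
    "data/platform_data",
    "data/purity_data",
    "models/batch_LINCS",
    "models/batch_Quartet",
    "models/platform",
    "models/purity",
]


def missing_dirs(root: Path, expected: Iterable[str] = EXPECTED_DIRS) -> List[str]:
    """Return the expected sub-directories that do not exist under root."""
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Assumption: the current working directory is the tutorial directory.
    for rel in missing_dirs(Path.cwd()):
        print(f"missing: {rel}")
```

No output means every expected directory was found.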