scRank_XMBD

Introduction

We proposed a data analysis framework to prioritize prognostic-associated subpopulations based on relative expression orderings (REOs). Cell type specific gene pairs (C-GPs) were identified to evaluate prognostic value for each cell type. Individualized recurrence risk signatures at single-cell resolution were developed based on REOs. The results shown that REOs-based signatures could classify accurately among most cell subtypes. C-GPs achieves higher precision compared with four current methods. Moreover, we developed single-cell gene pair signatures (scGPSs) to predict recurrence risk for patients individually.

Docker

We provide our R package conda enviroment images in docker hub
You can use docker in Linux to run the code directly. You can start with the following command.

docker pull watermelontreesjs/scrank_xmbd
docker run --name scRank_XMBD -itd watermelontreesjs/scrank_xmbd
docker exec -it scRank_XMBD /bin/bash
cd ./root/code/
conda activate scRank_XMBD
R

You can exit docker environment by

exit

Usage

In main.r , we provide the main steps for scRank construction. Including:

data availability

The example data stored in data and result.

library(Seurat)
library(clustree)
library(ggplot2)
library(dplyr)
library(Biobase)
library(CMScaller)
library(GSVA)
library(clusterProfiler)
library(patchwork)
library(ggpubr)
library(ggtext)
library(stringr)
library(RColorBrewer)
library(pheatmap)
library(ggrepel)
library(VennDiagram)
library(cowplot)
library(rstatix)
library(glmnet)
library(kableExtra)
library(survival)
library(singleCellNet)
library(CMSclassifier)
library(igraph)
library(tableone)
library(forestplot)
library(tidyverse)
library(scibet)
library(viridis)
library(survminer)
library(maxstat)
library(ggsci)
library(ggthemes)
library(org.Hs.eg.db)
library(scibet)
library(MetBrewer)
source("./dataPreprocess.r")
source("./utils.r")
source("./visulize.r")
source("./model.r")
set.seed(619)

Identification of cell type specific gene pairs with cell subpopulation-classifying value.

Applying SingleCellNet to bulid GP classifier

data <- readRDS("./result/GSE144735_SeuratObj_anno.rds")

# parameter setting 
ncells <- 10
nTopGenes <- 100
nTopGenePairs <- 250

# use tumor and border tissue
ClassList <- c("Tumor", "Border")

exp_matrix <- Interface_Seurat_SCN(data, ClassList)[["exp_matrix"]]
anno_matrix <- Interface_Seurat_SCN(data, ClassList)[["anno_matrix"]]
training_SCN_classifier(exp_matrix,anno_matrix, "GSE144735", nTopGenes, nTopGenePairs, ncells)

get cell subtype specific gene pair(C-GPs)

# load exp and clinical data
exp_matrix_list <- load_bulk_Exp()
clinic_list <- load_bulk_Clinical()
list_ <- combine_Datasets(exp_matrix_list, clinic_list)
training_exp <- list_[[1]]
training_clinical <- list_[[2]]

# parameter setting 
ncells <- 10
deltaS <- 0.6

average_exp_of_top_genepairs <- readRDS("./result/GSE144735(ncells=10)top_genepairs.rds")
specific_genepairs_list <- get_C_Gps(average_exp_of_top_genepairs, "GSE144735", deltaS, ncells)

# visualize the celltype-specific gene pairs
visualize_celltype_specific_genepairs(average_exp_of_top_genepairs, specific_genepairs_list)

Evaluation of prognostic value for each cell type.

# parameter setting 
clinical_cutoff <- 0.1

prognostic_CGPsList <- get_C_Gps_with_prognosis(training_exp,training_clinical,specific_genepairs_list, "GSE144735", ncells, deltaS, clinical_cutoff)

# plotting the interesting cell sub-populations
CellSubTypeList <- c("CD4+ Tfh", "CD8+ GZMK+ CTL", "Regulatory T cells", "IgA+ IGLC2+ Plasma B cell", "IgA+ IGLL5+ Plasma B cell","Macro_SPP1", "Macro_C1QC", "eCAF","myCAF_DES", "Fibro_SGK1")
major_celltype_df <- load_major_celltype_name()
 
boxplotForprognotic_CGPs(prognostic_CGPsList, major_celltype_df, CellSubTypeList, "GSE144735", ncells, deltaS, clinical_cutoff)

Development of individualized recurrence risk signatures.

build clinical signature

# load exp and clinical data
exp_matrix_list <- load_bulk_Exp()
clinic_list <- load_bulk_Clinical()
raw_clinical <- load_bulk_RawClinical()
validate_id <- c(1, 3, 9)
list_ <- split_Datasets(exp_matrix_list, clinic_list, validate_id)

# prepare training and test list
all_exp <- list_[[1]]
all_clinical <- list_[[2]]
training_size <- list_[[3]]
test_size <- list_[[4]]

# parameter setting 
scRNA_name_vector <- c("GSE144735", "GSE132465", "GSE132257") # you need to run steps above for different dataset("GSE144735", "GSE132465", "GSE132257")
iteration_times <- 1000
ncells <- 10
deltaS <- 0.6

# combine stable C-GPs from different scRNA-seq dataset
stable_pairs_list <- combine_StableCGPs(scRNA_name_vector, ncells, deltaS)
celltype_list <- names(stable_pairs_list)
validation_sets <- c("GSE14333", "GSE17536", "GSE39582")

# applied lasso-cox model to bulid signature
LassoCox_signature(stable_pairs_list, celltype_list, iteration_times, all_exp, all_clinical, training_size, test_size, validation_sets)

select the iteration successful

path <- "./model/"
iteration_result <- analysis_KMs(ncells, celltype_list, path)

# Lollipop chart
Lollipop_chart(iteration_result)

multiple variables cox

celltype_list <- list.files("./model/")

for (celltype in celltype_list) {
  path_ <- paste0("./model/", celltype)
  vali_clinical <- MultiCox_Data_Transform(path_, validation_sets, exp_matrix_list, clinic_list, raw_clinical)
 
  if(vali_clinical=="next"){next;}

  MultiCox(vali_clinical)
}

In REO Stablity in Bulk , we evaluate the REO of some gene-pairs in bulk RNA-seq dataset with ground truth.

In Methods Benchmark , we evaluate performance of five methods (scRank, CibersortX, GSEA, Scissor, Uni-Markers) used for prioritizing prognostic-associated subpopulations.

How to cite

Tong M, Lin Y, Yang W, Song J, Zhang Z, Xie J, Tian J, Luo S, Liang C, Huang J, Yu R. Prioritizing prognostic-associated subpopulations and individualized recurrence risk signatures from single-cell transcriptomes of colorectal cancer. Brief Bioinform. 2023 Mar 22:bbad078. doi: 10.1093/bib/bbad078. Epub ahead of print. PMID: 36946415.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

scRank_XMBD

Introduction

Docker

Usage

data availability

Identification of cell type specific gene pairs with cell subpopulation-classifying value.

Applying SingleCellNet to bulid GP classifier

get cell subtype specific gene pair(C-GPs)

Evaluation of prognostic value for each cell type.

Development of individualized recurrence risk signatures.

build clinical signature

select the iteration successful

multiple variables cox

How to cite

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 126 Commits
Methods Benchmark		Methods Benchmark
REO Stablity in Bulk		REO Stablity in Bulk
data		data
figure		figure
model		model
result		result
.gitattributes		.gitattributes
README.md		README.md
dataPreprocess.r		dataPreprocess.r
main.r		main.r
model.r		model.r
utils.r		utils.r
visulize.r		visulize.r

xmuyulab/scRank-XMBD

Folders and files

Latest commit

History

Repository files navigation

scRank_XMBD

Introduction

Docker

Usage

data availability

Identification of cell type specific gene pairs with cell subpopulation-classifying value.

Applying SingleCellNet to bulid GP classifier

get cell subtype specific gene pair(C-GPs)

Evaluation of prognostic value for each cell type.

Development of individualized recurrence risk signatures.

build clinical signature

select the iteration successful

multiple variables cox

How to cite

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages