Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

About the Problem of "patient_id_sample" in Calculating LTSR #2

Open
kaqisekuzi opened this issue May 17, 2024 · 3 comments
Open

About the Problem of "patient_id_sample" in Calculating LTSR #2

kaqisekuzi opened this issue May 17, 2024 · 3 comments

Comments

@kaqisekuzi
Copy link

@nh3 @brianpenghe @RikLindeboom Hi,
I saw "patient_id_sample" in the code below you. May I ask if "patient_id_sample" is the sample information (such as when I sequenced and measured 6 samples, 3 blanks, and 3 treatments)? Or does it refer to "sample ID+cell ID"? Is “[email protected]$pool_name” the cell ID of a single cell ?
Thank you very much. Looking forward to your reply.

[email protected]$patient_id_sample <- paste0([email protected]$patient_id,";",[email protected]$pool_name) # Using our multiplexing approach, we have replicates for each library

Y = table([email protected]$patient_id_sample,[email protected]$cell_annot_revision_short)

mymetadata <- [email protected][!duplicated([email protected]$patient_id_sample),]

@RikLindeboom
Copy link
Collaborator

Hi @kaqisekuzi,

All PBMC samples were split into replicates and each replicate was pooled with samples from other patients. patient_id_sample is the patient identifier and the pool identifier (sequencing library name) concatenated. This way we generate replicates to account for technical variation, and when fitting each GLMM, we used these replicates to strengthen our analyses.

Hope this helps. Best wishes, Rik Lindeboom

@kaqisekuzi
Copy link
Author

Hi @RikLindeboom ,
Thank you very much for your answer. I have two more questions
Here is my data information

> table(meta$orig.ident)
SeuratProject
57181

> table(meta$sample_number)
RA23H0014 RA23H0016 RA23H0017 RA23H0018 RA23H0020 RA23H0023 RA23H0024 RA23H0025
7549 2512 3334 4469 7401 8410 2552 5856
RA23H0027 RA23H0028 RA23H0029
4316 2284 8498

> table(meta$G_age)
middle young
32439 17193

> table(meta$G_sex)
female male
29432 27749

1,My 11 samples come from 11 individuals and there are no duplicate samples. My data situation is: 11 samples, each with only one library. After integrating the 11 samples, sample information was placed in the "sample_number" column, with only two sets of biological variables placed in ”G_sex“、 “G_age”, I want to calculate the multiple changes in the proportion of cell types under different age groups (G_age: middle, young). Do you think a Poisson generalized linear mixed model can be used?

2,If two biological variables can be used as Poisson's generalized linear mixed models,I would like to observe the multiple changes in cell type ratios under age and gender. Do you think this is feasible?

Looking forward to your reply.

@kaqisekuzi
Copy link
Author

kaqisekuzi commented May 24, 2024

Hi @RikLindeboom
Here is the code I used,
There are a few small issues:

  1. Dotplot color is not obvious, using the Dotplot function you wrote, already using zlim=c (log (1/2), Log (2) has adjusted the color depth, but it is still not very noticeable.
  2. Can the Dotplot legend be drawn by modifying parameters? What parameters need to be modified

args <- commandArgs(T)
source('./RL003_function_collection_GitHub.R')
library(Seurat)
library(ggplot2)
library(lme4)
library(showtext)
showtext_auto()
library(egg)
library('numDeriv')

##load data
cv <- readRDS("/rmRP_setDims_addsubset_celltype_addGroup_addAgeSize.rds")
Y = table([email protected]$sample_number,[email protected]$celltype_add_subset)

mymetadata <- [email protected][!duplicated([email protected]$sample_number),]

metadata <- mymetadata[,c("G_sex","G_age","sample_number")]
Y <- Y[rownames(Y)%in%metadata$sample_number,]

##number of samples / number of cell types
nsamples = nrow(Y)
ncells = ncol(Y)

##repeating the meta data table by the number of cell types
metadataExp=cbind(metadata[rep(match(rownames(Y),as.character(metadata$sample_number)),ncells),],Celltype=rep(colnames(Y),rep(nsamples,ncells)))

res.prop=glmer(I(c(Y))~
(1|Celltype)
+(1|sample_number)
+(1|G_sex)
+(1|G_age)

+(1|sample_number:Celltype)
+(1|G_sex:Celltype)
+(1|G_age:Celltype)
,
family=poisson,data=metadataExp,control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=2e5)))

##standard errors of standard deviations (squre root of the variance parameters)
devfun = update(res.prop, devFunOnly=T)
pars = getME(res.prop, c("theta","fixef"))
hess = hessian(devfun, unlist(pars))
sdse.prop = data.frame(sd=unlist(pars), se=sqrt(diag(solve(hess))))

##posterior means and their standard deviations
res.prop.ranef = ranef(res.prop)

##Forest plot
rownames(sdse.prop)[rownames(sdse.prop)=="theta.Celltype.(Intercept)"] <- "Residual"
pdf("/test2_Forest_plot.pdf", height = 7, width = 10)
par(mar=c(3,6,1,1),mgp=c(1.2,0.5,0))
Forest(sdse.prop[grep("(Celltype|Residual)",rownames(sdse.prop)),],xlim=c(0,2.5))
dev.off()

postmean = cbind(
getCondVal(res.prop.ranef,"G_age:Celltype",ncells,celltype=colnames(Y))[[1]],
NA,
getCondVal(res.prop.ranef,"G_sex:Celltype",ncells,celltype=colnames(Y))[[1]]
)

lfsr = cbind(
getCondVal(res.prop.ranef,"G_age:Celltype",ncells,celltype=colnames(Y))[[2]],
NA,
getCondVal(res.prop.ranef,"G_sex:Celltype",ncells,celltype=colnames(Y))[[2]]
)

##Dotplot
postmean_oldAgeGroupsPlusSeverity <- postmean
lfsr_oldAgeGroupsPlusSeverity <- lfsr

myClust <- hclust(dist(postmean_oldAgeGroupsPlusSeverity*(1-lfsr_oldAgeGroupsPlusSeverity)),method = "complete")$order
postmean_oldAgeGroupsPlusSeverity <- postmean_oldAgeGroupsPlusSeverity[myClust,]
lfsr_oldAgeGroupsPlusSeverity <- lfsr_oldAgeGroupsPlusSeverity[myClust,]

pdf("/test2_dotplot1_plot.pdf", height = 7, width = 10)
##par(mar=c(13,12,8,10))
par(mar=c(8,15,0,10))
Dotplot(postmean_oldAgeGroupsPlusSeverity, SORT=c(F,F),zlim=c(log(1/2),log(2)),ltsr=1-lfsr_oldAgeGroupsPlusSeverity, cex=0.8,srt=90,cex.axis = .8)
dev.off()

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants