-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Micah's RNA+ADT+HTO update using UM5 and Michael porjects as templates #14
base: main
Are you sure you want to change the base?
Conversation
… and it is just parrotting cellranger outputs) and updated formatting.
…easier to switch between RNA vs RNA+HTO+ADT example datasets.
…ulti or count output structure.
…RNA` and `template_ADT` instead of the overly verbose `template_RNA+HTO+ADT`
… workflow to allow us to plot the UMAPs before and after removing doublets in the doublet section of the formatted report.
…format file with `20-APC-subset+recluster` scripts from p21242_Satish_UM5_Analysis
…yers() workflow used in Michael project.
… Also add sample metadata to seurat object.
s.split.unfilt <- lapply(names(s.split.unfilt), function(capID) { #s.split.unfilt | ||
s.split.unfilt[[capID]]@assays$RNA@layers$counts <- | ||
sc[[capID]]$adjusted_counts[,colnames(s.split.unfilt[[capID]])] | ||
return(s.split.unfilt[[capID]]) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails with Error in .subscript.2ary(x, , j, drop = TRUE) : subscript out of bounds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scRNAseq_template/analysis_scripts/01-processing-template_ADT+soupX.runfile.Rmd
Lines 513 to 517 in 8484950
# Add capID to front of cell barcode to prevent collisions when merging across clusters | |
sc <- lapply(sc, function(x) { | |
x$adjusted_counts@Dimnames[[2]] <- paste0(capID, "_", x$adjusted_counts@Dimnames[[2]]) | |
return(x) | |
}) |
This error seems to be caused by a bug in the above part where all capIDs are replaced by the last capID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below is a fix for this issue.
s.split.unfilt <- lapply(names(s.split.unfilt), function(capID) { #s.split.unfilt | |
s.split.unfilt[[capID]]@assays$RNA@layers$counts <- | |
sc[[capID]]$adjusted_counts[,colnames(s.split.unfilt[[capID]])] | |
return(s.split.unfilt[[capID]]) | |
}) | |
# Add capID to front of cell barcode to prevent collisions when merging across clusters | |
for (capID in names(sc)) { | |
sc[[capID]]$adjusted_counts@Dimnames[[2]] <- paste0(capID, "_", sc[[capID]]$adjusted_counts@Dimnames[[2]]) | |
} | |
## Initialize a flag to track if all columns are present | |
all_present <- TRUE | |
## Check for missing columns in each capID | |
for (capID in names(sc)) { | |
missing_cols <- setdiff(colnames(s.split.unfilt[[capID]]), colnames(sc[[capID]]$adjusted_counts)) | |
if (length(missing_cols) > 0) { | |
warning(paste0("The following columns are missing in adjusted_counts for ", capID, ": ", paste(missing_cols, collapse = ", "))) | |
all_present <- FALSE # Set flag to FALSE if any missing columns are found | |
} else { | |
message(paste0("All columns present for ", capID)) | |
} | |
} | |
# Replace cellranger filtered counts with soupx adjusted counts of filtered cells | |
if (all_present) { | |
## Execute lapply only if all columns are present in all capIDs | |
message("All columns are present. Replacing cellranger filtered counts with soupx adjusted counts.") | |
s.split.unfilt <- lapply(names(s.split.unfilt), function(capID) { | |
s.split.unfilt[[capID]]@assays$RNA@layers$counts <- | |
sc[[capID]]$adjusted_counts[, colnames(s.split.unfilt[[capID]])] | |
return(s.split.unfilt[[capID]]) | |
}) | |
names(s.split.unfilt) <- names(sc) | |
} else { | |
# if all columns are not present, stop the script | |
stop("Not all columns are present. The replacement was not executed.") | |
} |
|
||
write_rds(s.split.unfilt, here(paste0("saved_rds/report-", report_number), "s.split.unfilt-", getDate(), ".rds")) # for filtering plots | ||
write_rds(s.demux, here(paste0("saved_rds/report-", report_number), "s.demux-", getDate(), ".rds")) | ||
write_rds(s.filtADT.split, here(paste0("saved_rds/report-", report_number), "s.filtADT.split-", getDate(), ".rds")) | ||
write_rds(s.filtADT, here(paste0("saved_rds/report-", report_number), "s.filtADT-", getDate(), ".rds")) # For downstream processing | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are bugs in write_rds. paste0()
is missing when getDate() is used.
write_rds(s.split.unfilt, here(paste0("saved_rds/report-", report_number), "s.split.unfilt-", getDate(), ".rds")) # for filtering plots | |
write_rds(s.demux, here(paste0("saved_rds/report-", report_number), "s.demux-", getDate(), ".rds")) | |
write_rds(s.filtADT.split, here(paste0("saved_rds/report-", report_number), "s.filtADT.split-", getDate(), ".rds")) | |
write_rds(s.filtADT, here(paste0("saved_rds/report-", report_number), "s.filtADT-", getDate(), ".rds")) # For downstream processing | |
``` | |
write_rds(singler_out, here(paste0("saved_rds/report-", report_number), paste0("singler_out-", getDate(), ".rds"))) | |
write_rds(s.split.unfilt, here(paste0("saved_rds/report-", report_number), paste0("s.split.unfilt-", getDate(), ".rds"))) # for filtering plots | |
write_rds(s.demux, here(paste0("saved_rds/report-", report_number), paste0("s.demux-", getDate(), ".rds"))) | |
write_rds(s.filtADT.split, here(paste0("saved_rds/report-", report_number), paste0("s.filtADT.split-", getDate(), ".rds"))) | |
write_rds(s.filtADT, here(paste0("saved_rds/report-", report_number), paste0("s.filtADT-", getDate(), ".rds"))) # For downstream processing |
…d validate column presence before replacing counts * fix: output file paths in section 6
Note that I wasn't able to finish all the planned changes before I left Emory.
I left off partway through the integration and annotation runfile. You can use the UM5 subsetting runfiles/formatfiles and michael integration+annotation runfiles/formatfiles as references to finish that.
The next step would be to add the pseudobulk script from the Michael project. Fingers crossed this slots in seamlessly with little changes required.
Also note that there is a bug in the processing runfile at seurat object creation due to package conflicts. I think this could be resolved by updating R to the latest version with a bioconductor base docker image.
Good luck!