Apply scMET on sliding windows #5

Open
nshen7 opened this issue Jan 9, 2023 · 3 comments

Comments

@nshen7

nshen7 commented Jan 9, 2023

Dear Andreas,

I am trying to apply scMET to a large-scale scBS-seq dataset using non-overlapping sliding windows of 20 kb. I noticed that you suggested the following in the scMET paper:

In the spirit of divide-and-conquer schemes, we bypass this problem via a parallelization strategy in which we apply scMET separately to each chromosome. Feature-specific estimates obtained for each chromosome can be combined post hoc when performing HVF selection and differential analyses.

Do you have any instructions on how to combine the estimates post hoc, or any functions developed for that purpose? Thanks in advance! Looking forward to hearing back from you.

Best,
Ning

@andreaskapou
Owner

Dear Ning,

Regarding your question, here is the code used in the paper (for the Ecker2017 dataset) for:

  1. Running scMET on each chromosome: https://github.com/andreaskapou/scMET-analysis/blob/master/ecker2017/all_cells/00_run/fit_scmet_window.R
  2. Combining the estimates of each chromosome and performing HVF analysis: https://github.com/andreaskapou/scMET-analysis/blob/master/ecker2017/all_cells/01_hvf/hvf_window.Rmd

Hope this helps! Please let me know if you have any other questions.

best,
Andreas

@nshen7
Author

nshen7 commented Mar 8, 2023

Hello Andreas,

Thanks for the reply! I really appreciate the help.

I just want to make sure I understood your code correctly: I don't have to combine the results from the scmet function and then apply scmet_hvf, right? All you did was put together the HVFs from each chromosome and treat them as the set of HVFs for the entire genome?

Best,
Ning

@andreaskapou
Owner

Dear Ning,

Yes, the way we did the analysis (see https://github.com/andreaskapou/scMET-analysis/blob/cd8700dc15e6eff590eafe0864ba17b94cb4ad23/ecker2017/all_cells/01_hvf/hvf_window.Rmd#L100) was to combine the output HVFs for each chromosome and then sort by (residual) overdispersion to extract the top N HVFs.
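
For illustration only, here is a minimal sketch of that combine-and-rank step. It is written in Python with the standard library and is not the code from the linked repository (scMET itself is an R package). The `FeatureEstimate` record, its field names, the `combine_and_rank` helper, and the toy values are all hypothetical; the sketch assumes you have already exported, for each chromosome, a table with a residual overdispersion estimate and an HVF call per window.

```python
from typing import Dict, List, NamedTuple


class FeatureEstimate(NamedTuple):
    """One row of a per-chromosome result table (hypothetical layout)."""
    feature: str    # window identifier, e.g. "chr1:0-20000"
    chrom: str      # chromosome the window belongs to
    epsilon: float  # residual overdispersion estimate for the window
    is_hvf: bool    # True if the per-chromosome HVF call flagged the window


def combine_and_rank(
    per_chrom: Dict[str, List[FeatureEstimate]], top_n: int
) -> List[FeatureEstimate]:
    """Pool per-chromosome HVF calls and keep the top_n by residual overdispersion."""
    # Pool the windows flagged as HVFs on each chromosome into one list.
    pooled = [row for rows in per_chrom.values() for row in rows if row.is_hvf]
    # Rank genome-wide by residual overdispersion, largest first.
    pooled.sort(key=lambda row: row.epsilon, reverse=True)
    return pooled[:top_n]


# Toy input with made-up numbers, only to show the shape of the data.
results = {
    "chr1": [
        FeatureEstimate("chr1:0-20000", "chr1", 1.2, True),
        FeatureEstimate("chr1:20000-40000", "chr1", 0.1, False),
        FeatureEstimate("chr1:40000-60000", "chr1", 0.8, True),
    ],
    "chr2": [
        FeatureEstimate("chr2:0-20000", "chr2", 1.5, True),
        FeatureEstimate("chr2:20000-40000", "chr2", 0.9, True),
    ],
}

top_hvfs = combine_and_rank(results, top_n=3)
for row in top_hvfs:
    print(row.feature, row.epsilon)
```

Note that nothing is re-estimated here: each chromosome keeps its own fit, and only the per-feature outputs are pooled and ranked.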

One thing to note with this approach: check that the mean-overdispersion relationship is similar across chromosomes; otherwise your results might be biased towards certain chromosomes.
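
One crude way to look for such a bias, sketched here in Python with the standard library, is to compare each chromosome's share of the top N features with its share of all features, and to compare the median residual overdispersion per chromosome. This is only a rough proxy and an assumption on my part: it does not inspect the fitted mean-overdispersion trend itself, and for simplicity it ranks all pooled features rather than only those called as HVFs. The `chromosome_balance` helper, the input layout, and the toy values are hypothetical.

```python
from collections import Counter
from statistics import median
from typing import Dict, List, Tuple


def chromosome_balance(
    all_features: List[Tuple[str, float]], top_n: int
) -> Dict[str, Dict[str, float]]:
    """Summarise how the top_n features are spread over chromosomes.

    all_features holds one (chromosome, residual overdispersion) pair per
    feature, pooled over all chromosomes (hypothetical layout).
    """
    if not all_features or top_n < 1:
        raise ValueError("need at least one feature and top_n >= 1")

    # Number of features per chromosome, over the whole genome.
    total = Counter(chrom for chrom, _ in all_features)
    # Genome-wide top_n features by residual overdispersion.
    ranked = sorted(all_features, key=lambda pair: pair[1], reverse=True)[:top_n]
    top = Counter(chrom for chrom, _ in ranked)

    report = {}
    for chrom in sorted(total):
        eps = [e for c, e in all_features if c == chrom]
        report[chrom] = {
            "share_of_features": total[chrom] / len(all_features),
            "share_of_top_n": top[chrom] / len(ranked),
            "median_epsilon": median(eps),
        }
    return report


# Toy input with made-up numbers: chr2 estimates are shifted upwards.
pooled = [("chr1", e) for e in (0.1, 0.2, 0.3, 0.4)]
pooled += [("chr2", e) for e in (1.1, 1.2, 1.3, 1.4)]

balance = chromosome_balance(pooled, top_n=4)
for chrom, stats in balance.items():
    print(chrom, stats)
```

In the toy data each chromosome holds half of the features but chr2 supplies all of the top 4, and its median residual overdispersion is clearly higher. A pattern like that in real data would be a reason to look at the per-chromosome fits before trusting a genome-wide ranking.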

Best,
Andreas
