Count Matrix before binarization and artificial doublets simulation #21
Comments
Hi Yuntian,
From a quick look at the code, it only produces the binarized matrix and does not generate the counts of the instances. You may be able to modify the generateMatrix function in AMULET.py by changing this line:
to increment the counter by 1 instead of setting it to 1. However, this only counts the occurrences of overlaps > 2 within the merged union locations. If that is not what you are looking for, the Overlaps.txt file provides the coordinates, which can be traced back to the fragment/bam file where the overlap occurred. With this you will have more control over the count matrix you want to generate.
For artificial multiplets, you essentially combine the accessible chromatin profiles of two cells. Since each cell corresponds to a barcode, this is a matter of selecting 2 barcodes and combining the reads assigned to those barcodes. Essentially the steps are:
Adjusting the fragment files will be easier, as this is a matter of making a new fragment file, adding/editing the tab-delimited fields, and then indexing that file. If this is CellRanger ATAC, you can read more about the format here: https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/output/fragments
Hope this helps!
Best,
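
The barcode-combining idea described above could be sketched roughly as follows. This is only an illustration, not a script from the AMULET repository: it assumes the five-column CellRanger ATAC fragment layout (chromosome, start, end, barcode, read count), and the function names, toy barcodes, and the `ArtificialDbl1` label are all hypothetical.

```python
import io

def read_fragments(handle):
    """Parse tab-delimited fragment lines, skipping comments and blank lines."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        chrom, start, end, barcode, count = line.rstrip("\n").split("\t")[:5]
        yield chrom, int(start), int(end), barcode, int(count)

def make_artificial_doublet(fragments, barcode_a, barcode_b, new_barcode):
    """Keep only fragments from the two chosen barcodes and relabel them.

    Input order is preserved, so a position-sorted input stays sorted.
    """
    for chrom, start, end, barcode, count in fragments:
        if barcode in (barcode_a, barcode_b):
            yield chrom, start, end, new_barcode, count

def write_fragments(rows, handle):
    """Write rows back out as tab-delimited text."""
    for row in rows:
        handle.write("\t".join(map(str, row)) + "\n")

# Toy fragment file (made-up coordinates and barcodes, for illustration only).
toy = (
    "chr1\t100\t250\tAAAC-1\t1\n"
    "chr1\t120\t300\tCCCG-1\t2\n"
    "chr1\t400\t520\tGGGT-1\t1\n"
    "chr2\t50\t180\tAAAC-1\t1\n"
)

rows = list(
    make_artificial_doublet(
        read_fragments(io.StringIO(toy)), "AAAC-1", "CCCG-1", "ArtificialDbl1"
    )
)
out = io.StringIO()
write_fragments(rows, out)
doublet_text = out.getvalue()
```

With a real file, the rows would be written to disk and then compressed and indexed (commonly with `bgzip` followed by `tabix -p bed`), which corresponds to the indexing step mentioned above. If several artificial doublets are written into one file, the combined output would need to be re-sorted by chromosome and start before indexing.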
Hi Asa,
Then, if we would like to create a new "barcode", e.g. ArtificialDbl1, is the singlecell.csv file the only thing I need to change if I don't want to keep the original cell data? I was just wondering how I should change the singlecell.csv file. Could you please explain a little more about this?
In addition, I just want to confirm: when we run the multiplet detection with artificial doublets, the artificial doublets should be included in the calculation of the average of the row sums (to get the lambda of the Poisson distribution), right? In that case, the number of artificial doublets should be very small (2.5% in your paper). But if we would like to generate a higher proportion of artificial doublets, then they are not supposed to be included in the lambda calculation, am I right?
Thank you so much for your help!
Many thanks,
You'll need to update both the singlecell.csv file and the fragment file. You will need to add new rows to the singlecell.csv file with your doublet barcodes. If you are excluding the original cells/barcodes, remove them from this file as well.
For the lambda estimation, yes, the doublets were included in that calculation, since in real-case scenarios there is no way to know which cells are singlets in order to get the correct estimate. So with a larger number of doublets, the method will start losing sensitivity, because the doublets increase the background average.
Best,
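
To make the sensitivity point concrete, here is a small sketch of how a higher doublet fraction inflates the estimated lambda. It is a simplified stand-in and not AMULET's actual implementation: it assumes lambda is the plain mean of per-cell row sums and that each cell is scored with a Poisson upper-tail probability. All numbers are made up for illustration.

```python
import math

def poisson_sf(k, lam):
    """Upper-tail probability P(X >= k) for X ~ Poisson(lam)."""
    cdf_below_k = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf_below_k)

def doublet_pvalue(row_sums, k):
    """Estimate lambda as the mean row sum over ALL cells, then score a cell with count k."""
    lam = sum(row_sums) / len(row_sums)
    return lam, poisson_sf(k, lam)

# Made-up row sums: singlets have 2 overlap regions, doublets have 12.
singlets = [2] * 95
few_doublets = singlets + [12] * 5     # 5% doublets
many_doublets = singlets + [12] * 95   # 50% doublets

lam_few, p_few = doublet_pvalue(few_doublets, 12)
lam_many, p_many = doublet_pvalue(many_doublets, 12)

# The same doublet (12 overlaps) gets a larger p-value when more doublets
# are included in the background, i.e. it becomes harder to detect.
print(f"lambda={lam_few:.2f}  p={p_few:.3g}")
print(f"lambda={lam_many:.2f}  p={p_many:.3g}")
```

In this toy setup the background mean rises as the doublet fraction grows, so the same cell looks less extreme, which is the loss of sensitivity described above.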
Hi AMULET team,
Thank you so much for developing this great method first! So sorry that I am quite new to this area and have some basic questions.
I noticed that in the multiplet detection step, the matrix that is generated is already binarized. My first question is: is it possible to get the count matrix before binarization? I tried to repeat your method using the Seurat count matrix, and I would like to use the original count matrix generated by AMULET to check the concordance.
The second question is that I am still confused about how I can generate artificial doublets to assess the detection accuracy. I have seen your answer in https://github.com/UcarLab/AMULET/issues/16, but I still didn't quite get it. It would be great if you could share a sample script for this or give a specific example of the process.
Many thanks,
Yuntian.