Help understanding getcirclize #239

Closed
MeganS92 opened this issue Jun 9, 2023 · 6 comments

Comments

@MeganS92

MeganS92 commented Jun 9, 2023

Hi, thanks for this wonderful tool! I've been trying to use getCirclize to show how clonal TCR pairs are related across my clusters, but I have trouble understanding how the results are summarised in the data frame output.

This is the number of clones I have, categorized as singletons (Single), small expanded (1 < clone <= 5), or medium (5 < clone <= 20):
table(meta$cloneType2)

Medium Single  Small
    38   3696    138

If I run getCirclize on the entire dataset, using something like:
circles <- getCirclize(Map.sct, group.by = "celltype")
This is my dataframe output:

   from  to    value
1  PD1n  PD1n   2117
5  PD1n  Tfh1      1
6  Tfh1  Tfh1    767
9  PD1n  Tfh17     0
10 Tfh1  Tfh17     0
11 Tfh17 Tfh17   463
13 PD1n  Tfh2      0
14 Tfh1  Tfh2      1
15 Tfh17 Tfh2      3
16 Tfh2  Tfh2    416

Now, I only want to plot the clonotype relationship between actual sister clones, i.e. ignoring all the singletons and only looking at clones that actually expanded. So I made a subset based on cloneType2, using:

Map.sct2 <- subset(Map.sct, subset = cloneType2 %in% c("Small", "Medium"))

What I don't get is why the numbers don't add up when I re-run getCirclize on this subset of data.

circles2 <- getCirclize(Map.sct2, group.by = "celltype")

output:

   from  to    value
1  PD1n  PD1n     26
5  PD1n  Tfh1      0
6  Tfh1  Tfh1     39
9  PD1n  Tfh17     0
10 Tfh1  Tfh17     0
11 Tfh17 Tfh17     4
13 PD1n  Tfh2      0
14 Tfh1  Tfh2      0
15 Tfh17 Tfh2      0
16 Tfh2  Tfh2      8

So take for example, in the full dataset, there is 1 clonotype shared between PD1n and Tfh1 (row no.5 of circles and circles2). Why is this shared clonotype gone in my subset of data where only expanded clones are present? If this shared clonotype is part of those that are singletons, how can it be shared?

Am I understanding this function wrongly? Thanks a lot for your help!

@ncborcherding
Member

Hey Megan,

Thanks for reaching out! I'm on a phone right now, so please check code - apologies!

I think the origin of the confusion is that getCirclize() does not report the number of cells carrying shared clones, but the number of unique clones themselves.

Using your code to highlight expanded clones:

Map.sct2 <- subset(Map.sct,  subset = cloneType2 %in%  c("Small", "Medium"))
circles2 <- getCirclize(Map.sct2, group.by = "celltype")

The better comparison is looking at the individual clone numbers and not cell numbers, something like:

subset <- unique(Map.sct2[[]][, c("CTstrict", "cloneType")]) # get only unique clones
table(subset)
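
As a rough illustration of the cell-versus-clone distinction, here is a toy base R sketch. The data frame and its values are made up for the example; only the column names CTstrict and cloneType are taken from the snippet above, and nothing here is scRepertoire's own code.

```r
# Toy metadata, one row per cell (hypothetical values, not from this issue)
meta <- data.frame(
  CTstrict  = c("cloneA", "cloneA", "cloneA", "cloneB", "cloneC", "cloneC"),
  cloneType = c("Small", "Small", "Small", "Single", "Small", "Small"),
  stringsAsFactors = FALSE
)

# Tabulating the raw rows counts cells per cloneType
cells_per_type <- table(meta$cloneType)

# Dropping duplicated rows first counts each clone once per cloneType
unique_clones   <- unique(meta[, c("CTstrict", "cloneType")])
clones_per_type <- table(unique_clones$cloneType)

print(cells_per_type)
print(clones_per_type)
```

In this toy data, five cells are labelled Small but they belong to only two clones, so the two tables differ even though they describe the same cells.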

Does that make sense? I might need to modify the manual to make that more apparent. Let me know if you have any other questions or suggestions.

Thanks,
Nick

@MeganS92
Author

MeganS92 commented Jun 12, 2023

Hi Nick,

Thanks for getting back so quickly. Let's see if I get your explanation right. Sorry, please bear with me.

So using my example, in dataset circles, there are 2117 unique clones within the PD1n cluster that are not shared with any other clusters, and 1 unique clone shared between the PD1n and Tfh1 clusters. Is this right? Of the 2117 clones, most will be singletons, with a few that are clonally expanded too. For the 1 unique clone that is shared between the PD1n and Tfh1 clusters, though, I would assume that it must be a clonally expanded clone; otherwise, how can it be shared? Is this also right?

Now, in the circles2 dataset where I have only highlighted expanded clones, there are 26 unique clones within the PD1n cluster, and none that are shared between PD1n and any other clusters. So I am still confused in this instance. Even if I am looking at actual clones and not cell numbers, why is that 1 unique clone that was shared between PD1n and Tfh1 in the circles dataset not detected in my clonally expanded subset?

Does this imply that that clone is a singleton, but if it is, how can it be shared between two cells?

Sorry if I should have caught on with your previous explanation, but I still have some trouble understanding this.

Thanks so much for your help!

@ncborcherding
Member

Hey Megan,

Do not worry, this is probably the most confusing of the graphs in scRepertoire (in my opinion). I think the major confusion comes from the difference in how clones are shuffled between groupings.

Clone Shuffling Steps:

  1. combineExpression() bases cloneType (expansion) on the group.by variable, or on the sequencing run if group.by is left NULL - this is the first group comparator.
  2. circles <- getCirclize(Map.sct, group.by = "celltype") is now basing clone numbers on celltype, which is likely different than step 1.
  3. circles2 <- getCirclize(Map.sct2, group.by = "celltype") is now a subset of clones defined using step 1, but then grouped like step 2.

The filtering step may be removing clones that do not fall in the same cloneType classification assigned by the combineExpression() call, if that makes sense. Do you have multiple experimental runs in your data? What are you combining in combineExpression()? I think there is a way around this by filtering/subsetting on the basis of clones and not expansion.
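
The steps above can be sketched with a toy base R example. It is only an illustration under stated assumptions, not scRepertoire's internals: it assumes expansion was classified per sample in step 1, and the `sample` column, the `per_sample_n` and `expansion` columns, and the `shared_clones()` helper are all hypothetical names invented here.

```r
# Toy metadata, one row per cell (hypothetical values)
meta <- data.frame(
  CTstrict = c("X", "X", "Y", "Y", "Z"),
  sample   = c("S1", "S2", "S1", "S1", "S2"),
  celltype = c("PD1n", "Tfh1", "PD1n", "PD1n", "Tfh1"),
  stringsAsFactors = FALSE
)

# Step 1 (assumed): expansion is classified within each sample
meta$per_sample_n <- ave(rep(1, nrow(meta)), meta$CTstrict, meta$sample, FUN = sum)
meta$expansion    <- ifelse(meta$per_sample_n > 1, "Expanded", "Single")

# Hypothetical helper: number of unique clones found in both groups
shared_clones <- function(df, a, b) {
  length(intersect(unique(df$CTstrict[df$celltype == a]),
                   unique(df$CTstrict[df$celltype == b])))
}

# Step 2: grouping by celltype on all cells finds clone X in both clusters
shared_full <- shared_clones(meta, "PD1n", "Tfh1")

# Step 3: filtering on the per-sample label drops X, a singleton in each sample
expanded_only   <- meta[meta$expansion == "Expanded", ]
shared_expanded <- shared_clones(expanded_only, "PD1n", "Tfh1")

# Possible workaround: filter on clone counts across the whole dataset instead
clone_totals    <- table(meta$CTstrict)
expanded_ids    <- names(clone_totals)[clone_totals > 1]
by_clone        <- meta[meta$CTstrict %in% expanded_ids, ]
shared_by_clone <- shared_clones(by_clone, "PD1n", "Tfh1")
```

Here clone X has one cell in each sample, so it is labelled Single in step 1 even though it spans two clusters. Filtering on that label removes the shared clone, while filtering on dataset-wide clone counts keeps it.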

Let me know and happy to work on troubleshooting.

Thanks,
Nick

@ncborcherding
Member

I am closing this for now - please let me know if you have any additional questions or suggestions.

@christoforos-dimitropoulos

christoforos-dimitropoulos commented Jan 22, 2025

Hey Nick,

This is probably discussed elsewhere, but I was wondering: since the circle plot shows the actual number of clonotypes, I was expecting that plotting it with proportion=TRUE would show the frequencies of the clones. However, in some clusters the axis runs from 0 to 1.4, for example, rather than stopping at 1. Is there something I am missing in the calculation here? And how should one interpret these plots?
Thanks a lot in advance

@ncborcherding
Member

@christoforos-dimitropoulos

Thank you for bringing up this issue! The values exceeding 1 when proportion = TRUE are likely due to the normalization step, where the denominator (length(unique(clone.table[clone.table[,1] == pair2, cloneCall]))) might not always reflect the correct number of unique clones in pair2, especially in self-comparisons (pair1 == pair2) or edge cases where clones are shared across multiple groups.

I just pushed the fix into the dev branch and will get things tested (I need to update unit tests as well).
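
For readers wondering how a proportion can exceed 1 at all, here is a toy base R sketch of one general failure mode: a numerator and a denominator that count different things. It is an assumption for illustration only and is not claimed to be what getCirclize() did before the fix; `clones_in()` and `cells_in()` are hypothetical helpers.

```r
# Toy metadata, one row per cell (hypothetical values)
meta <- data.frame(
  CTstrict = c("c1", "c1", "c2", "c3", "c3", "c4"),
  celltype = c("A",  "B",  "A",  "B",  "B",  "B"),
  stringsAsFactors = FALSE
)

# Hypothetical helpers: unique clones in a group, and all cells in a group
clones_in <- function(df, group) unique(df$CTstrict[df$celltype == group])
cells_in  <- function(df, group) df$CTstrict[df$celltype == group]

# Denominator in the style quoted above: unique clones in group B
n_unique_B <- length(clones_in(meta, "B"))

# Consistent units: shared unique clones over unique clones, bounded by 1
shared_AB       <- length(intersect(clones_in(meta, "A"), clones_in(meta, "B")))
prop_consistent <- shared_AB / n_unique_B

# Mismatched units: cells in the numerator, unique clones in the denominator
prop_mismatched <- length(cells_in(meta, "B")) / n_unique_B
```

With consistent units the ratio cannot exceed 1, because the shared clones are a subset of the clones in B. Once the numerator counts something larger than the denominator's set, values above 1 become possible.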

Nick
