Enhancements to plot_segments function? #34

sigven · 2024-03-15T10:29:03Z

Hi,

Thanks for a very nice package with lots of great functionality for copy-number processing and visualization. This is really missing from other packages. Great work! I have been playing a bit with data from our tumor samples, and want to share some thoughts on the visualization function plot_segments:

I came across samples with high-level, focal amplifications. This category contains targets of clinical significance, for instance amplified oncogenes that can be targeted by small-molecule inhibitors. However, these candidates are inherently hard to identify in the segment plot. It would be very useful to highlight them in the plot, much as somatic driver mutations can currently be highlighted. As far as I can tell, there is currently no annotation of targets/genes and their potential druggability within the segments (cna track), but I think this would be very useful, at least in a clinical setting. On that note, it might also be warranted to allow a log2 scale on the Y-axis, to make high-level amplifications easier to see, e.g. very focal segments with a total copy number between 15 and 30.
I've experimented a bit with feeding the output of plot_segments to plotly, i.e. plotly::ggplotly(CNAqc::plot_segments(x)). If it worked optimally, this would give users a powerful opportunity to interact with the segments (and potentially annotations therein), for example by zooming in on particular chromosomes. I thought that restricting the chromosomes argument in plot_segments would give me a higher-resolution view of a particular chromosome, but it seems that the whole-genome track is still plotted in that case?
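
To illustrate the log2 idea, here is a minimal sketch on a toy segment table drawn with plain ggplot2. It does not use CNAqc internals: the column names (from, to, total) and all values are made up for illustration, and the copy numbers are transformed by hand so the sketch does not rely on any particular scale option.

```r
library(ggplot2)

# Toy segments on one chromosome. Column names and values are illustrative
# assumptions, not CNAqc's internal data format.
segs <- data.frame(
  from  = c(1, 50e6, 55e6, 56e6),
  to    = c(50e6, 55e6, 56e6, 159e6),
  total = c(2, 4, 24, 3)
)

# Transform by hand; segments with total == 0 would need special handling,
# because log2(0) is -Inf.
segs$log2_total <- log2(segs$total)
cn_breaks <- c(1, 2, 4, 8, 16, 32)

p <- ggplot(segs) +
  geom_segment(
    aes(x = from, xend = to, y = log2_total, yend = log2_total),
    colour = "steelblue"
  ) +
  scale_y_continuous(
    breaks = log2(cn_breaks),  # tick positions on the transformed axis
    labels = cn_breaks,        # tick labels shown as plain copy numbers
    limits = log2(c(1, 32))
  ) +
  labs(x = "Genomic position (bp)", y = "Total copy number (log2 scale)")

# Optional interactive version, only attempted if plotly is installed.
if (requireNamespace("plotly", quietly = TRUE)) {
  p_interactive <- plotly::ggplotly(p)
}
```

On this axis the 24-copy focal segment sits about 3.6 units above the diploid baseline instead of 22, so it stays visible without flattening the rest of the track.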

Happy to get your input on these matters:-)

kind regards,
Sigve


caravagn · 2024-03-27T16:16:11Z

Hi @sigven, thanks for your comments:

I came across that need as well, but I could not find a specific solution. By the way, did you notice the black/gray dots here, above the horizontal line at 0? Each dot is a breakpoint, and it is black if the segment's Major + minor allele count exceeds the range of the Y-axis. So there you can see that there are certainly focal amplifications on chromosome 7, some with more than 6 copies... Can you suggest something better? Maybe some ad-hoc visualisation?
The plotly thing seems useful; can you give me some code to reproduce it? I can definitely change the behaviour of the chromosome argument; I never thought it made much sense the way it is done now.
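
The black-dot rule described in this comment (a segment whose Major + minor count exceeds the Y-axis range) can be sketched in base R as a simple flag on a segment table. This is only an illustration, not CNAqc's implementation: the table, the Y-axis cap of 6 and the 3 Mb "focal" length cutoff are all assumptions chosen for the example.

```r
# Toy allele-specific segments; values and column names are illustrative.
segs <- data.frame(
  chr   = c("chr7", "chr7", "chr7", "chr8"),
  from  = c(1, 54e6, 55e6, 1),
  to    = c(54e6, 55e6, 56e6, 145e6),
  Major = c(2, 3, 20, 1),
  minor = c(1, 1, 2, 1)
)

y_cap        <- 6    # assumed upper limit of the plotted Y-axis
max_focal_bp <- 3e6  # assumed length cutoff for calling a segment "focal"

segs$total     <- segs$Major + segs$minor
segs$length_bp <- segs$to - segs$from

# Off-scale: total copy number cannot be drawn within the Y-axis range.
segs$off_scale <- segs$total > y_cap
# Candidate focal amplification: off-scale and short.
segs$focal_amp <- segs$off_scale & segs$length_bp <= max_focal_bp

focal_hits <- segs[segs$focal_amp, c("chr", "from", "to", "total")]
print(focal_hits)
```

A table like focal_hits could then drive a separate annotation layer or a zoomed-in panel, rather than relying on dot colour alone.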

caravagn · 2024-04-11T22:23:50Z

Hi @sigven, are you still interested in this?

sigven · 2024-04-11T22:38:53Z

Hi @caravagn, yeah indeed! Sorry for not responding, I hope to get back to you shortly.

sigven · 2024-06-05T21:55:57Z

Hi @caravagn (cc @pdiakumis),

Truly sorry for the very late response and follow-up on this matter. We have been working hard to use your package inside PCGR, where we want to plot allele-specific copy number input from tumor samples for clinical interpretation. Notably, PCGR relies upon the Conda framework, and we have struggled to make CNAqc install and work in our production environment, particularly due to its multiple dependencies that are currently hosted in multiple GitHub repositories (meaning they are non-Condarized). Again, we think the plot is very elegant and the functionality is very good, and would be great to expand upon, but currently we are unable to make it work properly in our production environment.

kind regards,
Sigve

caravagn · 2024-06-08T13:25:38Z

Hi @sigven, this is clearer now; I would love to see it used in your tool. Maybe we can help with this. Let me talk with the guys and see whether we can improve the dependencies (some might not be strictly required).

Could you please clarify in which form you would like the package to be installable? Is the problem that the package depends on too many other packages for your internal use?

@luca-dex I might need your help on this.

caravagn · 2024-06-08T13:40:10Z

@luca-dex Regarding dependencies, this is what I think:

crayon, we use it for colours; maybe we can do everything with cli
ggpubr, we use it to assemble plots; maybe the same plots can be obtained with a facet strategy and plain ggplot
ggrepel, required to annotate driver events and fragmentation via geom_text_repel, but not necessary if we switch to a simpler type of annotation in baseline ggplot
BMix, required for peak detection
vcfR, used only in the vignettes
clisymbols, required with cli
RColorBrewer,
VariantAnnotation, GenomicRanges, AnnotationDbi, required to annotate driver events
easypar, required for automatic loops
gtools, required only for its mixedsort
akima, required for auto-tolerance
cowplot, redundant with ggpubr
ggsci, we use it for colours; maybe we can just use explicit colour codes
peakPick, required for peaks
readr, required for VEP inputs
ComplexHeatmap, required for cohort plots

pdiakumis · 2024-06-08T14:14:35Z

Hello Giulio,
Thanks for getting back to us, but please don't spend time on refactoring your codebase for our needs. As Sigve mentioned, we are only interested in the CNA segment plotting functionality that comes with CNAqc (which we could probably handle with a custom ggplot2 + plotly solution if we wanted some interactivity).
The way we handle dependencies in PCGR is via conda (https://docs.conda.io/en/latest/) which is a great tool for managing R/Py/Perl pkgs and other bioinformatics tools.
R pkgs can be converted into conda pkgs fairly easily if they are on CRAN/Bioconductor. Things get a bit trickier if they're only on GitHub, especially if their dependencies are also GitHub-only.
From CNAqc's DESCRIPTION I can see multiple caravagn/caravagnalab dependencies (which is perfectly fine, I do that a lot too!). But these would first need to be tagged and released, and then condarised themselves. There are probably other dependencies in there that would also need to be condarised. This is an arduous process and frankly not worth the effort for one plot.
We are currently handling the installation of CNAqc into our conda environments simply by using remotes::install_github("caravagnalab/CNAqc"). This adds an extra 15-20 min to our GitHub Actions pipeline, but works fine on Linux (though not on macOS). It's okay for now, but we will be looking into replacing this at some point.
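
To make the GitHub-only dependency point concrete, here is a small base R sketch that reads a DESCRIPTION-style file and lists which imported packages are pulled from a Remotes field. The DESCRIPTION content below is a made-up toy (toyPkg, someuser/pkgA, someorg/pkgB are hypothetical names), not CNAqc's actual file, and the parsing is deliberately simplified: it ignores version constraints and prefixes such as github::.

```r
# Toy DESCRIPTION content; package and repository names are hypothetical.
desc_text <- c(
  "Package: toyPkg",
  "Imports:",
  "    ggplot2,",
  "    pkgA,",
  "    pkgB",
  "Remotes:",
  "    someuser/pkgA,",
  "    someorg/pkgB"
)

con  <- textConnection(desc_text)
desc <- read.dcf(con, fields = c("Imports", "Remotes"))
close(con)

# Split a comma-separated DESCRIPTION field into trimmed entries.
split_field <- function(x) {
  if (is.na(x)) return(character(0))
  trimws(strsplit(x, ",")[[1]])
}

imports <- split_field(desc[1, "Imports"])
remotes <- split_field(desc[1, "Remotes"])

# Imports whose source is a GitHub repository listed under Remotes; each of
# these would need a tagged release and its own conda recipe.
github_only <- imports[imports %in% basename(remotes)]
print(github_only)
```

Running the same check recursively on each GitHub-hosted dependency would give a rough count of how many conda recipes the full tree requires.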

caravagn · 2024-06-10T08:22:01Z

Hi @pdiakumis, I see, thanks for clarifying, I can then close this.

Then maybe the best for you is to re-use our code from

Best,

Giulio

caravagn self-assigned this Mar 27, 2024

caravagn added the enhancement New feature or request label Mar 27, 2024

caravagn assigned luca-dex Jun 8, 2024

caravagn added the help wanted Extra attention is needed label Jun 8, 2024

caravagn closed this as completed Jun 10, 2024
