Enhancements to plot_segments function? #34

Closed
sigven opened this issue Mar 15, 2024 · 8 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@sigven

sigven commented Mar 15, 2024

Hi,

Thanks for a very nice package with lots of great functionality for copy-number processing and visualization. This is really missing from other packages. Great work! I have been playing a bit with data from our tumor samples, and want to share some thoughts on the visualization function plot_segments:

  • I came across samples with high-level, focal amplifications. This category contains targets of clinical significance, for instance amplified oncogenes that can be targeted by small-molecule inhibitors. However, it is inherently hard to identify these candidates in the segment plot. It would be very useful to highlight them in the plot, much as somatic driver mutations can currently be highlighted. As far as I can tell, there is currently no annotation of targets/genes and their potential druggability within the segments (cna track), but I think this would be very useful, at least in a clinical setting. On that note, it might also be warranted to allow a log2-scaled Y-axis, to make high-level amplifications easier to see, e.g. very focal segments with a total copy number between 15 and 30?
  • I've experimented a bit with feeding the output of plot_segments to plotly, i.e. plotly::ggplotly(CNAqc::plot_segments(x)). If this worked optimally, it would give users a powerful opportunity to interact with the segments (and potentially the annotations therein), e.g. by zooming in on particular chromosomes. I thought that restricting the chromosomes argument in plot_segments would give me a higher-resolution view of a particular chromosome, but it seems that the whole-genome track is still plotted in that case?
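
To make the two suggestions above concrete, here is a minimal sketch of a single-chromosome segment plot with a log2-scaled Y-axis, optionally passed to plotly. It does not call plot_segments: the segments data frame and its column names (chr, from, to, total) are made up for illustration and are not CNAqc's internal format, and the copy-number values are invented.

```r
library(ggplot2)

# Hypothetical segment table (illustrative values and column names only).
segments <- data.frame(
  chr   = c("chr7", "chr7", "chr7", "chr8"),
  from  = c(1e6, 54e6, 56e6, 1e6),
  to    = c(54e6, 56e6, 159e6, 145e6),
  total = c(2, 24, 3, 2)  # total copy number (Major + minor)
)

# Keep one chromosome so the x-axis spans only that chromosome.
chr7 <- subset(segments, chr == "chr7")

# Horizontal line per segment; log2 Y-axis keeps the 24-copy focal
# segment visible next to diploid segments. Segments with a total copy
# number of 0 cannot be drawn on a log scale and would need special handling.
p <- ggplot(chr7) +
  geom_segment(aes(x = from, xend = to, y = total, yend = total)) +
  scale_y_continuous(
    trans  = "log2",
    breaks = c(1, 2, 4, 8, 16, 32),
    limits = c(1, 32)
  ) +
  labs(x = "Position on chr7 (bp)", y = "Total copy number (log2 scale)")

# Optional interactivity, only if plotly is installed.
if (requireNamespace("plotly", quietly = TRUE)) {
  interactive_plot <- plotly::ggplotly(p)
}
```

Because the data are subset before plotting, zooming in plotly then operates on a single chromosome rather than on the whole-genome track.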

Happy to get your input on these matters:-)

kind regards,
Sigve

@caravagn
Collaborator

Hi @sigven, thanks for your comments:

  • I came across that need as well, but I could not find a special solution. Btw, did you notice the black/gray dots here above the horizontal line at 0? Each dot is a breakpoint, and it's black if the segment has a Major+minor allele count that exceeds the Y-axis range of the plot. So there you see that there are certainly focal amplifications on chromosome 7, some with more than 6 copies... can you suggest something better? Maybe some ad-hoc visualisation?

  • the plotly thing seems useful, can you give me some code to reproduce it? I can definitely change the behaviour of the chromosomes argument; I never thought it made much sense as it is done now.

@caravagn caravagn self-assigned this Mar 27, 2024
@caravagn caravagn added the enhancement New feature or request label Mar 27, 2024
@caravagn
Collaborator

Hi @sigven, are you still interested in this?

@sigven
Author

sigven commented Apr 11, 2024

Hi @caravagn, yeah indeed! Sorry for not responding, I hope to get back to you shortly.

@sigven
Author

sigven commented Jun 5, 2024

Hi @caravagn (cc @pdiakumis),

Truly sorry for the very late response and follow-up on this matter. We have been working hard to use your package inside PCGR, where we want to plot allele-specific copy number input from tumor samples for clinical interpretation. Notably, PCGR relies upon the Conda framework, and we have struggled to make CNAqc install and work in our production environment, particularly due to its multiple dependencies that are currently hosted in multiple GitHub repositories (meaning they are non-Condarized). Again, we think the plot is very elegant and the functionality is very good, and would be great to expand upon, but currently we are unable to make it work properly in our production environment.

kind regards,
Sigve

@caravagn
Collaborator

caravagn commented Jun 8, 2024

Hi @sigven, this is clearer now; I would love to see it used in your tool. Maybe we can help with this. Let me talk with the guys and see if we can improve the dependencies (some might not be strictly required).

Could you please clarify in which form you would like the package to be installable? Is the problem that the package depends on too many other packages for your internal use?

@luca-dex I might need your help on this.

@caravagn caravagn added the help wanted Extra attention is needed label Jun 8, 2024
@caravagn
Collaborator

caravagn commented Jun 8, 2024

@luca-dex Regarding dependencies I think that

  • crayon, we use it for colours, maybe we can do everything with cli
  • ggpubr, we use it to assemble plots, maybe the same plots can be obtained using a facet strategy and just ggplot
  • ggrepel, required to annotate driver events and fragmentation via geom_text_repel, but not necessary if we switch to a simpler type of annotation in baseline ggplot
  • BMix, required for peak detection
  • vcfR, used only in the vignettes
  • clisymbols, required with cli
  • RColorBrewer,
  • VariantAnnotation, GenomicRanges, AnnotationDbi, required to annotate driver events
  • easypar, required for automatic loops
  • gtools, required only for its mixedsort
  • akima, required for auto-tolerance
  • cowplot, redundant with ggpubr
  • ggsci, we use it for colours, maybe we can just use explicit colour codes
  • peakPick, required for peaks
  • readr, required for VEP inputs
  • ComplexHeatmap, required for cohort plots
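
As an illustration of how one of the lighter dependencies in this list might be dropped, the sketch below orders chromosome labels in base R without gtools. It assumes mixedsort is only used on labels such as "chr1" ... "chr22", "chrX", "chrY" (not confirmed in this thread); sort_chromosomes is a hypothetical helper, not part of CNAqc, and it is not a general replacement for gtools::mixedsort.

```r
# Hypothetical helper: natural ordering of chromosome labels in base R.
sort_chromosomes <- function(chrs) {
  # Strip an optional "chr" prefix to get the bare label ("1", "10", "X", ...).
  key <- sub("^chr", "", chrs)

  # Numeric labels get their numeric value; non-numeric ones (X, Y, ...)
  # get Inf so that they sort after all autosomes.
  rank <- suppressWarnings(as.numeric(key))
  rank[is.na(rank)] <- Inf

  # Order by numeric rank first, then alphabetically among non-numeric labels.
  chrs[order(rank, key)]
}

sort_chromosomes(c("chr10", "chrX", "chr2", "chr1", "chrY"))
```

A plain sort() would place "chr10" before "chr2"; ranking by the numeric part avoids that.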

@pdiakumis

Hello Giulio,
Thanks for getting back to us, but please don't spend time on refactoring your codebase for our needs. As Sigve mentioned, we are only interested in the CNA segment plotting functionality that comes with CNAqc (which we could probably handle with a custom ggplot2 + plotly solution if we wanted some interactivity).
The way we handle dependencies in PCGR is via conda (https://docs.conda.io/en/latest/) which is a great tool for managing R/Py/Perl pkgs and other bioinformatics tools.
R pkgs can be converted into conda pkgs fairly easily if they are on CRAN/Bioconductor. Things get a bit trickier if they're only on GitHub, especially if their dependencies are also GitHub-only.
From CNAqc's DESCRIPTION I can see multiple caravagn/caravagnalab dependencies (which is perfectly fine, I do that a lot too!). But these would be required to be firstly tagged and released, and then condarised themselves. There are probably other dependencies in there that would also need to be condarised. This is an arduous process and frankly not worth the effort for one plot.
We are currently handling the installation of CNAqc into our conda environments simply by using remotes::install_github("caravagnalab/CNAqc"). This adds an extra 15-20min to our GitHub Actions pipeline, but works fine on Linux (but not MacOS). It's okay for now, but we will be looking into replacing this at some point.

@caravagn
Collaborator

Hi @pdiakumis, I see, thanks for clarifying, I can then close this.

Then maybe the best for you is to re-use our code from

Best,

Giulio
