New tree plot #103

Open
wants to merge 21 commits into base: dev

Conversation

FernandoDuarteF
Collaborator

FernandoDuarteF commented Dec 9, 2024

No Issue associated to this PR.

Changes

  1. Added a new tree plot method. Both Quast and BUSCO outputs are now plotted next to the tree.
  • The script outputs the "complete" tree (BUSCO+Quast), the "complete no legend" tree, the tree alone, and the legend alone, all in PDF format. Only the "complete" and "complete no legend" trees are published in the results folder. It might be a good idea to also output separate plots for BUSCO and Quast.
  2. Changed the container for this plot, as the new plotting method requires ggtreeExtra from Bioconductor.

Comments

For now, the new container is in my quay.io repository, as I have not been able to push changes to our docker repository (see comment below).

There was a cropping issue with the Quast barplots in the new plotting method. I managed to fix it using an extremely inelegant solution. It works, at least for now.

The bin folder probably needs cleaning up, but I have left the old scripts there for now.

@FernandoDuarteF
Collaborator Author

This is the error message I get when trying to push changes to .github/workflows/push-genomeqc_tree.yml to the docker-build repository:

% git push origin new_genomeqc_tree
Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 11 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (7/7), 613 bytes | 613.00 KiB/s, done.
Total 7 (delta 4), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (4/4), completed with 4 local objects.
To https://github.com/Eco-Flow/docker-build.git
 ! [remote rejected] new_genomeqc_tree -> new_genomeqc_tree (refusing to allow a Personal Access Token to create or update workflow `.github/workflows/push-genomeqc_tree.yml` without `workflow` scope)

@FernandoDuarteF
Collaborator Author

FernandoDuarteF commented Dec 11, 2024

I had to change the gene_overlaps.R script so that genes with undefined strands ("." and "?") in the GFF file are ignored.

It's impossible to know on which strand these genes overlap with others, so in this context they are not very informative.

I'll open a new issue, as there might be a better way to tackle this.
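
The strand filtering described above can be sketched as follows. This is a minimal illustration in Python, not the actual gene_overlaps.R code: the function name `genes_with_defined_strand` and the example records are hypothetical, and it assumes a standard nine-column, tab-separated GFF file in which the strand is the seventh column.

```python
# Strand values that give no usable orientation (GFF column 7).
UNDEFINED_STRANDS = {".", "?"}

def genes_with_defined_strand(gff_lines):
    """Return (seqid, start, end, strand) for gene features on '+' or '-'.

    Hypothetical helper: it only illustrates the filtering idea and is
    not taken from gene_overlaps.R.
    """
    genes = []
    for line in gff_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Keep only well-formed gene records.
        if len(fields) != 9 or fields[2] != "gene":
            continue
        strand = fields[6]
        # Ignore genes whose strand is undefined.
        if strand in UNDEFINED_STRANDS:
            continue
        genes.append((fields[0], int(fields[3]), int(fields[4]), strand))
    return genes

# Made-up example records, for illustration only.
example = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t100\t500\t.\t+\t.\tID=g1",
    "chr1\tsrc\tgene\t300\t800\t.\t.\t.\tID=g2",
    "chr1\tsrc\tgene\t900\t1200\t.\t?\t.\tID=g3",
    "chr1\tsrc\tgene\t1000\t1500\t.\t-\t.\tID=g4",
]
kept = genes_with_defined_strand(example)
```

Dropping these records before computing overlaps keeps the per-strand counts unambiguous, at the cost of silently excluding some genes, which is why a dedicated issue may be the better place to decide on the final behaviour.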

@FernandoDuarteF
Collaborator Author

FernandoDuarteF commented Jan 13, 2025

Recent changes

  1. The ggtreeExtra package is no longer necessary. Everything is done using patchwork and ggtree.
  2. Added gene stats, sequence number and genome size plots.
  3. Updated the Dockerfile (still in my own repo).

Comments

I need to include some parameters in the config related to the tree summary plot, such as which stats to plot and in which order, the width of the bars, font size, x axis limits of the tree plot, etc. But first I need to add these as script arguments.

There is a module for getting the number of chromosomes using grep, which I will remove eventually in favour of the number of contigs stat from Quast.

I also added a formula to avoid the cropping of tree tip labels.
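
For context, a grep-based sequence count like the one mentioned above typically amounts to counting FASTA header lines. The sketch below shows that idea in Python. It is an assumption about what the module does, not a copy of it, and the function name `count_fasta_sequences` and the example records are hypothetical.

```python
def count_fasta_sequences(lines):
    """Count FASTA records by counting header lines that start with '>'.

    Assumed equivalent of a grep-style count. Note that it counts every
    sequence in the file, whether it is a chromosome, scaffold or contig.
    """
    return sum(1 for line in lines if line.startswith(">"))

# Made-up FASTA content, for illustration only.
fasta_example = [
    ">chr1",
    "ACGTACGT",
    ">chr2",
    "GGCCTTAA",
    "ACGT",
    ">scaffold_17",
    "TTTT",
]
n_sequences = count_fasta_sequences(fasta_example)
```

Because such a count cannot tell chromosomes from unplaced scaffolds, taking the number of contigs from Quast instead avoids maintaining a second, potentially inconsistent, count.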

@FernandoDuarteF
Copy link
Collaborator Author

FernandoDuarteF commented Jan 22, 2025

I added the --tree_scale argument to the tree plot script to control the truncation of the tree tip labels (the default value is 0.0005; increase it slightly if labels are truncated), and --bar_width and --rad_width to control the width of the bar plots and the radius of the pie charts, respectively.

@FernandoDuarteF
Collaborator Author

The tube map is in low resolution; I will replace it later.

@FernandoDuarteF
Collaborator Author

> I added the --tree_scale argument to the tree plot script to control the truncation of the tree tip labels (the default value is 0.0005; increase it slightly if labels are truncated), and --bar_width and --rad_width to control the width of the bar plots and the radius of the pie charts, respectively.

I thought I had added this, but I actually had not. I have now added the --tree_scale parameter, which can be passed as a command-line argument.

I still need to add the rest (--bar_width and --rad_width).
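
As a rough illustration of the intended command-line interface, the sketch below declares the three options in Python with `argparse`. It is not the plotting script itself. Only the option names and the 0.0005 default for --tree_scale come from this thread. The `build_parser` function, the help texts, and the `None` placeholders for the two options that are still pending are assumptions.

```python
import argparse

def build_parser():
    """Build an illustrative parser for the tree plot options."""
    parser = argparse.ArgumentParser(
        description="Tree summary plot options (illustrative sketch)"
    )
    # Default taken from the comment above; increase slightly if tip
    # labels are truncated.
    parser.add_argument(
        "--tree_scale", type=float, default=0.0005,
        help="controls truncation of the tree tip labels",
    )
    # Not implemented yet according to the thread; no default is stated,
    # so None is used as a placeholder.
    parser.add_argument(
        "--bar_width", type=float, default=None,
        help="width of the bar plots (placeholder)",
    )
    parser.add_argument(
        "--rad_width", type=float, default=None,
        help="radius of the pie charts (placeholder)",
    )
    return parser

# Example: override only the tree scale.
args = build_parser().parse_args(["--tree_scale", "0.001"])
defaults = build_parser().parse_args([])
```

Exposing each value as a script argument first makes it straightforward to map the same names onto pipeline config parameters later.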
