database choice #49

tijeco · 2022-05-27T21:49:05Z

Database choice

The current version does not appear to allow choosing which databases to download or use, as mentioned in #48, so I have drafted this PR.

My goal is for the databases in use to be declared explicitly, so each database gets its own flag in the arguments.json file for both run and download. For example, to download only mitelman you could run fusion_report download --use_mitelman true database_output, and the same works for any combination of --use_cosmic, --use_mitelman, --use_fusiongdb and --use_fusiongdb2. For my purposes, I only wanted mitelman, fusiongdb and fusiongdb2, so I can now run the following:

fusion_report download --use_mitelman true --use_fusiongdb true --use_fusiongdb2 true fusionreport_download
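To illustrate the idea, here is a minimal sketch of how per-database flags of this shape could be parsed. It is not the code in this PR: the helper names (str_to_bool, build_parser, selected_databases) are hypothetical, it uses plain argparse rather than arguments.json, and the default of false for every flag is an assumption.

```python
import argparse
from typing import List

# Database names taken from the flags described above.
DATABASES = ("cosmic", "mitelman", "fusiongdb", "fusiongdb2")


def str_to_bool(value: str) -> bool:
    """Convert a command-line string such as 'true' or 'false' to a bool."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one --use_<database> flag per database."""
    parser = argparse.ArgumentParser(prog="fusion_report download")
    parser.add_argument("output", help="directory to download databases into")
    for name in DATABASES:
        # Assumption: a database is skipped unless its flag is set to true.
        parser.add_argument(f"--use_{name}", type=str_to_bool, default=False,
                            metavar="BOOL", help=f"download/use {name}")
    return parser


def selected_databases(args: argparse.Namespace) -> List[str]:
    """Return the databases whose flag was set to true, in a fixed order."""
    return [name for name in DATABASES if getattr(args, f"use_{name}")]


# Parse the same arguments as the download command above.
example_args = build_parser().parse_args([
    "--use_mitelman", "true",
    "--use_fusiongdb", "true",
    "--use_fusiongdb2", "true",
    "fusionreport_download",
])
print(selected_databases(example_args))
```

Keeping the database names in one tuple means the download and run subcommands could share the same flag definitions, which is one way to keep the two in sync.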

Further, to run on the test dataset, I can use the following:

fusion_report run "test" test_output fusionreport_download/ \
  --use_mitelman true --use_fusiongdb true --use_fusiongdb2 true \
  --arriba tests/test_data/arriba.tsv \
  --dragen tests/test_data/dragen.tsv \
  --ericscript tests/test_data/ericscript.tsv \
  --fusioncatcher tests/test_data/fusioncatcher.txt \
  --pizzly tests/test_data/pizzly.tsv \
  --squid tests/test_data/squid.txt \
  --starfusion tests/test_data/starfusion.tsv \
  --jaffa tests/test_data/jaffa.csv \
  --allow-multiple-gene-symbols

I also included a conda environment file. I used it with a Jupyter notebook to play around with the library, so I thought it might be useful to others as well.

Let me know what you think.

Checklist

Specify in detail the change
Make sure to follow guidelines in docs when adding database/tool
Documentation in docs is updated
CHANGELOG.md is updated
README is updated

matq007 · 2022-06-09T09:33:57Z

Hi @tijeco, I understand why you would prefer to choose your own databases. We initially designed it around using all of them, because otherwise you have to specify a weight for each database separately.
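To make the weight concern concrete, here is a small sketch of one possible way to handle it: normalising over only the selected databases, so that dropping one does not require re-specifying the others. This is an illustration only, not fusion_report's actual scoring. The function name database_score and the weight values are hypothetical.

```python
from typing import Dict, Iterable


def database_score(weights: Dict[str, float],
                   selected: Iterable[str],
                   hits: Iterable[str]) -> float:
    """Weighted share of the selected databases that report a fusion.

    weights  -- hypothetical per-database weights
    selected -- databases enabled for this run
    hits     -- databases in which the fusion was found
    """
    selected_set = set(selected)
    total = sum(weights[name] for name in selected_set)
    if total == 0:
        return 0.0
    # Only hits from enabled databases count towards the score.
    found = sum(weights[name] for name in set(hits) & selected_set)
    return found / total


# Hypothetical weights, chosen only for illustration.
example_weights = {"cosmic": 40, "mitelman": 30, "fusiongdb": 20, "fusiongdb2": 10}

# All four databases enabled, fusion found only in mitelman.
print(database_score(example_weights, example_weights.keys(), ["mitelman"]))

# cosmic disabled: the same hit now carries a larger share of the total.
print(database_score(example_weights,
                     ["mitelman", "fusiongdb", "fusiongdb2"],
                     ["mitelman"]))
```

The two calls show the trade-off: with renormalisation the same hit scores differently depending on which databases are enabled, so scores from runs with different selections are not directly comparable.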

rannick · 2024-10-04T07:54:30Z

This is really nice. I implemented similar options, but from the opposite direction: you specify the databases you don't want instead of the ones you want.

rannick · 2024-10-04T07:54:33Z

#77

Jeff Cole added 3 commits May 27, 2022 15:35

conda environment file

dc9f8da

add arguments to choose database to download

199035d

run with database choice

58c46a0

rannick closed this Oct 4, 2024
