As discussed in #170, I suggest dropping the --ncbi_assembly_metadata requirement and obtaining the relevant assemblies directly from their assembly IDs.
Below I provide a python3 script that can download assemblies by accession (using NCBI's API).
At the moment the script downloads fasta, gff and gbff (for my convenience); this can be adjusted to bacass's needs.
Possible interfaces:
python import of the download function
cli (2 modes for my convenience, can be easily simplified)
dependencies:
urllib3 (could probably be rewritten to use the standard-library urllib)
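To illustrate the urllib option, here is a minimal sketch that uses only the standard library. It is not the attached script: the endpoint and the `include_annotation_type` values are my assumptions about the NCBI Datasets v2 REST API and should be checked against the current API documentation, and the function names `build_download_url` and `download_zip` are hypothetical.

```python
import shutil
import urllib.parse
import urllib.request

# Assumed endpoint of the NCBI Datasets v2 REST API (verify before relying on it).
BASE_URL = "https://api.ncbi.nlm.nih.gov/datasets/v2/genome/accession"


def build_download_url(accessions, file_types=("GENOME_FASTA", "GENOME_GFF", "GENOME_GBFF")):
    """Build the download URL for a list of assembly accessions."""
    # Accessions go into the path as a comma-separated list.
    joined = urllib.parse.quote(",".join(accessions), safe=",")
    # Each requested file type becomes a repeated query parameter.
    query = urllib.parse.urlencode([("include_annotation_type", t) for t in file_types])
    return f"{BASE_URL}/{joined}/download?{query}"


def download_zip(accessions, out_path, timeout=60):
    """Stream the zip archive for the given accessions to out_path."""
    request = urllib.request.Request(
        build_download_url(accessions),
        headers={"Accept": "application/zip"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response, open(out_path, "wb") as handle:
        shutil.copyfileobj(response, handle)
    return out_path
```

Calling `download_zip(["GCF_000005845.2"], "assemblies.zip")` would fetch one archive, which still has to be unpacked into the target dir.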
result:
obtained data accessible under [target dir]/ncbi_dataset/data/ (can be adjusted at the cost of added complexity)
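For downstream steps, the extracted files can be collected per assembly with a few lines. This sketch assumes the usual NCBI Datasets package layout, with one sub-directory per accession under ncbi_dataset/data/. The helper name `collect_assembly_files` is hypothetical.

```python
from pathlib import Path


def collect_assembly_files(target_dir):
    """Map each accession directory to the sorted names of the files it contains."""
    data_dir = Path(target_dir) / "ncbi_dataset" / "data"
    files = {}
    # Only sub-directories are treated as assemblies; top-level report files are skipped.
    for sub in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        files[sub.name] = sorted(f.name for f in sub.iterdir() if f.is_file())
    return files
```

The returned dictionary can then be filtered by suffix (for example `.fna`, `.gff`, `.gbff`) to pick the files a given process needs.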
known limitations:
works well with a low to medium number of assemblies; personally, I would keep this under 50 per request. Larger but still reasonable numbers can be handled by the script, which iterates over the accessions in chunks. If we were aiming for much larger numbers (thousands), then I would advise using ncbi-datasets-cli (available e.g. from conda).
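The chunked iteration mentioned above can be sketched as follows. This is an illustration, not the code from the attached script: the name `iter_chunks` is hypothetical, and the default of 50 simply mirrors the per-request limit suggested here.

```python
from itertools import islice


def iter_chunks(accessions, size=50):
    """Yield successive lists of at most `size` accessions."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(accessions)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Each chunk would then be sent as one request, so a failure affects at most `size` assemblies and can be retried on its own.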
If you are interested in this, could you please check whether urllib3 is available in bacass's python3?
dbkref.py.zip