Input query was recognized as database #113

cv-1993 · 2025-01-24T07:48:00Z

Hi Jaebeom Kim,

Thanks for developing Metabuli.

I've just installed Metabuli version 1.0.9.2 in a Mamba environment and downloaded the pre-built RefSeq 224 database.
However, when I tried to run Metabuli, the second read file seemed to be recognized as the database. Please see the details below:

Here is the command: metabuli classify rawReads/L017_1.fastq.gz rawReads/L017_2.fastq.gz databases/Metabuli/refseq224 metabuli --threads 90 --max-ram 700

Here is the output:

MMseqs Version:                                         1.0.9.2
Threads                                                 90
Sequencing type                                         2
Min. sequence similarity score                          0
Min. query coverage                                     0
Min. num. of cons. matches for non-euk. classification  4
Min. num. of cons. matches for euk. classification      9
Min. score for species- or lower-level classification.  0
Allowed extra Hamming distance                          0
Directory where the taxonomy dump files are stored
Mask residues                                           0
Mask residues probability                               0.9
RAM usage in GiB                                        700
Number of matches per query k-mer.                      4
Accession-level DB build/search                         0
Best * --tie-ratio is considered as a tie               0.95
Not storing k-mer's redundancy. Keep it as 1.           0
Print lineage information                               0

Input database "rawReads/L017_2.fastq.gz" has the wrong type (Generic)
Allowed input:
- Directory

Thanks for your help in solving this issue.

jaebeom-kim · 2025-01-24T07:49:51Z

Please add a job name after the output directory (metabuli in your case). I hope that solves the problem.

cv-1993 · 2025-01-24T09:28:10Z

@jaebeom-kim Thanks for the quick response.
I've added a job ID: metabuli classify rawReads/L017_1.fastq.gz rawReads/L017_2.fastq.gz databases/Metabuli/refseq224 metabuli --threads 90 --max-ram 700

However, I still got the same error:
`MMseqs Version: 1.0.9.2
Threads 90
Sequencing type 2
Min. sequence similarity score 0
Min. query coverage 0
Min. num. of cons. matches for non-euk. classification 4
Min. num. of cons. matches for euk. classification 9
Min. score for species- or lower-level classification. 0
Allowed extra Hamming distance 0
Directory where the taxonomy dump files are stored
Mask residues 0
Mask residues probability 0.9
RAM usage in GiB 700
Number of matches per query k-mer. 4
Accession-level DB build/search 0
Best * --tie-ratio is considered as a tie 0.95
Not storing k-mer's redundancy. Keep it as 1. 0
Print lineage information 0

Input database "./rawReads/L017_2.fastq.gz" has the wrong type (Generic)
Allowed input:
- Directory`

jaebeom-kim · 2025-01-24T09:48:06Z

Your command is the same as before. Could you add one more argument, such as L017, between "metabuli" and "--threads"?
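The suggestion above can be sketched as a short shell snippet. It assumes the positional order used elsewhere in this thread (read 1, read 2, database directory, output directory, job ID). The variable names are illustrative only, and the snippet only prints the command as a dry run rather than executing it.

```shell
# Paired-end classification as used in this thread takes five positional
# arguments: <read 1> <read 2> <database dir> <output dir> <job ID>.
# Variable names below are illustrative, not part of Metabuli.
READ1=rawReads/L017_1.fastq.gz
READ2=rawReads/L017_2.fastq.gz
DB=databases/Metabuli/refseq224
OUTDIR=metabuli   # output directory
JOBID=L017        # the argument that was missing in the original command

CMD="metabuli classify $READ1 $READ2 $DB $OUTDIR $JOBID --threads 90 --max-ram 700"

# Dry run: print the assembled command so the argument order can be checked.
echo "$CMD"
```

With only four positional arguments, every argument appears to shift one slot, which would explain why the second read file was reported as the input database.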

cv-1993 · 2025-01-27T02:04:03Z

@jaebeom-kim
Thanks. The run started after adding the job ID.
However, the run crashed after ~10 min without any error message. It seems to require more computational power.
We have 750 GB of RAM and 96 CPU cores. Isn't that sufficient?

Thanks.

jaebeom-kim · 2025-01-27T02:47:17Z

Your computing resources are sufficient. Could you send me all the printed logs? They would help me find where the error occurs.
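One way to collect the complete output, even if the terminal closes, is to write it to a file while the job runs. The following is a minimal sketch using standard shell redirection and `tee`; the helper name `run_logged` and the log file name are hypothetical and not part of Metabuli.

```shell
# Hypothetical helper (not part of Metabuli): run any command while saving
# everything it prints, both stdout and stderr, to a log file.
run_logged() {
    log_file=$1
    shift
    # 2>&1 merges stderr into stdout; tee writes the stream to the file
    # and still shows it on the terminal.
    "$@" 2>&1 | tee "$log_file"
}

# Example usage (command copied from this thread, log file name is arbitrary):
# run_logged metabuli_L017.log metabuli classify rawReads/L017_1.fastq.gz rawReads/L017_2.fastq.gz databases/Metabuli/refseq224 metabuli L017 --threads 90 --max-ram 700
```

Because the log is written to disk as the job runs, the last lines before a crash should still be available in the file afterwards.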

cv-1993 · 2025-01-27T05:37:43Z

@jaebeom-kim

classify ./rawReads/L017_1.fastq.gz ./rawReads/L017_2.fastq.gz databases/Metabuli/refseq224 metabuli L017 --threads 90 --max-ram 700

Metabuli Version (commit):                              1.0.9.2
Threads                                                 90
Sequencing type                                         2
Min. sequence similarity score                          0
Min. query coverage                                     0
Min. num. of cons. matches for non-euk. classification  4
Min. num. of cons. matches for euk. classification      9
Min. score for species- or lower-level classification.  0
Allowed extra Hamming distance                          0
Directory where the taxonomy dump files are stored
Mask residues                                           0
Mask residues probability                               0.9
RAM usage in GiB                                        700
Number of matches per query k-mer.                      4
Accession-level DB build/search                         0
Best * --tie-ratio is considered as a tie               0.95
Not storing k-mer's redundancy. Keep it as 1.           0
Print lineage information                               0

DB name: refseq_release_224_prokaryote_virus_human
DB creation date: 2024-9-27
Loading the list for taxonomy IDs ... Done
Indexing query file ...Done
Total number of sequences: 67123704
Total read length: 20271358608nt
Extracting query metamers ...
Time spent for metamer extraction: 28
Sorting query metamer list ...
Time spent for sorting query metamer list: 14
Comparing query and reference metamers...
--match-per-kmer was increased to 8 and searching again...
Extracting query metamers ...
Time spent for metamer extraction: 17
Sorting query metamer list ...
Time spent for sorting query metamer list: 10
Comparing query and reference metamers...
--match-per-kmer was increased to 12 and searching again...
Extracting query metamers ...
Time spent for metamer extraction: 11
Sorting query metamer list ...
Time spent for sorting query metamer list: 6
Comparing query and reference metamers...

At this point, the run crashed and the terminal shut down.
