Bowtie2 (bad greedy) and read multimapping for metagenomes #9

TealFurnholm · 2020-11-12T16:44:03Z

This tool is designed for metagenomic NGS data sets, and Bowtie2 is not (the author says so in the manual).

BT2 is a greedy matcher, so very low %ID matches will still be reported. It was designed for aligning reads to a single eukaryote genome, with splicing and SNPs, and is optimized to find the first best hit.
The BT2 manual is incomprehensible when you try to adjust the settings to something similar to a %ID cutoff.
75% of all bacterial genes are orthologs (I curated all 529 million genes in NCBI+JGI, so I know), and metagenomes are replete with many strains from the same species, so you have to multimap the reads.

Instead of Bowtie2, I mapped the reads with BBMap at 95% ID, both with and without multimapping, and ran MEC on each mapping
(since I still haven't gotten DeepMAsED to work: see my other reported issue).
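
For anyone who wants to apply a comparable %ID cutoff to alignments from any mapper, here is a minimal sketch. It assumes the mapper writes the standard SAM `NM` tag (edit distance) and defines identity as matching columns over all aligned columns (M/=/X/I/D in the CIGAR string). That is only one of several possible definitions and may not match BBMap's own identity calculation. The function names and the 95% default are mine, for illustration only.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(cigar, nm):
    """Identity over aligned columns (M, =, X, I, D), treating NM as the
    number of mismatched, inserted, and deleted bases."""
    ops = CIGAR_RE.findall(cigar)
    columns = sum(int(length) for length, op in ops if op in "M=XID")
    if columns == 0:
        return 0.0
    return 100.0 * (columns - nm) / columns


def passes_identity(sam_line, min_pct=95.0):
    """Return True if a SAM alignment record meets the identity cutoff.
    Unmapped records and records without an NM tag are rejected."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    cigar = fields[5]
    if flag & 0x4 or cigar == "*":  # 0x4 = read unmapped
        return False
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            return percent_identity(cigar, int(tag[5:])) >= min_pct
    return False  # no NM tag, identity cannot be computed this way
```

Soft-clipped bases (S) are ignored here, so a heavily clipped read can still pass; whether that is acceptable depends on the analysis.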

without multimapping (each read randomly assigned to one of its best hits): #split_num 741
with read multimapping: #split_num 5322

You can see there is quite a difference (roughly sevenfold), and I think you'll find the same with DeepMAsED.
Orthology/multimapping is a major issue. You may find quite a bit more than 1% chimeras!
Please trust me and check it out.

I plan to check results with MetaQuast to see which is correct, once I get DeepMAsED working.

The REAL question is what will your software do if I feed it a bam file with multimapped reads?
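
To check whether a given alignment file actually contains multimapped reads, something like the sketch below could be used. It counts alignment records per read (per mate for paired data) in SAM text and reports how many reads have more than one. It assumes the mapper writes each extra hit as its own record, which depends on the mapper and its settings. Supplementary records (flag 0x800) are skipped because they are pieces of a split alignment, not alternative placements. The function name is hypothetical.

```python
from collections import Counter


def count_multimapped(sam_lines):
    """Return (multimapped_reads, mapped_reads) for an iterable of SAM lines."""
    hits = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x4 or flag & 0x800:  # unmapped or supplementary
            continue
        # Keep the two mates of a pair separate (0x40 = first, 0x80 = last).
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        hits[(fields[0], mate)] += 1
    multimapped = sum(1 for n in hits.values() if n > 1)
    return multimapped, len(hits)
```

A BAM file can be converted to SAM text with `samtools view -h` and the lines passed to this function.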

Best,
Teal

nick-youngblut · 2020-11-12T18:41:40Z

We have not assessed the influence of the read mapper on the accuracy of DeepMAsED, either for training or for applying an existing model trained on Bowtie2 mappings of reads to contigs. It would be interesting to see how the read mapper affects misassembly identification. The challenge is deciding what ground truth to use for "true" mappings, and working out how mapping accuracy affects the identification of "true" misassemblies.
