gapseq usage #1

Open
arianccbasile opened this issue Jun 30, 2022 · 3 comments

@arianccbasile

Hello :)
Very nice paper, congratulations.
I have just one question. In the manuscript you stated that you used gapseq to "facilitate the taxonomic and functional identification of core and rare species from shotgun metagenomic sequencing data and reference genomes with omission rates". Could you explain what exactly you did? It is not completely clear to me.

Best,
Arianna Basile

@mmpust
Owner

mmpust commented Jun 30, 2022

Dear Arianna,
Thank you for your question, and I am sorry the sentence was not clear enough!
We used the raspir tool to filter microbial taxa in our metagenomic patient samples after reference-based alignment. Raspir also allowed us to include low-abundance taxa, which are otherwise typically discarded. After this filtering step, we selected the reference genomes of the remaining species and ran those reference sequences through the gapseq pipeline, which allowed us to investigate the functional and metabolic repertoire of these reference genomes as well.
Does this answer your question?
Best wishes,
Marie
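
A minimal sketch of the hand-off described above, not the authors' actual script: it takes the species that survived filtering, looks up their reference genomes, and builds one `gapseq find -p all` command per genome. The species names, the `reference_fasta` mapping, and the file paths are hypothetical placeholders, and nothing is assumed here about the format of raspir's output.

```python
def build_gapseq_commands(retained_species, reference_fasta):
    """Return one gapseq pathway-prediction command per retained species.

    retained_species: iterable of species names kept after filtering.
    reference_fasta:  dict mapping species name -> path to its reference
                      genome (hypothetical; supply your own mapping).
    """
    commands = []
    for species in sorted(set(retained_species)):
        fasta = reference_fasta.get(species)
        if fasta is None:
            # No reference genome known for this species: skip it
            # rather than guess a path.
            continue
        commands.append(["gapseq", "find", "-p", "all", fasta])
    return commands


if __name__ == "__main__":
    # Hypothetical example input.
    kept = ["Species B", "Species A", "Species C"]
    refs = {
        "Species A": "refs/species_a.fna",
        "Species B": "refs/species_b.fna",
    }
    for cmd in build_gapseq_commands(kept, refs):
        print(" ".join(cmd))
```

The commands are returned as argument lists so they can be inspected or logged first and then passed to `subprocess.run` unchanged.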

@arianccbasile
Author

arianccbasile commented Jul 1, 2022

Thank you, Marie, for your quick answer. It is clearer now.
Just to be sure: you used gapseq to find pathways and transporters, but without running any simulations on the resulting metabolic reconstructions, right?

Best,
Arianna

@mmpust
Owner

mmpust commented Jul 1, 2022

Dear Arianna,
Yes, exactly. We performed neither metabolic reconstruction nor flux balance analysis. But gapseq calculates a completeness score per pathway per reference genome, which is very powerful. We simply extracted this scoring information for the downstream data analysis. That said, it would have been more powerful to assemble bacterial genomes directly from the patient samples and then do the metabolic modeling with those MAGs. But that approach would have captured only the high-abundance taxa, and in this publication we were particularly interested in the contribution of the low-abundance taxa.
Best wishes,
Marie
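
To illustrate the score extraction mentioned above, here is a minimal sketch that collects per-pathway completeness values from one table per genome into a pathway-by-genome matrix. It assumes tab-separated tables with `ID` and `Completeness` columns and optional `#` comment lines; check these assumptions against the files your gapseq version actually writes. The function names and the example data are hypothetical.

```python
import csv
import io


def read_completeness(handle, id_col="ID", score_col="Completeness"):
    """Map pathway ID -> completeness (float) from one tab-separated table."""
    # Skip comment lines (assumed to start with '#') before the header row.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    return {row[id_col]: float(row[score_col]) for row in reader}


def completeness_matrix(tables):
    """Combine {genome: {pathway: score}} into (pathways, genomes, matrix).

    Rows are pathways, columns are genomes. A pathway missing from a
    genome's table is filled with 0.0 (a design choice of this sketch).
    """
    genomes = sorted(tables)
    pathways = sorted({p for scores in tables.values() for p in scores})
    matrix = [[tables[g].get(p, 0.0) for g in genomes] for p in pathways]
    return pathways, genomes, matrix


if __name__ == "__main__":
    # Hypothetical table content standing in for a real output file.
    example = (
        "# hypothetical comment line\n"
        "ID\tName\tCompleteness\n"
        "PWY-A\tpathway A\t100\n"
        "PWY-B\tpathway B\t66.7\n"
    )
    scores = read_completeness(io.StringIO(example))
    print(completeness_matrix({"genome_1": scores}))
```

Filling missing pathways with 0.0 keeps the matrix rectangular for downstream statistics; use a missing-value marker instead if "not reported" should stay distinct from "0% complete".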
