Parameters for large plant genome #202

Open
danielle-khost opened this issue Oct 31, 2023 · 1 comment

Comments

@danielle-khost

Hi there!
I am looking for feedback on using wfmash to align some large (>6 Gb) plant genomes that are extremely repeat-dense, at around 90% repeats. I was able to produce an alignment between species (though it took quite a bit of RAM, around 500 GB), but even for my best assembly I could only align around 600 Mb of sequence. Are there any parameters or settings you could recommend to improve the alignment?
So far I have experimented with the -s and -c settings. Oddly, increasing -s resulted in less aligned sequence than the default. Increasing the -c parameter to 100k seemed to give me my best alignment, though I am not sure whether setting it that high is a good idea.

Thanks for any help you can give! These genomes are quite cumbersome, so it might be that this alignment is the best I can manage :)

-Danielle

@ekg
Collaborator

ekg commented Oct 31, 2023

Hi @danielle-khost! What alignment parameters did you give wfmash?

I would try this: wfmash -p 70 -m to get mappings only, without base-level alignment. See how much sequence length they cover. You can take a look at these and then either re-run with different settings or feed them back into wfmash with -i.

Once the mappings make sense and are sufficiently sensitive, you can go on to align them.
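
One way to check "how much length they cover" is to sum the non-overlapping bases spanned by the mappings. The sketch below is only an illustration, not part of wfmash: it assumes the mapping output is standard tab-separated PAF (query name, start and end in columns 1, 3 and 4; target name, start and end in columns 6, 8 and 9), and the function name `covered_bases` is made up for this example.

```python
from collections import defaultdict


def covered_bases(paf_lines, use_target=False):
    """Return the number of distinct bases covered by PAF mappings.

    Overlapping mappings on the same sequence are merged first, so a
    repeat-rich region hit by many mappings is only counted once.
    Assumes standard PAF column order (an assumption about the input).
    """
    intervals = defaultdict(list)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        if use_target:
            name, start, end = fields[5], int(fields[7]), int(fields[8])
        else:
            name, start, end = fields[0], int(fields[2]), int(fields[3])
        intervals[name].append((start, end))

    total = 0
    for spans in intervals.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start > cur_end:
                # Gap before the next mapping: close the current block.
                total += cur_end - cur_start
                cur_start, cur_end = start, end
            else:
                # Overlapping or adjacent mapping: extend the current block.
                cur_end = max(cur_end, end)
        total += cur_end - cur_start
    return total
```

If the mappings were redirected to a file (say mappings.paf, a hypothetical name), passing the open file handle to `covered_bases` gives the covered query length, and `use_target=True` gives the covered target length. Comparing either number with the genome size shows whether the mapping step, rather than the alignment step, is what limits the result to about 600 Mb.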
