Better global alignment when aligning in other direction #78
Comments
But I do agree that they are not real optimal alignment results.
Thanks for the quick follow-up. By eye it still seems that the reverse with seeding is the best. I understand that the difference between the different scenarios is explainable by the order, and it's not reflected in the current scoring scheme. I'm still not sure I understand the difference when aligning the different strands -- shouldn't the order be unaffected? In any case, it does seem like there is room for future improvements -- we are happy to test any ideas you come up with!
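The difference between the strands arises because abpoa always puts gaps in the left-most position. After reverse complementing, the left-most position corresponds to the right-most position on the original strand, so the gaps land in different places.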
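To make the strand effect concrete, here is a minimal toy sketch in Python. It is not abpoa's algorithm: `leftmost_single_gap` is a hypothetical helper that only handles a query missing exactly one base, and the sequences are made up. It places the gap at the left-most position that still gives all matches, first on the forward strand and then on the reverse complement, and maps the second result back to forward coordinates.

```python
# Toy illustration only: NOT abpoa's implementation. All names and
# sequences here are hypothetical and chosen for the example.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "-": "-"}


def revcomp(seq):
    """Reverse complement a sequence, keeping gap characters as gaps."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def leftmost_single_gap(ref, query):
    """Align `query` to `ref` when `query` is `ref` minus exactly one base.

    Among all equally good placements, return the aligned query row with
    the gap at the left-most position.
    """
    if len(ref) != len(query) + 1:
        raise ValueError("query must be exactly one base shorter than ref")
    for i in range(len(ref)):
        if ref[:i] + ref[i + 1:] == query:
            return query[:i] + "-" + query[i:]
    raise ValueError("query is not ref with a single base deleted")


ref, query = "CAAAT", "CAAT"  # homopolymer run of A, one A missing

# Align on the forward strand.
forward_row = leftmost_single_gap(ref, query)

# Align on the reverse strand, then map the row back to forward coordinates.
reverse_row = leftmost_single_gap(revcomp(ref), revcomp(query))
reverse_row_as_forward = revcomp(reverse_row)

print(ref)                     # CAAAT
print(forward_row)             # C-AAT  (gap at the left of the A run)
print(reverse_row_as_forward)  # CAA-T  (gap at the right of the A run)
```

Both rows contain four matches and one single-base gap, so a scoring scheme that only counts matches and gaps cannot tell them apart. Only the tie-breaking rule decides where the gap goes.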
Hi again @glennhickey, I am adding a parameter to abpoa to handle this type of homopolymer sequence alignment and produce better-looking alignments. I also ran into this type of issue during my own project. Cheers!
Apart from what I've shared in GitHub issues here, we have a couple of small simulated tests in Cactus https://github.com/UCSantaCruzComputationalGenomicsLab/cactusTestData where we use mafComparator to compare to the provided truth MAF. If you end up making a significant change, I should be able to plug it into Cactus and, say, make a new pangenome graph and measure some stats on that...
Are there any specific score parameters or a scoring matrix I should use for this data?
Hmm, that data's probably best with the (current) default cactus scores, ex
Hi @glennhickey, I did come up with some heuristics for improved graph alignment. It would be great if you could share some comments on them.
@adamnovak has been picking through the HPRC graph and finding suspect alignments. Here is one from CHM13#0#chr3:164033777-164033842. If I align it with abpoa in its forward orientation (on chm13), I get (where gaps are transparent):
But if I reverse complement I get
which seems much cleaner -- i.e., there is only 1 gap per row except in 3 cases, where the gap seems more properly placed on the right.
Are these alignments somehow scoring equivalently, even though by eye one seems much better? If not, is this expected or a bug? Do you have any suggestions on how it could be improved?
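As a rough way to put a number on the "by eye" comparison, one could count gap runs per row in each alignment. The sketch below is only an illustration: `gap_runs` and `rows_with_multiple_gaps` are hypothetical helpers, and the rows are made-up toy data, not the alignments from this issue.

```python
import re


def gap_runs(row):
    """Count maximal runs of '-' in one aligned row."""
    return len(re.findall(r"-+", row))


def rows_with_multiple_gaps(msa):
    """Return how many rows contain more than one gap run."""
    return sum(1 for row in msa if gap_runs(row) > 1)


# Hypothetical toy rows. Both sets hold the same ungapped sequences.
scattered = ["AC-GT-TA", "ACAGTTTA"]
compact = ["ACGT--TA", "ACAGTTTA"]

print([gap_runs(r) for r in scattered])  # [2, 0]
print([gap_runs(r) for r in compact])    # [1, 0]
```

Fewer rows with multiple gap runs would match the "1 gap per row" impression. The two toy sets have the same number of gap columns but a different number of gap opens, so whether they score equally depends on whether the scoring scheme penalizes gap opens separately.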
All the information to reproduce is here (see README for command lines):
https://public.gi.ucsc.edu/~hickey/debug/abpoa_direction_oct17_2024/
Thanks so much!