feature request - add types to VCF #1604

Open
colindaven opened this issue Feb 3, 2025 · 2 comments
Comments

@colindaven

colindaven commented Feb 3, 2025

Hi,

thanks for this nice tool (I'm using minigraph-cactus).

To aid filtering and further downstream use, would it be possible to add types to the VCF?

Example: an INFO field (SVTYPE=DEL, SVTYPE=INS, SVTYPE=INV or SVTYPE=BND).

While the above request might be reasonable, I assume adding SVLEN would be impossible due to the different lengths of the nodes in the different pangenomic samples. But let's say we only want SVs and no SNPs in a filtered VCF: how would you go about this when all these tags are missing? Could SVLEN be computed for tiny SNPs/indels only? Or could SVLEN 60 be added to all larger indels?

That all sounds very hacky, but just thinking out loud. Perhaps there are better solutions for filtering VCFs based on sequence length in Ref and Alt sequence columns that I am not aware of, but this seems like a common task.

Thanks

@glennhickey
Collaborator

I'm with you. I think there is a vg issue or two on this (it's vg deconstruct that would need to be updated). It's slightly complicated by the fact that SVs often come out of the graph looking rather messy, i.e. they include both indels and SNPs in the same variant. But we should make more of an effort to write these tags, especially in the more unambiguous cases.
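To make "unambiguous" a bit more concrete, here is a rough Python sketch of one possible rule. It is not what vg deconstruct does. It assumes an allele counts as a clean insertion or deletion only when the shorter allele is a prefix or suffix of the longer one, so no substitutions are mixed in. The function name classify_allele and the 50 bp cutoff are hypothetical choices for illustration.

```python
def classify_allele(ref, alt, min_sv_len=50):
    """Return (svtype, svlen) for a clean large indel, else None.

    Hypothetical rule: the allele is "clean" only if the shorter
    sequence is a prefix or suffix of the longer one.
    svlen is the signed difference len(alt) - len(ref).
    """
    ref, alt = ref.upper(), alt.upper()
    svlen = len(alt) - len(ref)

    # Too short to count as an SV under the assumed cutoff.
    if abs(svlen) < min_sv_len:
        return None

    # Clean insertion: ALT contains all of REF at one end.
    if svlen > 0 and (alt.startswith(ref) or alt.endswith(ref)):
        return ("INS", svlen)

    # Clean deletion: REF contains all of ALT at one end.
    if svlen < 0 and (ref.startswith(alt) or ref.endswith(alt)):
        return ("DEL", svlen)

    # Mixed indel/SNP content, symbolic alleles, etc.: leave untagged.
    return None
```

Anything that returns None would simply go without SVTYPE/SVLEN tags. The sign convention for SVLEN is also an assumption here, so check it against the VCF specification version you are targeting.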

In the meantime, you can still use bcftools view to filter. For example, if you want SV insertions you might try something like bcftools view -i 'STRLEN(ALT) - STRLEN(REF) >= 50' etc.
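If bcftools is not at hand, the same length comparison can be sketched in plain Python over uncompressed VCF text. This is only an approximation of the expression above: the function name keep_large_insertions is made up, and it assumes a record should be kept when any one of its ALT alleles is at least min_diff bases longer than REF, which may not match how bcftools treats multi-allelic records.

```python
def keep_large_insertions(vcf_lines, min_diff=50):
    """Yield header lines plus records with a large insertion allele.

    vcf_lines: iterable of tab-separated VCF text lines.
    A record is kept if len(ALT) - len(REF) >= min_diff for
    at least one ALT allele (assumption, see lead-in).
    """
    for line in vcf_lines:
        # Pass header and meta lines through unchanged.
        if line.startswith("#"):
            yield line
            continue

        fields = line.rstrip("\n").split("\t")
        ref = fields[3]              # REF column
        alts = fields[4].split(",")  # ALT column, possibly multi-allelic

        if any(len(alt) - len(ref) >= min_diff for alt in alts):
            yield line
```

Flipping the comparison to len(ref) - len(alt) >= min_diff would select large deletions instead.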

@colindaven
Author

Excellent - thanks Glenn for the info and the workaround.
