Hi,
thanks for this nice tool (I'm using minigraph-cactus).
To aid filtering and further downstream use, would it be possible to add types to the VCF?
Example: INFO field (SVTYPE=DEL or SVTYPE=INS or SVTYPE=INV or SVTYPE=BND)
While the above request might be reasonable, I assume adding SVLEN would be impossible due to the different lengths of the nodes in the different pangenomic samples. But let's say we only want SVs and no SNPs in a filtered VCF: how would you go about this despite missing all these tags? Could SVLEN be computed for tiny SNPs/indels only? Or could SVLEN 60 be added to all larger indels?
That all sounds very hacky, but just thinking out loud. Perhaps there are better solutions for filtering VCFs based on sequence length in Ref and Alt sequence columns that I am not aware of, but this seems like a common task.
Thanks
I'm with you. I think there is a vg issue or two on this (it's vg deconstruct that would need to be updated). It's slightly complicated by the fact that SVs often come out of the graph looking rather messy, i.e. they include both indels and SNPs in the same variant. But we should make more of an effort to write these tags, especially in the more unambiguous cases.
In the meantime, you can still use bcftools view to filter. For example, if you want SV insertions you might try something like bcftools view -i 'STRLEN(ALT) - STRLEN(REF) >= 50' etc.
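If you would rather do the same length-based filtering in a script, here is a rough sketch in plain Python. It is not how vg deconstruct works and it writes no tags; it only compares REF and ALT string lengths, as in the bcftools expression above. The function names `sv_class` and `filter_sv_lines` and the `min_len=50` default are made up for illustration, and symbolic or missing ALT alleles are simply skipped.

```python
def sv_class(ref, alt, min_len=50):
    """Classify one ALT allele by its length difference from REF.

    Returns ("INS", diff) or ("DEL", diff) when |len(alt) - len(ref)| >= min_len,
    otherwise None. Symbolic (<...>), missing (.) and spanning (*) alleles
    are not classified.
    """
    if alt.startswith("<") or alt in ("*", "."):
        return None
    diff = len(alt) - len(ref)
    if diff >= min_len:
        return ("INS", diff)
    if diff <= -min_len:
        return ("DEL", diff)
    return None


def filter_sv_lines(lines, min_len=50):
    """Yield header lines plus records where at least one ALT passes sv_class."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4].split(",")  # VCF columns 4 (REF) and 5 (ALT)
        if any(sv_class(ref, alt, min_len) for alt in alts):
            yield line


if __name__ == "__main__":
    import sys

    # Usage (uncompressed VCF on stdin): python filter_sv.py < in.vcf > sv_only.vcf
    for kept in filter_sv_lines(sys.stdin):
        sys.stdout.write(kept)
```

Because it only looks at the length difference, a messy site that mixes SNPs and indels is kept or dropped purely on net length, and inversions or other balanced rearrangements are not detected at all.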