bedtools intersect with -g option: chromosomes without regions cause intersections being missed #1111

Open
ChrissiKalk97 opened this issue Jan 9, 2025 · 0 comments


I used bedtools intersect on two genomic BED files. Because the files were very large, I had to use the -sorted and -g options.
I created the -g file as follows:

cut -f1,2 /path/Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.fai \
    > $out_path/genome_chrom_ordering.txt

I sorted both BED files using the same chromosome ordering and intersected them as follows:
bedtools intersect \
    -s \
    -wao \
    -a $bedfile1 \
    -b $bedfile2 \
    -sorted \
    -g $out_path/genome_chrom_ordering.txt \
    | sort -nr -k13,13 \
    > $interesectBedfile
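
For anyone trying to reproduce this, the sorting step mentioned above could look roughly like the sketch below. It is not the exact command I ran. It is a plain awk/sort approach that orders BED records by the chromosome order of the -g file and then by start and end coordinate. The function name `sort_bed_by_genome` is made up for illustration, and the sketch assumes tab-separated files without header lines.

```shell
# Hypothetical helper: sort a BED file by the chromosome order given in a
# genome file (chromosome name in column 1), then by start and end coordinate.
# Records on chromosomes that are not listed in the genome file are dropped.
sort_bed_by_genome() {
    genome_file=$1
    bed_file=$2
    awk -F'\t' -v OFS='\t' '
        pass == 1    { rank[$1] = FNR; next }   # chromosome -> line number in genome file
        ($1 in rank) { print rank[$1], $0 }     # prefix each BED record with that rank
    ' pass=1 "$genome_file" pass=2 "$bed_file" |
        sort -t "$(printf '\t')" -k1,1n -k3,3n -k4,4n |   # rank, then start, then end
        cut -f2-                                          # strip the rank column again
}

# Usage (paths as in the commands above):
# sort_bed_by_genome "$out_path/genome_chrom_ordering.txt" "$bedfile1" > sorted_a.bed
```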

This seemed to work, but after a while I noticed that I could not find any intersections on the X and Y chromosomes, which was unexpected. It turned out that one of my BED files had no regions on the MT chromosome, which comes before X in the chromosome ordering. As a result, all intersections on chromosomes after MT were missed. I then added dummy (null) regions for MT to the BED file that lacked them, and now the intersections on X and Y are found as well. I do not think this behavior is intended, so I am reporting my observations here.
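
To check whether a file is affected, something like the following sketch lists the chromosomes that appear in the -g file but have no record in a given BED file. The function name `missing_chroms` and the toy data are made up for illustration. The sketch only compares the first column of each file and assumes tab-separated input without header lines.

```shell
# Hypothetical helper: print chromosomes named in the genome file that have
# no record in the BED file, in genome-file order.
missing_chroms() {
    genome_file=$1
    bed_file=$2
    awk -F'\t' '
        pass == 1    { seen[$1] = 1; next }   # chromosomes present in the BED file
        !($1 in seen) { print $1 }            # genome chromosomes never seen
    ' pass=1 "$bed_file" pass=2 "$genome_file"
}

# Toy example: the BED file has no records on MT or Y.
tmpdir=$(mktemp -d)
printf '1\t1000\nMT\t500\nX\t800\nY\t600\n' > "$tmpdir/genome.txt"
printf '1\t10\t20\tfeatA\t0\t+\nX\t30\t40\tfeatB\t0\t+\n' > "$tmpdir/b.bed"
missing_chroms "$tmpdir/genome.txt" "$tmpdir/b.bed"   # prints MT and Y, one per line
rm -rf "$tmpdir"
```

Running this on both input files before the intersect shows which chromosomes would need the dummy-region workaround described above.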
