Unable to annotate variants #28

Closed
pbousquets opened this issue Oct 6, 2023 · 4 comments
Labels
bug Something isn't working

Comments

pbousquets commented Oct 6, 2023

Hello,

I've been using CNAqc for a while with no problems. However, I just tested the function annotate_variants for the first time and found that it crashes before it finishes.

There seems to be an issue with reference patches, though I made sure I have no patches in my input. They might be appearing from an internal object generated during the annotation.
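
For readers who want to run the same kind of check on their own input, here is a minimal sketch in base R. The table and its column names (`chr`, `from`, `to`) are assumptions made for illustration and are not taken from the CNAqc input specification; UCSC-style chromosome names are assumed as well.

```r
# Hypothetical mutation table; column names and values are illustrative only.
mutations <- data.frame(
  chr  = c("chr1", "chr2", "chr1_GL383518v1_alt", "chrX"),
  from = c(1000L, 2000L, 300L, 4000L),
  to   = c(1000L, 2000L, 300L, 4000L),
  stringsAsFactors = FALSE
)

# Canonical human chromosomes in UCSC naming (chr1..chr22, chrX, chrY).
canonical <- paste0("chr", c(1:22, "X", "Y"))

# Rows located on alt contigs, patches, or any other non-canonical sequence.
non_canonical <- mutations[!mutations$chr %in% canonical, ]

# Table restricted to canonical chromosomes.
mutations_clean <- mutations[mutations$chr %in% canonical, ]

print(non_canonical$chr)
```

If `non_canonical` is empty for the real input, any alt contigs named in later warnings are more likely to come from the annotation databases than from the mutation table.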

This issue was encountered while analyzing two WGS (60x) tumor-normal pairs, and it occurs with both of them:

✔ Preparing mutations ... done
'select()' returned many:1 mapping between keys and columns
'select()' returned many:1 mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
'select()' returned many:1 mapping between keys and columns
'select()' returned many:1 mapping between keys and columns
'select()' returned many:1 mapping between keys and columns
✔ Locating variants with VariantAnnotation ... done
✔ Traslating Entrez ids ... done
✔ Transforming data ... done
                    
── Coding substitutions found 
✔ Predicting coding ... done
✔ Drivers annotation ... done
`summarise()` has grouped output by 'chr', 'from'. You can override using the `.groups` argument.
Warning messages:
1: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
  GRanges object contains 953 out-of-bound ranges located on sequences 21230, 21233, 11270, 23205, 35581, 35582, 35583, 35586, 35587, 27016, 27020,
  37496, 37501, 37502, 37511, 37930, 39656, 43370, 43382, 43383, 43384, 43385, 47481, 47482, 48633, 61078, 74576, 81614, 81615, 82016, 90988, 90989,
  97803, 92468, 98922, 98923, 98924, 98926, 93172, 93173, 99190, 99191, 99192, 99193, 96033, 96034, 96039, 96040, 96051, 96053, 96054, 96055, 96059,
  102446, 102447, 102450, 102451, 102455, 102457, 102458, 102459, 102460, 102463, 102464, 102465, 102997, 102998, 103174, 97597, 97601, 103907, 103908,
  109272, 116018, 116019, 116020, 116023, 116026, 116028, 134306, 134307, 150288, 152494, 159655, 163110, 163111, 163112, 163113, 163114, 163116, 163117,
  164992, 170986, 170987, 170988, 175497, 183016, 184164, 184178, 184787, 184788, 184789, 184790, 184792, 184796, 184800, 184801, 184802, 184803, 184808,
  184816, 184824, 184825, 184826, 191063, 192095, 192096, 205883, 210788, 217649, 221435, 230411 [... truncated]
2: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
  GRanges object contains 222 out-of-bound ranges located on sequences chr1_GL383518v1_alt, chr1_KI270762v1_alt, chr2_GL383522v1_alt,
  chr2_KI270774v1_alt, chr3_KI270777v1_alt, chr3_KI270781v1_alt, chr4_GL000257v2_alt, chr4_KI270788v1_alt, chr5_GL339449v2_alt, chr5_KI270795v1_alt,
  chr5_KI270898v1_alt, chr6_GL000250v2_alt, chr6_GL000254v2_alt, chr6_KI270797v1_alt, chr6_KI270798v1_alt, chr6_KI270801v1_alt, chr7_GL383534v2_alt,
  chr7_KI270803v1_alt, chr7_KI270806v1_alt, chr7_KI270809v1_alt, chr8_KI270815v1_alt, chr9_GL383540v1_alt, chr9_GL383541v1_alt, chr9_GL383542v1_alt,
  chr9_KI270823v1_alt, chr10_GL383546v1_alt, chr11_KI270831v1_alt, chr11_KI270902v1_alt, chr12_GL383551v1_alt, chr12_GL383553v2_alt,
  chr12_KI270834v1_alt, chr13_KI270838v1_alt, chr14_KI270847v1_alt, chr15_KI270848v1_alt, chr15_KI270850v1_alt, chr15_KI270851v1_alt,
  chr15_KI270906v1_alt, chr16_GL383556v1_alt, chr16_GL383557v1_alt, chr16_KI270854v1_alt, chr17_JH159146v1_alt, chr17_JH159147v1_alt,
  chr17_KI270857v1_a [... truncated]
3: In UseMethod("depth") :
  no applicable method for 'depth' applied to an object of class "NULL"
4: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
  GRanges object contains 945 out-of-bound ranges located on sequences 21230, 21233, 11270, 23205, 35581, 35582, 35583, 35586, 35587, 27016, 27020,
  37496, 37501, 37502, 37511, 37930, 39656, 43370, 43382, 43383, 43384, 43385, 47481, 47482, 48633, 61078, 74576, 81614, 81615, 82016, 90988, 90989,
  97803, 92468, 98922, 98923, 98924, 98926, 93172, 93173, 99190, 99191, 99192, 99193, 96033, 96034, 96039, 96040, 96051, 96053, 96054, 96055, 96059,
  102446, 102447, 102450, 102451, 102455, 102457, 102458, 102459, 102460, 102463, 102464, 102465, 102997, 102998, 103174, 97597, 97601, 103907, 103908,
  109272, 116018, 116019, 116020, 116023, 116026, 116028, 134306, 134307, 150288, 152494, 159655, 163110, 163111, 163112, 163113, 163114, 163116, 163117,
  164992, 170986, 170987, 170988, 175497, 183016, 184164, 184178, 184787, 184788, 184789, 184790, 184792, 184796, 184800, 184801, 184802, 184803, 184808,
  184816, 184824, 184825, 184826, 191063, 192095, 192096, 205883, 210788, 217649, 221435, and 23 [... truncated]
5: In dplyr::left_join(loc_df, output_coding, by = c("chr", "from",  :
  Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 1982 of `x` matches multiple rows in `y`.
ℹ Row 3 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship = "many-to-many"` to silence this warning.
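
The last warning comes from dplyr rather than from CNAqc itself. The sketch below shows when it appears and what the suggested argument does, assuming dplyr 1.1.0 or later (where `relationship` was introduced). The two tables are made up for illustration and do not reproduce the internal `loc_df` and `output_coding` objects.

```r
library(dplyr)

# Hypothetical tables: the same key (chr, from) appears twice on each side.
x <- tibble(chr = c("chr1", "chr1"), from = c(100, 100), id = c("a", "b"))
y <- tibble(chr = c("chr1", "chr1"), from = c(100, 100), gene = c("G1", "G2"))

# Declaring the relationship tells dplyr the duplication is expected,
# so the many-to-many warning is not raised.
joined <- left_join(x, y, by = c("chr", "from"), relationship = "many-to-many")

# Every row of x is paired with every matching row of y.
print(nrow(joined))
```

Declaring the relationship only silences the warning. Whether duplicated keys are actually expected at that point in the annotation is a separate question.
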
@caravagn (Collaborator)

Thanks, tagging @Militeee, who developed this bit of code.

caravagn added the bug label Oct 14, 2023

@Militeee (Contributor)

Hi @pbousquets ,

I had a look at the function and there was indeed a bug that resulted in an empty join. We changed the representation of SNVs from `to - from = 1` (a length-1 interval) to `to - from = 0` (a length-0 interval). I pushed a fix.
I don't know if it also solves your problem, but it is worth trying to reinstall the package and rerun it.
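
To illustrate how a change of interval convention can empty a join, here is a minimal base R sketch. It is not the CNAqc code: the tables, column names, and gene labels are hypothetical, and which side used which convention inside the package is an assumption.

```r
# Hypothetical SNVs stored as length-0 intervals (to - from = 0).
snvs <- data.frame(
  chr  = c("chr1", "chr2"),
  from = c(100L, 200L),
  to   = c(100L, 200L),
  stringsAsFactors = FALSE
)

# Hypothetical annotation still keyed with length-1 intervals (to - from = 1).
annotation <- data.frame(
  chr  = c("chr1", "chr2"),
  from = c(100L, 200L),
  to   = c(101L, 201L),
  gene = c("GENE_A", "GENE_B"),
  stringsAsFactors = FALSE
)

# Joining on chr/from/to finds no matches because `to` differs by one.
mismatched <- merge(snvs, annotation, by = c("chr", "from", "to"))

# Bringing both tables to the same convention restores the matches.
annotation$to <- annotation$from
matched <- merge(snvs, annotation, by = c("chr", "from", "to"))

print(c(nrow(mismatched), nrow(matched)))
```

The join itself raises no error in the mismatched case, which is why this kind of bug tends to surface only further downstream.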

If you still have trouble with the function, it would be extremely useful if you could provide an example dataset that reproduces the error, and I'll try to fix it ASAP.

One last thing: driver annotation is generally hard. This function highlights coding mutations (mainly non-synonymous, stop-gain and frameshift) in cancer genes, but you will need more sophisticated approaches to actually call putative drivers among them.

Cheers,
S.

@pbousquets (Author)

Hi @Militeee,

Thank you very much for having a look at it. I'll give it a try and let you know ASAP whether it worked. If the new version doesn't work, I'll send you a dataset to reproduce the problem. Thank you very much!

Pablo

@pbousquets (Author)

Hi again, @Militeee ,

I just tested the bugfix and it worked perfectly. Thank you very much for your quick help!

Cheers,
P.
