parsnp adding SNP and different SNP calls between runs #112

Open
robfenton opened this issue Apr 28, 2022 · 2 comments

@robfenton

We have 41 complete, closed Salmonella sequences (40 unique). To get
annotations, we exported one (1956) as a reference, as both a GenBank file
and a FASTA file. In the gingr output (see screenshots), several nucleotides
differ from what is in the GenBank and FASTA files.

After verifying the sequences involved, we re-ran the analysis with no
changes, and received a slightly different output (see screenshots, isolate
09578-19-1).

1956_Chromosome.gb contains the GenBank record for the chromosome, and 1956.fasta contains both the chromosome and plasmid
sequences. The chromosome sequences in both files are identical.

Using parsnp 1.7.2 installed via conda
Ubuntu 20.04 LTS in VirtualBox on Windows 10.
gingr 1.3 from the Linux64-v1.3 tarball
GenBank and fasta files all produced by Geneious 2022.1.1
Annotated with PGAP (2021-01-11.build5132)

The only options used were -p 8, -g genbankfile.gb, and -d ./oursequences

Screenshot from 2022-04-28 11-05-09
Screenshot from 2022-04-28 15-23-40
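
For anyone reproducing this, the statement above that the chromosome sequence is the same in 1956_Chromosome.gb and 1956.fasta can be checked independently of parsnp. The following is a minimal sketch, not part of the original report. It assumes a single-record GenBank file with a standard ORIGIN section and a plain multi-record FASTA file. The function names are hypothetical helpers, and only the Python standard library is used.

```python
import hashlib
import re


def read_fasta(text):
    """Return {record_id: sequence} from FASTA-formatted text."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(chunks).upper() for name, chunks in records.items()}


def read_genbank_sequence(text):
    """Return the sequence in the ORIGIN section of a single-record GenBank file."""
    in_origin = False
    chunks = []
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            break  # end of the record
        elif in_origin:
            # ORIGIN lines look like "        1 gatcctccat ..."; keep letters only.
            chunks.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(chunks).upper()


def digest(seq):
    """SHA-256 of a sequence, so large genomes can be compared by a short string."""
    return hashlib.sha256(seq.encode("utf-8")).hexdigest()


def matching_records(genbank_path, fasta_path):
    """Return IDs of FASTA records whose sequence equals the GenBank sequence."""
    with open(genbank_path) as handle:
        reference = digest(read_genbank_sequence(handle.read()))
    with open(fasta_path) as handle:
        records = read_fasta(handle.read())
    return [name for name, seq in records.items() if digest(seq) == reference]
```

Calling `matching_records("1956_Chromosome.gb", "1956.fasta")` should return the ID of the chromosome record if the two inputs really are identical, and an empty list otherwise. Comparison is case-insensitive because both parsers upper-case the sequence.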

@bkille
Contributor

bkille commented Apr 30, 2022

Hi @robfenton! Thank you for opening an issue and also for detailing the version, command, and OS. Just to clarify, the only difference I am seeing is that sequence 09578-19-1.fasta seems to be missing all of the variants in the first figure.

In order to track this bug down, we'll first need to determine whether it came from parsnp, harvest-tools, or gingr. Would you be able to share the .xmfa output of your analysis? You're welcome to email the output to [email protected] as opposed to posting it here if you'd prefer.
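
One way to narrow this down before sharing files is to compare the .xmfa outputs of the two runs directly. If they already differ, the discrepancy arises before harvest-tools and gingr are involved. The sketch below is illustrative only. It assumes the usual XMFA layout (`#` metadata lines, `>` entry headers followed by sequence lines, and `=` closing each alignment block) and that entry headers are comparable between runs, which may not hold if block boundaries shift. `parse_xmfa` and `diff_runs` are hypothetical names, not part of parsnp or harvest-tools.

```python
def parse_xmfa(text):
    """Return a list of blocks; each block maps an entry header to its aligned sequence."""
    blocks = []
    block = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#' metadata lines
        if line.startswith("="):
            # '=' terminates the current alignment block.
            if block:
                blocks.append({h: "".join(parts) for h, parts in block.items()})
            block, header = {}, None
        elif line.startswith(">"):
            header = line[1:].strip()
            block[header] = []
        elif header is not None:
            block[header].append(line)
    if block:
        # Tolerate a final block that is not closed by '='.
        blocks.append({h: "".join(parts) for h, parts in block.items()})
    return blocks


def flatten(blocks):
    """Merge all blocks into one {header: sequence} mapping."""
    entries = {}
    for block in blocks:
        entries.update(block)
    return entries


def diff_runs(text_a, text_b):
    """Summarise which alignment entries differ between two XMFA texts."""
    a = flatten(parse_xmfa(text_a))
    b = flatten(parse_xmfa(text_b))
    shared = set(a) & set(b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "changed": sorted(h for h in shared if a[h] != b[h]),
    }
```

Reading both .xmfa files into strings and passing them to `diff_runs` gives a quick summary. Three empty lists would suggest the two parsnp runs agree and point toward the later conversion or display steps instead.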

@robfenton
Author

That's not quite it. 1956.fasta and 1956_Chromosome.gb.fna show different sequences, even though they're supposed to be exactly the same.

I'll double check with my boss on Monday and send both .xmfa output files if I can.

Thanks!
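
The mismatch described above, where 1956.fasta and 1956_Chromosome.gb.fna show different bases despite identical input, can be pinned to exact alignment columns once the two aligned rows have been extracted from the .xmfa output. This is a small sketch under that assumption. `aligned_differences` is a hypothetical helper and expects two rows of equal length taken from the same alignment block.

```python
def aligned_differences(row_a, row_b):
    """List (column, base_a, base_b) for every column where two aligned rows differ.

    Columns are 1-based and the comparison ignores case.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    pairs = zip(row_a.upper(), row_b.upper())
    return [(i, x, y) for i, (x, y) in enumerate(pairs, start=1) if x != y]
```

An empty list means the two rows agree in every column of that block. Any tuples returned give the column and the two conflicting bases, which can then be looked up in the original GenBank and FASTA files.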
