Requirements
- MUMmer v4.0.0beta2
- FastTreeMP v2.1.10
You should be able to run the following from your command line:
nucmer
FastTreeMP
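One quick way to confirm that both executables are reachable is to look them up on PATH. The snippet below is a minimal sketch using only Python's standard library. The helper name missing_tools is illustrative and is not part of this repository.

```python
import shutil

# Executables this README expects to be callable from the command line.
REQUIRED_TOOLS = ["nucmer", "FastTreeMP"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

This only checks that the executables exist. It does not verify the installed versions.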
Locate the viral genomes for an individual viral OTU and place them in a single directory. Choose one of these genomes to be the representative.
GENOMES=my_genomes
REP=my_genomes/representative.fna
Align all genomes to a selected representative
python align_genomes.py --genomes $GENOMES --ref $REP --out alignments
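For orientation, the alignment step amounts to one nucmer run per genome against the representative. The sketch below only builds the command lines and does not reflect the actual internals of align_genomes.py. The function name nucmer_commands and the output naming scheme (one prefix per genome inside the output directory) are assumptions made for illustration. The only nucmer option used is -p, which sets the output prefix.

```python
from pathlib import Path


def nucmer_commands(genome_paths, ref, out_dir):
    """Build one nucmer command per genome; nothing is executed here.

    Assumption: each alignment is written to <out_dir>/<genome stem>.delta.
    """
    commands = []
    for genome in genome_paths:
        genome = Path(genome)
        prefix = Path(out_dir) / genome.stem
        # nucmer -p <prefix> <reference> <query> writes <prefix>.delta
        commands.append(["nucmer", "-p", str(prefix), str(Path(ref)), str(genome)])
    return commands
```

Each returned list could then be passed to subprocess.run(cmd, check=True) once the output directory exists.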
Use SNPs to build multiple sequence alignment
python build_msa.py --in alignments --out snps.fna --max_gaps_col 50 --max_gaps_seq 50
Use FastTree to construct phylogeny
FastTreeMP snps.fna > snps.tree
Pipeline overview
- Use the nucmer utility in MUMmer4 to align all genomes to a reference
- Call SNPs in 1:1 alignment blocks using the show-snps, show-coords, and show-diff utilities
- Use alignments to create multiple sequence alignments against reference
- Remove genomic sites covered in <50% of genomes and non-polymorphic sites (i.e. non-SNPs)
- Remove genomes containing >50% gaps
- Use FastTree to create phylogeny from trimmed multiple sequence alignment
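The trimming steps in the list above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in build_msa.py: the alignment is held in memory as a dict of equal-length strings, only "-" counts as a gap, and per-genome gap percentages are computed after the columns have been trimmed. The function name filter_msa is hypothetical, and its two thresholds mirror the --max_gaps_col and --max_gaps_seq options.

```python
def filter_msa(seqs, max_gaps_col=50.0, max_gaps_seq=50.0, gap="-"):
    """Trim an alignment given as {genome_id: aligned_sequence}.

    1. Drop columns whose gap percentage exceeds max_gaps_col.
    2. Drop columns with fewer than two distinct bases (non-SNP sites).
    3. Drop genomes whose gap percentage over the kept columns
       exceeds max_gaps_seq.
    """
    ids = list(seqs)
    if not ids:
        return {}
    length = len(seqs[ids[0]])
    if any(len(seqs[i]) != length for i in ids):
        raise ValueError("all aligned sequences must have the same length")

    keep_cols = []
    for col in range(length):
        column = [seqs[i][col].upper() for i in ids]
        gap_pct = 100.0 * column.count(gap) / len(ids)
        alleles = {base for base in column if base != gap}
        if gap_pct <= max_gaps_col and len(alleles) > 1:
            keep_cols.append(col)

    trimmed = {}
    for i in ids:
        seq = "".join(seqs[i][col] for col in keep_cols)
        # With no columns left, a genome is treated as fully gapped.
        gap_pct = 100.0 * seq.count(gap) / len(seq) if seq else 100.0
        if gap_pct <= max_gaps_seq:
            trimmed[i] = seq
    return trimmed


example = {
    "rep": "ACGTAC",
    "g1": "ACTTAG",
    "g2": "A-G-AC",
    "g3": "------",
}
print(filter_msa(example))
# {'rep': 'GC', 'g1': 'TG', 'g2': 'GC'}
```

In this toy alignment only the third and sixth columns are polymorphic, so they are the only sites kept, and g3 is dropped because it is entirely gaps at those sites.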