Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 16 changed files with 1,606 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
|
||
VCF2Dis: a simple and efficient tool to calculate a p-distance matrix from a Variant Call Format (VCF) file | ||
|
||
|
||
1) Introduction | ||
------------ | ||
|
||
This software relies on the library package [zlib] | ||
|
||
---------------------- zlib information ---------------------------- | ||
If the library [zlib] does not work, | ||
you can download it from this website and install it: | ||
http://www.zlib.net/ | ||
|
||
|
||
2) Linux/Unix/macOS INSTALL | ||
-------------------------------------- | ||
|
||
Just execute as follows: | ||
tar -zxvf VCF2DisXXX.tar.gz | ||
cd VCF2DisXXX | ||
make ; make clean | ||
./bin/VCF2Dis | ||
|
||
#Note: If linking fails, re-install the zlib library and copy it to the library Dir | ||
|
||
VCF2Dis-xx/src/include/zlib | ||
|
||
|
||
#step3 : | ||
sh make.sh # or [make && make clean] | ||
|
||
4) Contact | ||
email: [email protected] / [email protected] | ||
join the QQ Group : 125293663 | ||
|
||
|
||
|
||
######################swimming in the sky and flying in the sea ########################### ## | ||
|
||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
CXX=g++ | ||
CXXFLAGS= -g -O2 | ||
BIN := ./bin | ||
LDFLAGS=-lz | ||
INCLUDE=-L./src/zlib/ | ||
all: $(BIN)/VCF2Dis | ||
|
||
$(BIN)/VCF2Dis: $(BIN)/../src/VCF2Dis.o | ||
$(CXX) $^ -o $@ $(LDFLAGS) $(INCLUDE) | ||
|
||
$(BIN)/%.o: %.cpp | ||
$(CXX) -c $(CXXFLAGS) $< -o $@ $(INCLUDE) | ||
|
||
clean: | ||
$(RM) $(BIN)/*.o $(BIN)/../src/*.o |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
|
||
1 Introduction ( VCF2Dis version <1.20) | ||
|
||
Builds the p_distance matrix from a VCF file. For more information | ||
about the p_distance matrix, see this website: | ||
http://evolution.genetics.washington.edu/phylip/doc/distance.html | ||
|
||
The SNPs in the VCF are used to calculate the p-distance between individuals. The genetic distance between sample i and sample j is computed with the following formula: | ||
|
||
D_ij=(1/L) * [(sum(d(l)_ij))] | ||
|
||
Where L is the length of regions where SNPs can be identified, and given the alleles at position l are A/C: | ||
d(l)_ij=0.0 if the genotypes of the two individuals were AA and AA; | ||
d(l)_ij=0.5 if the genotypes of the two individuals were AA and AC; | ||
d(l)_ij=0.0 if the genotypes of the two individuals were AC and AC; | ||
d(l)_ij=1.0 if the genotypes of the two individuals were AA and CC; | ||
d(l)_ij=0.0 if the genotypes of the two individuals were CC and CC; | ||
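The per-site rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the VCF2Dis implementation: the function names `alt_count` and `p_distance` are hypothetical, genotypes are assumed to be biallelic diploid VCF GT strings such as `0/0`, `0/1`, `1/1`, sites where either sample is missing (`./.`) are assumed to be skipped, and L is assumed to be the number of sites actually compared. The unlisted case CC vs. AC is taken to be 0.5 by symmetry with AA vs. AC.

```python
def alt_count(gt):
    """Return the number of ALT alleles in a diploid GT string, or None if missing."""
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles:
        return None
    return sum(1 for allele in alleles if allele != "0")


def p_distance(gts_i, gts_j):
    """Average per-site distance D_ij between two samples.

    d(l)_ij is |alt_count_i - alt_count_j| / 2, which gives 0.0 for identical
    genotypes, 0.5 for homozygous vs. heterozygous, and 1.0 for AA vs. CC.
    """
    total = 0.0
    sites = 0
    for gi, gj in zip(gts_i, gts_j):
        ci, cj = alt_count(gi), alt_count(gj)
        if ci is None or cj is None:
            continue  # assumption: sites with a missing call are not counted in L
        total += abs(ci - cj) / 2.0
        sites += 1
    return total / sites if sites else float("nan")


if __name__ == "__main__":
    sample_i = ["0/0", "0/0", "0/1", "0/0", "1/1"]
    sample_j = ["0/0", "0/1", "0/1", "1/1", "1/1"]
    print(p_distance(sample_i, sample_j))
```

Encoding each genotype as an ALT-allele count keeps the five listed cases in one expression instead of a lookup table.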
|
||
Once the p_distance matrix is done, PHYLIP 3.69 (http://evolution.genetics.washington.edu/phylip.html) can be used with the neighbor-joining method to construct the phylogenetic tree from this p_distance matrix: | ||
PHYLIPNEW-3.69.650/bin/fneighbor -datafile p_dis.matrix -outfile tree.out1.txt -matrixtype s -treetype n -outtreefile tree.out2.tre | ||
MEGA6 (http://www.megasoftware.net/) can then be used to display the phylogenetic tree from the file [tree.out2.tre]. | ||
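The `-matrixtype s` option above tells fneighbor to expect a square distance matrix. As a rough illustration of that layout, the following Python sketch writes a square matrix in the classic PHYLIP style: a first line with the number of samples, then one row per sample with the name padded to 10 characters. The function name `format_phylip_square` is hypothetical, and whether VCF2Dis writes exactly this layout (number of decimals, name width) is an assumption, not something stated in this README.

```python
def format_phylip_square(names, matrix):
    """Format a square distance matrix in classic PHYLIP style.

    names  -- list of sample names (truncated/padded to 10 characters)
    matrix -- list of rows, matrix[i][j] is the distance between i and j
    """
    lines = ["%5d" % len(names)]
    for name, row in zip(names, matrix):
        label = name[:10].ljust(10)  # classic PHYLIP names occupy 10 characters
        lines.append(label + " " + " ".join("%.6f" % d for d in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    names = ["SampleA", "SampleB", "SampleC"]
    matrix = [
        [0.0, 0.3, 0.5],
        [0.3, 0.0, 0.4],
        [0.5, 0.4, 0.0],
    ]
    print(format_phylip_square(names, matrix), end="")
```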
|
||
|
||
2 Install | ||
|
||
Just run [make] or [sh make.sh] to compile this software. | ||
The compiled program can be found at [bin/VCF2Dis]. | ||
|
||
|
||
3 Usage | ||
|
||
3.1 Parameter description: | ||
Usage: VCF2Dis -InPut <in.vcf> -OutPut <p_dis.mat> | ||
|
||
-InPut <str> Input GATK VCF genotype File | ||
-OutPut <str> OutPut Sample p-Distance matrix | ||
|
||
-SubPop <str> SubGroup SampleList of VCFFile [ALLsample] | ||
-KeepMF Keep the Middle File diff & Use matrix | ||
|
||
-help Show more help [hewm2008 v1.10] | ||
|
||
3.2 To build the p_distance matrix for all samples in the VCF, run VCF2Dis directly | ||
|
||
./bin/VCF2Dis -InPut in.vcf.gz -OutPut p_dis.mat | ||
|
||
3.3 To build the p_distance matrix for a subgroup of samples, put their sample names into the file sample.list | ||
|
||
./bin/VCF2Dis -InPut in.vcf.gz -OutPut p_dis.mat -SubPop sample.list | ||
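One way to prepare sample.list is to pull the sample names from the VCF header and write out the ones you want. The Python sketch below does this under two assumptions that this README does not confirm: that sample.list holds one sample name per line, and that the names must match the VCF header columns exactly. The helper names `vcf_sample_names` and `write_sample_list` are hypothetical.

```python
import gzip


def vcf_sample_names(path):
    """Return the sample names from the #CHROM header line of a VCF (plain or .gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # The first 9 columns are fixed VCF fields; samples follow.
                return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found in " + path)


def write_sample_list(names, out_path):
    """Write one sample name per line (assumed sample.list layout)."""
    with open(out_path, "w") as out:
        for name in names:
            out.write(name + "\n")


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        # usage: python make_sample_list.py in.vcf.gz sample.list
        write_sample_list(vcf_sample_names(sys.argv[1]), sys.argv[2])
```

Edit the resulting file to keep only the subgroup, then pass it with -SubPop as shown above.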
|
||
|
||
|
||
4 Contact | ||
email: [email protected] / [email protected] | ||
join the QQ Group : 125293663 | ||
|
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
#!/usr/bin/perl -w | ||
use strict; | ||
#explanation: this program rescales the value after each ':' in a Newick tree to a percentage of <RepeatTime> | ||
#edit by hewm; Wed Feb 20 11:02:07 HKT 2019 | ||
#Version 1.0 [email protected] | ||
|
||
die "Version 1.0\t2019-02-20;\nUsage: $0 <merge.tre> <RepeatTime> <bootstrap.tre>\n" unless (@ARGV ==3); | ||
|
||
#############Before start, open the files #################### | ||
|
||
open (IA,"$ARGV[0]") || die "input file can't open $!"; | ||
my $TotalRepeat=$ARGV[1]; | ||
open (OA,">$ARGV[2]") || die "output file can't open $!" ; | ||
|
||
################ Do what you want to do ####################### | ||
$/=";"; | ||
|
||
while(<IA>) | ||
{ | ||
$_=~s/\n//g; | ||
next if ($_ eq ""); | ||
my $Start=0; | ||
my $Now=$Start; | ||
my $Ttue=1; | ||
my $Str=$_ ; | ||
|
||
while($Ttue==1) | ||
{ | ||
$Now=index($Str,":",$Start); | ||
if ($Now==-1) | ||
{ | ||
$Ttue=0; | ||
} | ||
else | ||
{ | ||
my $Length=$Now-$Start; | ||
my $AAA=substr($Str,$Start,$Length); | ||
$Start=$Now+1; | ||
my $NowA=index($Str,",",$Start); | ||
my $NowB=index($Str,")",$Start); | ||
if ($NowA!=-1 && $NowB!=-1) | ||
{ | ||
if ($NowA>$NowB) | ||
{ | ||
$Now=$NowB; | ||
} | ||
else | ||
{ | ||
$Now=$NowA; | ||
} | ||
} | ||
elsif ($NowA==-1 && $NowB==-1) | ||
{ | ||
die "bad format: no ',' or ')' found after ':' in the tree\n"; # stop here; continuing would re-read the same ':' forever | ||
} | ||
elsif ($NowA==-1 && $NowB!=-1) | ||
{ | ||
$Now=$NowB; | ||
} | ||
elsif ($NowA!=-1 && $NowB==-1) | ||
{ | ||
$Now=$NowA; | ||
} | ||
$Length=$Now-$Start; | ||
my $BBB=substr($Str,$Start,$Length); | ||
$BBB=sprintf ("%.1f",$BBB*100.0/$TotalRepeat); | ||
$Start=$Now; | ||
print OA "$AAA:$BBB" | ||
} | ||
} | ||
my $Length=length($Str); | ||
my $BBB=substr($Str,$Start,$Length); | ||
print OA "$BBB\n"; | ||
} | ||
$/="\n"; | ||
close IA; | ||
close OA ; | ||
|
||
######################swimming in the sky and flying in the sea ########################### |
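For readers less familiar with Perl, the core idea of the script above (take the number after each `:` that is followed by `,` or `)`, and rewrite it as a percentage of the repeat count with one decimal) can be sketched in Python with a regular expression. This is a simplified sketch, not a drop-in replacement: the function name `rescale_support` is hypothetical, it only touches purely numeric values, and it leaves unchanged any value that is not followed by `,` or `)`, a case the Perl script reports as a bad format.

```python
import re

# A ':' followed by a number, which must itself be followed by ',' or ')'.
_VALUE = re.compile(r":([0-9]+(?:\.[0-9]+)?)(?=[,)])")


def rescale_support(newick, total_repeat):
    """Rewrite each ':value' as ':percent' where percent = value * 100 / total_repeat."""

    def to_percent(match):
        value = float(match.group(1))
        return ":%.1f" % (value * 100.0 / total_repeat)

    return _VALUE.sub(to_percent, newick)


if __name__ == "__main__":
    tree = "((A:50,B:50):100,C:25);"
    print(rescale_support(tree, 200))
```

With 200 repeats, a value of 50 becomes 25.0 and a value of 25 becomes 12.5.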
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
#!/bin/sh | ||
#$ -S /bin/sh | ||
#Version1.0 [email protected] 2017-06-13 | ||
echo Start Time : | ||
date | ||
../bin/VCF2Dis -InPut in.vcf.gz -OutPut p_dis.mat | ||
#../bin/VCF2Dis -InPut in.vcf.gz -OutPut p_dis.mat -SubPop sample.list | ||
echo End Time : | ||
date |
Binary file not shown.