Snakemake workflow for testing μ-PBWT against Durbin's PBWT and Syllable-PBWT on 1000 Genomes Project (1KGP) phase 3 data. 1KGP data are available at this link.
Snakemake must already be installed, for example via conda:
conda create -c conda-forge -c bioconda -n snakemake snakemake
conda activate snakemake
cd muPBWT-1KGP-workflow
snakemake --cores <num_cores> --use-conda --resources load=100
The --resources load=100 option keeps Durbin's Algorithm 5 from using too much RAM (about 500 GB is still needed) by running only one job at a time for the rule runPbwtIndexed (thanks Jan Schreiber).
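To illustrate how this cap works, here is a minimal sketch of a rule that declares a load resource. The rule name runPbwtIndexed comes from this workflow, but the input and output paths, the wildcard name, the placeholder command, and the value load=100 on the rule are assumptions made for illustration; the actual rule in the Snakefile may differ.

```
# Hypothetical sketch, not the workflow's actual rule definition.
rule runPbwtIndexed:
    input:
        "data/{panel}.bcf"            # assumed input path
    output:
        "results/pbwt/{panel}.txt"    # assumed output path
    resources:
        load=100                      # each job claims the full 'load' budget
    shell:
        "echo 'placeholder for the PBWT command' > {output}"
```

Snakemake never lets the summed load of running jobs exceed the value passed with --resources. If each job of the rule claims load=100 and the global budget is also 100, only one such job can run at a time, whatever --cores is set to. Rules that declare no load are not limited by this budget.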
The pipeline will generate some results:
- in results/data, some useful CSV files
- in results/plots, some plots in PDF format
- in results/tables, some tables in LaTeX syntax