GitHub - JasonAHendry/multiply: Multiplex PCR design, in silico

Multiplex PCR design, in silico

Overview

multiply is a command-line tool enabling the design of multiplexed PCRs for a user-specified set of target genes and/or regions. It works by first producing a set of candidate primers for each target using primer3 (multiply generate). It then computes the number of SNPs in each primer (multiply snpcheck); potential dimers between pairs of primers (multiply align); and potential mispriming and off-target amplicons for each primer (multiply blast). Information from these three quality control steps is passed to a cost function, which is minimised by brute-force or with a greedy search algorithm (multiply select).
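
As a rough illustration of the selection step described above, the sketch below minimises a toy cost function over one candidate primer pair per target, once by brute force and once greedily. It is not multiply's actual implementation: the function names, the cost terms, and the data are all assumptions made up for this example.

```python
from itertools import product


def total_cost(selection, individual, pairwise):
    """Sum per-candidate costs plus interaction costs between every selected pair."""
    cost = sum(individual[p] for p in selection)
    for i, a in enumerate(selection):
        for b in selection[i + 1:]:
            cost += pairwise.get(frozenset((a, b)), 0.0)
    return cost


def brute_force(candidates, individual, pairwise):
    """Evaluate every combination of one candidate per target and keep the cheapest."""
    return min(
        product(*candidates.values()),
        key=lambda sel: total_cost(sel, individual, pairwise),
    )


def greedy(candidates, individual, pairwise):
    """Go target by target, each time adding the candidate that yields the lowest cost so far."""
    chosen = []
    for options in candidates.values():
        best = min(options, key=lambda p: total_cost(chosen + [p], individual, pairwise))
        chosen.append(best)
    return tuple(chosen)


# Toy data (hypothetical): two targets, two candidate primer pairs each.
candidates = {"geneA": ["A1", "A2"], "geneB": ["B1", "B2"]}
# Per-candidate cost, e.g. standing in for SNPs or off-target hits.
individual = {"A1": 1.0, "A2": 2.0, "B1": 1.0, "B2": 1.5}
# Interaction cost, e.g. standing in for a predicted dimer between A1 and B1.
pairwise = {frozenset(("A1", "B1")): 5.0}

print(brute_force(candidates, individual, pairwise))
print(greedy(candidates, individual, pairwise))
```

Brute force checks every combination, so its run time grows exponentially with the number of targets; a greedy search scales much better but is not guaranteed to find the global minimum.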

[Figure: summary of the multiply pipeline.]

Install

First, clone the repository to your local machine:

git clone https://github.com/JasonAHendry/multiply

Then, install the software dependencies using conda:

cd multiply
conda update conda
conda env create -f environments/run.yml
conda activate multiply-run

Finally, install multiply itself with pip:

pip install -e .

Test installation by running:

multiply

Basic usage

Part 1: Download information about your target organism.

To generate a new multiplex PCR, you first need to download the reference genome (FASTA) and information about gene locations (GFF) for your target organism. To see which organisms are available for download$^1$, type:

multiply download --available

Organisms are specified by a GenusSpecies keyword; e.g. PlasmodiumFalciparum or AnophelesGambiae. For example, one would download information about Plasmodium falciparum by running:

multiply download -g PlasmodiumFalciparum

Part 2: Create a multiplex design file.

The next step is to specify your target genes and/or region(s) with a design file. Examples of design files can be seen in the /designs folder. Genes are specified by a comma-separated list of gene identifiers (the target_ids field in the design file). Regions are specified by a separate BED file, with a fourth column that gives a unique name to each region (the bed field).
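
To make the fields above concrete, here is a minimal sketch that parses a design file and a four-column BED file using only the Python standard library. The field names (name, target_ids, bed) come from the text above; the section name, gene identifiers, file path, and coordinates are placeholders invented for illustration, so consult the files in the /designs folder for the real layout.

```python
import configparser

# Hypothetical design file; the [Sketch] section name is an assumption.
DESIGN_TEXT = """
[Sketch]
name = example-multiplex
target_ids = GENE_ID_1,GENE_ID_2
bed = designs/example-regions.bed
"""

# Hypothetical BED content: chromosome, start, end, and a unique region name.
BED_TEXT = "chrom1\t1000\t1500\tregionA\nchrom1\t9000\t9400\tregionB\n"


def split_list(value):
    """Split a comma-separated field into a clean list of items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_bed(text):
    """Parse four-column BED text and check that region names are unique."""
    regions = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, name = line.split("\t")[:4]
        regions.append((chrom, int(start), int(end), name))
    names = [region[3] for region in regions]
    if len(names) != len(set(names)):
        raise ValueError("region names in column 4 must be unique")
    return regions


config = configparser.ConfigParser()
config.read_string(DESIGN_TEXT)
section = config["Sketch"]

design = {
    "name": section["name"],
    "target_ids": split_list(section.get("target_ids", "")),
    # In practice the BED text would be read from the path in section["bed"].
    "regions": parse_bed(BED_TEXT),
}
print(design)
```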

Candidate primers are generated using primer3. A collection of primer3 settings are available as JSON files in the folder settings/primer3$^2$. Multiple primer3 settings can be used in a single run of multiply by passing them as a comma-separated list to the primer3_settings field of the design file.
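
The sketch below shows one way a comma-separated primer3_settings value could be resolved into several settings dictionaries. The keys are standard primer3 global tags (PRIMER_OPT_SIZE, PRIMER_MIN_TM, and so on), but the settings names, the values, and the assumption that the JSON files are keyed directly by primer3 tags are illustrative only; check the files in settings/primer3 for the format multiply expects.

```python
import json

# Stand-ins for JSON files in settings/primer3; names and values are hypothetical.
SETTINGS_FILES = {
    "example_strict": json.dumps({
        "PRIMER_MIN_SIZE": 18,
        "PRIMER_OPT_SIZE": 20,
        "PRIMER_MAX_SIZE": 24,
        "PRIMER_MIN_TM": 58.0,
        "PRIMER_OPT_TM": 60.0,
        "PRIMER_MAX_TM": 62.0,
    }),
    "example_relaxed": json.dumps({
        "PRIMER_MIN_SIZE": 18,
        "PRIMER_OPT_SIZE": 22,
        "PRIMER_MAX_SIZE": 28,
        "PRIMER_MIN_TM": 55.0,
        "PRIMER_OPT_TM": 60.0,
        "PRIMER_MAX_TM": 65.0,
    }),
}


def load_settings(field_value):
    """Resolve a comma-separated list of settings names into (name, dict) pairs."""
    names = [n.strip() for n in field_value.split(",") if n.strip()]
    loaded = []
    for name in names:
        if name not in SETTINGS_FILES:
            raise KeyError(f"unknown primer3 settings: {name}")
        loaded.append((name, json.loads(SETTINGS_FILES[name])))
    return loaded


# Each settings dictionary would drive one round of candidate generation per target.
for name, settings in load_settings("example_strict, example_relaxed"):
    print(name, settings["PRIMER_MIN_TM"], settings["PRIMER_MAX_TM"])
```

Using more than one settings file widens the pool of candidate primers per target, which gives the later selection step more combinations to choose from.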

Part 3: Run.

After your target organism is downloaded and your design file is prepared, you can run the complete multiply pipeline with the following command:

multiply pipeline -d designs/<your-design-file.ini>

For multiplexes of moderate size (e.g. <20 targets), running the pipeline will typically take a few minutes. The output is written to the results directory, within a sub-folder whose name is specified in your design file (the name field).

Footnotes

$1$ New organisms can be made available for download by adding them to the collection file located at genomes/collection.ini. Any organism available from PlasmoDB, EnsemblGenomes or RefSeq Genome can be added to the collection.

$2$ You can make your own primer3 settings by creating new JSON files, or altering existing ones, in the settings/primer3 folder.

Resources

multiply uses the following external software and databases:

primer3. Individual primer pair design. https://primer3.org/
bedtools. Genome arithmetic. https://bedtools.readthedocs.io/en/latest/
blastn. Local alignment search. https://blast.ncbi.nlm.nih.gov/Blast.cgi
PlasmoDB. Plasmodium reference genome. http://plasmodb.org/plasmo/
MalariaGEN. Plasmodium genetic diversity. https://www.malariagen.net/data
EnsemblGenomes. Additional reference genomes. https://ensemblgenomes.org/
RefSeq Genome. Additional reference genomes. https://www.ncbi.nlm.nih.gov/genome/

Primer dimer detection uses an alignment algorithm similar to the one described by Johnston et al. (2019).
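
For intuition only, the sketch below scores a pair of primers by the longest run of consecutive Watson-Crick base pairs over all ungapped alignments. This is a deliberately simplified stand-in: it is neither the algorithm of Johnston et al. (2019) nor the one used by multiply, it ignores thermodynamics and end effects, and the function names are invented for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(primer_a, primer_b):
    """Longest stretch of consecutive base pairs between two primers (5'->3' input).

    Two primers anneal antiparallel, so a base pair exists wherever primer_a
    matches the reverse complement of primer_b at the aligned position.
    """
    a = primer_a.upper()
    b_rc = reverse_complement(primer_b)
    best = 0
    # Slide b_rc along a, covering every ungapped overlap.
    for offset in range(-(len(b_rc) - 1), len(a)):
        run = 0
        for i in range(len(a)):
            j = i - offset
            if 0 <= j < len(b_rc) and a[i] == b_rc[j]:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


print(longest_complementary_run("AAAACCCC", "GGGGAAAA"))
print(longest_complementary_run("AAAAAAAA", "AAAAAAAA"))
```

A score like this could feed into a pairwise dimer cost, with longer complementary runs penalised more heavily.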

Citation

We have a preprint available on bioRxiv.

Acknowledgements

My thanks to Nada Kubikova, who gave helpful advice on primer design for multiplex PCR; and to Dan J. Bridges, Gavin Band, Mulenga Mwenda, and Annie Forster, who tested various versions of multiply.

This work was funded by the Bill and Melinda Gates Foundation (INV-003660).
