ClassifyCNV is a command-line tool that implements the 2019 ACMG guidelines to evaluate the pathogenicity of germline duplications and deletions.
The tool uses pre-parsed publicly available databases to calculate a pathogenicity score for each copy-number variant (CNV) in accordance with the ACMG guidelines.
ClassifyCNV runs on UNIX, Linux and MacOS. All of the necessary databases are included in the repository.
Python 3.6 or above and BEDTools version 2.27.1 or above must be installed.
ClassifyCNV accepts a BED file as input. The file must include the following columns in this order:
- chromosome
- CNV start position
- CNV end position
- CNV type (DEL or DUP)
CNVs on alternative contigs are not evaluated. Both hg19 and hg38 coordinates are supported.
Before running ClassifyCNV it is recommended that the ClinGen files are updated by executing
update_clingen.sh
Command to run ClassifyCNV:
python3 ClassifyCNV.py --infile YourCNVFile.bed --GenomeBuild {hg19,hg38}
Optional parameters:
--cores
: number of threads to use; default is 1
--precise
: should be used only if the exact CNV breakpoints are known; if this flag is on, the script will evaluate the effect of intragenic CNVs
--outdir
: specify the name of a run-specific directory where the results will be saved to; the directory will be created inside the ClassifyCNV_results folder
Sample datasets and a sample scoresheet are included in the Examples folder.
Run ClassifyCNV on a sample dataset:
python3 ClassifyCNV.py --infile Examples/ACMG_examples.hg19.bed --GenomeBuild hg19 --precise
Results are saved to the run-specific folder inside the ClassifyCNV_results
folder.
The run-specific folder is named Result_dd_Mon_yyyy-hh-mm-ss
, unless the user specified a different folder name.
The filled out table with the scores and the final classification
of each CNV is named Scoresheet.txt. The column names correspond
to the evidence fields of the ACMG rubrics available here:
http://cnvcalc.clinicalgenome.org/cnvcalc/cnv-loss
http://cnvcalc.clinicalgenome.org/cnvcalc/cnv-gain
The numeric pathogenicity score, calculated by ClassifyCNV, is converted to pathogenicity classification using the following cutoffs:
≤ −0.99
: benign variant
−0.90 .. −0.98
: likely benign variant
−0.89 .. 0.89
: variant of uncertain significance
0.90 .. 0.98
: likely pathogenic variant
≥ 0.99
: pathogenic variant
The last two columns of the file include a list of dosage-sensitive genes contained within the CNV and a list of all protein coding genes in the CNV.
If you use the software, please cite:
Gurbich, T.A., Ilinsky, V.V. ClassifyCNV: a tool for clinical annotation of copy-number variants. Sci Rep 10, 20375 (2020). https://doi.org/10.1038/s41598-020-76425-3
The software provided herein is free for academic instruction and research use only. Commercial licenses are available to legal entities, including companies and organizations (both for-profit and non-profit), requiring the software for general commercial use. To obtain a commercial license please, contact us via e-mail: [email protected].
This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. The code should not be modified and/or redistributed without the permission of the authors.
Tatiana Gurbich (Genotek Ltd)
Valery Ilinsky (Genotek Ltd)