-
Notifications
You must be signed in to change notification settings - Fork 34
/
Copy pathchangeLog.txt
380 lines (337 loc) · 23.6 KB
/
changeLog.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
############################################################################################################
# AnnotSV 3.4.4 #
# #
# An integrated tool for Structural Variations annotation #
# #
# Copyright (C) 2017-2024 Veronique Geoffroy ([email protected]) #
# #
############################################################################################################
For more details, please see the README file.
September, 05, 2024, AnnotSV version 3.4.4
- Temporary fix for issues #239 and #250 (wrong PRDM10 line in the GenCC file)
=> Code to be removed after updating annotations
August, 29, 2024, AnnotSV version 3.4.3
- New definition of alias gene symbols used in OMIM annotation files.
Keep only aliases found with the NCBI_gene_ID and validated with genomic locations
- Add a new check for the variantconvert configuration (issue 246)
- Fix a bug in the AnnotSV_ID setting when POS (e.g. 1) is contained in CHROM (e.g. 1)
May, 14, 2024, AnnotSV version 3.4.2
- Fix a bug when setting the cytoband annotation for large SV (issue 234)
- Update the mm39 repeat annotations (issue 225)
May, 03, 2024, AnnotSV version 3.4.1
- Update of the variantconvert distribution (2.0.1 installed)
Add of the "-variantconvertMode" option (to choose the variantconvert conversion mode)
- Mouse annotation update (Add GRCm39/mm39 annotations)
- Remove typos in the ranking explanation
- Check the user bedtools version
- Restricts the number of RE_gene features to 50, RE with "PhenoGenius specificity = "A" or Exomiser gene score > 0.7" are displayed first.
- Integrate all the gene symbols for OMIM morbid annotations
- Improve the creation of the benign dataset (gnomAD, 1000g and HPRC)
- Add log files for the PhenoGenius install
February, 16, 2024, AnnotSV version 3.4
- Human annotation update
- Add new regulatory elements annotations:
- Activity-by-Contact (ABC) model annotations (doi: 10.1038/s41586-021-03446-x)
- Massively Parallel Reporter assays (MPRA) annotations (doi: 10.1038/s41467-019-11526-w)
- Add novel benign SV annotation sources:
- from gnomad v4 (GRCh38) (issue 201)
- from the Human Pangenome Reference Consortium (HPRC) dataset (PACBIO long reads, GRCh38 only) (issue 202)
- from dbVar (NCBI’s database of genomic structural variation)
- Add new phenotype prioritization method: PhenoGenius (Yauy et al., 2023, doi: 10.1101/2022.07.29.22278181)
- Add 3 new features: PhenoGenius_score, PhenoGenius_phenotype and PhenoGenius_specificity
- Add 4 new features:
- Closest_left and Closest_right
AnnotSV expands the SV up to 5 megabases in both direction (left and right) and then tries to find neighboring genes
In both directions, the closest gene to the SV is reported
- NCBI_gene_ID
- Tx_version (transcript version)
- Gene names are now sorted by genomic coordinates and no longer sorted alphabetically ("Gene_name" feature)
- Fix a bug when setting the cytoband annotation (issue 210)
- Fix misleading OMIM annotation (issues 156, 132)
- Update of the Makefile
- Update of the "contact" recommendations in output (from email to GitHub)
- Reorganization of the GitHub .md files
January, 15, 2024, AnnotSV version 3.3.9
- Fix a bug when setting the "hpoVersion" variable (issue 207)
December, 22, 2023, AnnotSV version 3.3.8
- Add of 34 unit tests
November, 03, 2023, AnnotSV version 3.3.7
- Fix a domain error (issue 199)
- Include code to allow FranceGenRef benign SV annotation (856 WGS with ancestries in different regions of France)
WARNING: not supplied as part of the AnnotSV sources.
- Optimize access to the exomiser application properties file for use with singularity/bioconda
- Allow uppercase file extensions (".VCF", ".BED" or ".VCF.GZ")
- Use the new format of the COSMIC data source
- Update of the documentation in $ANNOTSV/docs/
- Update of the documentation to install AnnotSV human annotations in a local directory (INSTALL_annotations.sh)
- Optimize a regexp
- Fix variantconvert file name
June, 01, 2023, AnnotSV version 3.3.6
- Add bug fix concerning some INV bracketed. The second breakend notation (which is just the reciprocal of the first) is now always identified as the mate breakend.
- Sample IDs with missing alleles in the GT fields ("./." or ".|.") are now reported in the "Samples_ID" output field
- Refactor variantconvert code
- Allow description of REF/ALT values in SV input files with lowercase ACGT
- Update of the .gitignore file
- Add bug fix to report only a subset of the annotation columns provided by AnnotSV
- Add of warning messages for the ranking (in case of missing required annotations)
- Update the display of the default command line
April, 14, 2023, AnnotSV version 3.3.5
- Add documentation on the "-includeCI" option
- Add bugfix for variantconvert use in GRCh38 (when setting the REF in the VCF output file)
- Add bugfix during the install (depending on the environment)
April, 07, 2023, AnnotSV version 3.3.4
- Among the “po_P_*_*” features, redundancy is now removed ONLY from “po_P_*_phen” and “po_P_*_hpo”.
=> AnnotSV keeps now the correspondence between “po_P_*_source”, “po_P_*_coord” and “po_P_*_percent” features.
- Improvment of the running time (code part: SV partially overlapping with an established benign region)
- Update of the Makefile (variantconvert install)
- Remove requirement for ANNOTSV environment variable
- Add of the "INSTALL_code.sh" and "INSTALL_annotations.sh" bash files (for a basic manual installation)
- Add a check of the "#CHROM" header line
- Add of the "-variantconvertDir" option (path of the variantconvert directory). By default, the variantconvert tool distributed by annotSV is used.
March, 30, 2023, AnnotSV version 3.3.3
- Add bugfix for large input file
March, 28, 2023, AnnotSV version 3.3.2
- Add bugfix for SV partially overlapping with an established benign region. Ranking Impact
- Improve the 40 criteria (Loss) in the SV ranking
- Restrict the number of "po_B_*_someG_*" features to 20
March, 24, 2023, AnnotSV version 3.3.1
- Add bugfix with the use of the SVminSize option
- Report the minimal LOEUF value (among those of all overlapped genes) in the SV full length annotation
- Update of the variantconvert distribution (1.2.2 installed)
March, 23, 2023, AnnotSV version 3.3
- Add bugfix regarding genomic start coordinates. TSV and VCF output files are now both 1-based, with inclusive-end (whatever the input file format)
- Add interpretation of the square-bracketed SV breakend notations within the VCF.
This new module relies on the homogenization rules provided within the variant-extractor tool developed by Rodrigo Martín.
- Update of the variantconvert distribution (1.2.1 installed)
- Automation of the variantconvert module installaton
- Reformat of the unannotated.tsv file
- Add of the .gitignore file
- VCF format handling update. According to the VCF 4.4 specification, the SVTYPE has now been deprecated (due to redundancy with ALT).
The SV type is now extracted primarily from the ALT column, then from the SVTYPE field in the INFO column if available.
- Add bugfix for the TAD annotations update process
January, 31, 2023, AnnotSV version 3.2.3
- Report of the "Samples_ID" column in ALL outputs (default, sample = NA) (required by variantconvert)
- Add the creation of a VCF output file from a "BED" SV input file. The "-vcf" and "-svtBEDcol" options are required.
- In VCF output, the GT is set to “1/.” for each SV if the GT is not given in input (BED input file)
- Add bugfix concerning Mouse annotation
January, 06, 2023, AnnotSV version 3.2.2
- Add bugfix concerning the use of variantconvert to create the VCF output file (GRCh37/GRCh38, configfiles)
December, 21, 2022, AnnotSV version 3.2.1
- Add bugfix concerning the partial overlap with some specific dbVar SV (same start and end locations)
December 13, 2022, AnnotSV version 3.2
- Add of a new output format: VCF
- Add partial overlap annotation:
- Add 4 benign gain SV annotation: po_B_gain_allG_source, po_B_gain_allG_coord, po_B_gain_someG_source, po_B_gain_someG_coord
- Add 4 benign loss SV annotation: po_B_loss_allG_source, po_B_loss_allG_coord, po_B_loss_someG_source, po_B_loss_someG_coord
- Add 5 pathogenic gain SV annotation: po_P_gain_phen, po_P_gain_hpo, po_P_gain_source, po_P_gain_coord, po_P_gain_percent
- Add 5 pathogenic loss SV annotation: po_P_loss_phen, po_P_loss_hpo, po_P_loss_source, po_P_loss_coord, po_P_loss_percen
- Add the 2B, 2G and 4O criteria (Loss) in the SV ranking
- Add the 2B, 2C, 2F, 2G and 4O criteria (Gain) in the SV ranking
- Update of the "make uninstall"
September 12, 2022, AnnotSV version 3.1.3
- Human annotation update
- Use of the new ClinGen data source format ("OMIM ID" no longer reported)
- Removing of the DDG2P gene annotations (to avoid redundancy with GenCC)
- Documentation update
July 10, 2022, AnnotSV version 3.1.2
- Add important bugfix concerning the GRCh38 coordinates of the morbid genes
- Add bugfix concerning some Overlapped_CDS_percent values
November 25, 2021, AnnotSV version 3.1.1
- Add bugfix when setting the ANNOTSV global environmental variable with a final "/"
- Documentation update
November 08, 2021, AnnotSV version 3.1
- Change the -genomeBuild default value to "GRCh38" (instead of GRCh37)
- Use boolean values (instead of "yes"/"no") for the following option values: -candidateGenesFiltering, -includeCI, -overwrite, -reciprocal, -REreport, -REselect1 and -REselect2
- Add the Children’s Mercy Research Institute Benign SV annotations (n=502 WGS)
- Add the GenCC database for Gene-Disease relationship annotations
- Add CytoBand annotation
- Add novel regulatory element annotation:
- Add miRNA annotation (from miRTargetLink)
- Complete the RE_gene column output (the regulated gene name is detailed with more information: candidate gene annotation
and data sources (RefSeq, ENSEMBL, EnhancerAtlas, GeneHancer and/or miRTargetLink))
- Add the "-REselect1" and "-REselect2" options to filter the RE_gene output
- Update the benign SV annotation method:
- Add the "-benignAF" option to change the allele frequency threshold used to select the benign SV in the data sources
- Add 4 annotation columns: B_gain_AFmax, B_loss_AFmax, B_ins_AFmax and B_inv_AFmax (maximum allele frequency of the reported benign genomic regions)
- Add new details in the"AnnotSV_ranking_criteria" output column: (i) detailed scoring points and (ii) remove gene names redundancy
- Add new warnings if the compound heterozygosity analysis is not processed
- Add the "-version" option
- Take into account of a new format in the downloaded OMIM data (approved gene symbol)
- Add bugfix in case of leading or trailing white space in SV type values from the SV input BED file
- Add bugfix in case of no external BED annotation files used
- External BED annotation files can now also be used to report any feature overlapped with the SV (even with 1bp overlap)
- Add bugfix concerning the use of the "-candidateGenesFiltering" option
- Include "NA" in the "-rankFiltering" default option (default = "1-5,NA")
- Add bugfix concerning the use of the "split" annotationMode
- Add bugfix in the SV input BED file, last column could not have empty values. Replaced with "." if empty
- Add bugfix in section 5 of the SV ranking
- Set the ACMG_class to "NA" if not defined
- Update/Add Mouse annotations (CytoBand, miRNA from miRTargetLink)
December 18, 2020, AnnotSV version 3.0
- Major code rewrite and annotations sources reorganization
- Significant modification of the annotations column names
- New SV ranking based on the ACMG guidelines (Riggs et al 2020) as a replacement of the previous ranking (v2.5).
- Add 3 annotation columns: AnnotSV ranking score; ranking decision criteria; AnnotSV ranking class
- Merge pathogenic SV annotation (from multiple sources)
- Pathogenic SV sources: dbVar, ClinGen, ClinVar, OMIM morbid genes
- Add 12 annotation columns:
P_gain_phen; P_gain_hpo; P_gain_source; P_gain_coord;
P_loss_phen; P_loss_hpo; P_loss_source; P_loss_coord;
P_ins_phen; P_ins_hpo; P_ins_source; P_ins_coord;
P_inv_phen; P_inv_hpo; P_inv_source; P_inv_coord
- Remove previous annotation columns from dbVar
- Merge benign SV annotation (from multiple sources)
- Benign SV sources: DGV, ClinVar, ClinGen, DDD, gnomAD, 1000g and IMH
- Add 8 annotation columns:
B_gain_source; B_gain_coord; B_loss_source; B_loss_coord;
B_ins_source; B_ins_coord; B_inv_source; B_inv_coord
- Remove previous annotation columns from DDD, DGV, gnomAD, 1000g and IMH
- Add pathogenic SNV/indel annotation (from ClinVar)
- Add new regulatory elements annotation (EnhancerAtlas)
- Merge regulatory elements annotations into a single column (RefSeq/ENSEMBL, EnhancerAtlas, GeneHancer)
- Update of the annotation data sources with the latest available versions
- The "overlap" option default is now set to 100 % in order to be compliant with the ACMG guidelines
- Add the percent of the CDS overlapped with the SV (in the "split" annotation lines)
- Add the number of overlapped genes in the "full" annotation lines
- By default, AnnotSV now expands the "start" and "end" SV positions with the VCF confidence intervals (CIPOS, CIEND) around the breakpoints (see the "includeCI" option)
November 06, 2020, AnnotSV version 2.5.2
- Add the pLI annotation from gnomAD (pLI_gnomAD), update the pLI annotation from ExAC (pLI_ExAC)
- Add the "LOEUF_bin" annotation (gnomAD)
- Add the "tx start", "tx end" and "Number of exons" annotations
The "tx length" column has been renamed "overlapped tx length"
The "CDS length" column has been renamed "overlapped CDS length"
- If not provided, the EXOMISER_GENE_PHENO_SCORE is set to "-1.0"
- Update README.md
October 13, 2020: AnnotSV version 2.5.1
- fix: Add bugfix when the "SV type" is badly formatted
October 12, 2020: AnnotSV version 2.5
- Add of the "ENCODE blacklist", "Segmental Duplication" and "Gap" annotation datasets
- Fix a critical bug for DGV annotation (GRCh38)
- Add the distance / type to the nearest splice site after considering both breakpoints (distNearestSS and nearestSStype columns)
- Add the in-frame / out-of-frame information from overlapping genes (frameshift column)
- Add decision criteria explaining the ranking (previously available as a separate file)
- Remove of the ranking decision output file (*.ranking.tsv)
- Change the names of the values for the "tx" option:
NM >> RefSeq
ENST >> ENSEMBL
- Add bugfix for exomiser use (don't use some badly formatted NCBI gene ID)
- Add bugfix allowing to use a configfile located in the same directory as the input file
July 30, 2020: AnnotSV version 2.4
- Update of the annotations sources (see the corresponding README section)
- Add of the COSMIC SV dataset (Cancer)
- AnnotSV now reports either RefSeq or Ensembl gene transcripts. Use the new "-tx" option to report either NM or ENST transcripts
=> The « NM » column has been renamed « tx »
=> The “RefGene” directory has been renamed "Genes"
- Add of the "Samples_ID" feature (report of the sample names for which the SV was called)
=> Can be disabled in the AnnotSV configfile
- Include the new "-externalGeneFiles" option to pass external gene file(s) path in the command line
- Integration of 4 Tcl packages (http/tar/csv/json) in the AnnotSV distribution
- Use of the “bcftools” toolset (Li, 2011) to fix a bug with multiallelic sites from VCF input file(s)
=> bcftools is now required if using VCF input file(s)
- Fix the output columns order not being the same depending on the system environment
- "1000g_AF" and "1000g_max_AF" features are not reported anymore
- Add bugfix concerning the CDS length and tx length calculation
- Add bugfix concerning annotation of 2 SV with the same coordinates but from different types (DEL, DUP...)
- Add bugfix with gzipped VCF files as input
- Add bugfix concerning the running of bash scripts (illegal use of | or |& in command)
- Add bugfix concerning the use of the "-snvIndelFiles" and "-candidateSnvIndelSamples" options
- Add bugfix to the Exomiser module
- Add bugfix concerning the AnnotSV installation when PREFIX is not the current directory
- Add bugfix concerning the use of a big "candidateGenesFile"
Dec 20, 2019: AnnotSV version 2.3
- Include phenotype-driven annotations (HPO), based on Exomiser (Smedley et al., 2015)
- Include the lift-over GRCh38 gnomAD SV frequency annotation
- Include the "-annotationsDir" option to pass the annotations directory to AnnotSV at run time
- New "AnnotSV ID" settings (to ensure unique SV identifiers)
- Deletion filtering improvement
- Integration of the gnomAD frequency data in the ranking
- AnnotSV can now create two other output files:
- A report of unannotated variants (e.g. badly formatted SV, variant length < SVminSize...)
- A report of the decisions that explain the ranking of each SV
- Change the names of the misleading following options:
vcfFiles >> snvIndelFiles
vcfPASS >> snvIndelPASS
vcfSamples >> snvIndelSamples
filteredVCFfiles >> candidateSnvIndelFiles
filteredVCFsamples >> candidateSnvIndelSamples
- Annotations are not distributed anymore with the sources but downloaded during the installation with the Makefile
- AnnotSV executable is now directly located in $ANNOTSV/bin to respect the FHS
- Add bugfix concerning the management of BED files
- Add bugfix for the report of the compound heterozygosity (1 SV + 1 SNV/indel)
- Add bugfix concerning the -candidateGenesFiltering option
- Add bugfix concerning the DGV metrics
- Add bugfix concerning the use of the "-sort" Linux command (whose behavior is OS dependant)
- Add bugfix concerning the use of "external gene annotation files"
- Add bugfix concerning the -txFile option
July 09, 2019: AnnotSV version 2.2
- AnnotSV follows now the Filesystem Hierarchy Standard (FHS). Installation can be easily done using a Makefile
- Include 2 new options: "-candidateGenesFiltering" to select the SV overlapping a gene from the "candidateGenesFile" (default = no)
"-rankFiltering" to select the SV of a user-defined specific class (from 1 to 5), default = "1-5"
- Users can now disable default annotation (through a configfile) provided by AnnotSV and only have user defined annotations
- AnnotSV is now available for the Mouse genome SV annotations
- Add the UTR/CDS's information from overlapping genes (location2 column)
- Add bugfix concerning the use of the "-candidateGenesFile" and "-reciprocal" options
- Add bugfix concerning the report of the SV length
- Add bugfix concerning the report of gene-based annotation on the full lines
Apr 18, 2019: AnnotSV version 2.1
- Include the gnomAD SV frequency annotation
- Include the Ira M. Hall’s lab SV frequency annotations
- Include GeneHancer annotation (an integrated compendium of human promoters, enhancers and their inferred target genes)
WARNING: not supplied as part of the AnnotSV sources. Users need to request the up-to-date GeneHancer data dedicated to AnnotSV
- Include 2 new options: "-overwrite" to overwrite existing output results (default = yes)
"-txFile" to specify a list of preferred genes transcripts to be used in priority during the annotation
- Default of the -SVinputInfo option is now set to 1 (the additional fields from the SV input file are reported in the outputfile)
- Large bed annotation files are presorted, in order to be compatible for server with low specifications
- Improve error messages and exit management (return a non-zero exit code in case of error or zero if all went fine)
- AnnotSV minimum requirement is now starting with Tcl 8.5
- Add bugfix concerning the homozygous and heterozygous SNV/indel counts within the SV to annotate
- Add bugfix for the SV ranking (when the -metrics option was set to "fr")
- Add bugfix concerning the "-reciprocal" option
Dec 21, 2018: AnnotSV version 2.0
- Add ranking/classification for SV in 5 classes (from benign to pathogenic)
- Include 12 additional annotations including:
-> the creation of a unique identifier for each SV
-> the SV length
-> the SV type (DEL, DUP, ...)
-> the SV ranking/classification
-> the OMIM morbid genes
-> the ClinGen Haploinsufficiency Score
-> the ClinGen Triplosensitivity Score
-> the ACMG genes
-> the CNV intolerance from ExAC
- Add of the "metrics" option to change numerical values to us or fr metrics (e.g. 0.2 or 0,2)
- Add bugfix concerning empty SV input file: return a non-zero exit status (1) to continue processing in a pipeline
- Modification of the directories structure for the annotation. Please look at the README file.
- Options: "SVfromDBoverlap", "FeaturesOverlap" and "SVtoAnnOverlap" have been replaced by "reciprocal" and "overlap"
- By default, AnnotSV now reports the additional fields from the SV BED input file
- Report of the input BED file header in the output
- Update of all annotation sources provided with AnnotSV
Sep 28, 2018: AnnotSV version 1.2
- Support the integration of user defined annotated regions imported from BED and/or TSV file(s) into the output file
- Include 3 additional annotations based on the dbVar pathogenic NR SV dataset:
-> The dbVar NR SV event types (e.g. deletion, duplication…)
-> The dbVar NR SV accession (e.g. nssv1415016)
-> The dbVar NR SV clinical assertion (e.g. pathogenic, likely pathogenic)
- Include 1 new option (-typeOfAnnotation) to configure the types of lines produced by AnnotSV (both, full or split)
- OutputFile extension will always be a “.tsv” (tab separated values) extension
- Add bugfix concerning large SV and TAD boundaries annotation
May 16, 2018: AnnotSV version 1.1.1
- Add bugfix concerning 1000g annotation (in some cases, an insertion could be not reported and AnnotSV stopped working properly)
Mar 20, 2018: AnnotSV version 1.1
- Add bugfix concerning counts of the homozygous and heterozygous variants in VCF files
- Support for new SV input file format: VCF file (4.3) can now be used to describe the SV to annotate (in addition to the BED format)
The "-bedFile" option has now been renamed "-SVinputFile"
The "-bedInfo" option has now been renamed "-SVinputInfo". Default is now set to 0 (the additional fields from the SV input file are not reported in the outputfile)
- Report additional information while counting variants in the SNV/indel input file(s):
-> The number of SNV/indel loaded
-> The number of SNV/indel not considered because of the “FILTER” column value
-> The number of SNV/indel not considered because of the absence of genotype information (“GT” value can be absent in bad VCF formatted files)
-> The number of SV present but not considered for that purpose (only SNV/indel are taken into account)
- Set the default value of the -vcfPASS option to 0 (to be non-restrictive and consider all variants in the VCF by default).
- Include 2 new options (-outputDir and -outputFile) to specify the output directory and file name
- Include 3 additional annotations based on the 1000 genomes phase 3 dataset:
-> The type of event (i.e. DEL, ALU, DUP, <CN3>...)
-> The global allele frequency
-> The maximum observed allele frequency across populations
- Include a new option (-SVminSize) to set the SV minimum size (in bp). Default is 50 (bp)
Dec 21, 2017: AnnotSV version 1.0