CHANGELOG
This changelog concerns the algorithmic part (C++) of Vidjil.
2016-09-30 The Vidjil Team
* New default trim option (-t 0) for germlines, leading to better results on some special recombinations (vidjil.cpp)
* More flexible option to keep sequences of interest (-W), accepting sequences of any size (core/windows.cpp)
* Better generation of consensus sequences, marking ambiguous positions with Ns (core/representative.cpp)
* Better recognition of standard recombinations by lowering incomplete germline attractiveness (core/segment.cpp)
* Better TRD locus analysis, including some Dd-Dd-Jd recombinations (data/germlines.data)
* Reduced unused similarity section of .json file output (vidjil.cpp)
* New json output with -c segment (vidjil.cpp)
* Updated help (doc/algo.org)
* New and updated unit and functional tests, updated test process
2016-08-06 The Vidjil Team
* New computation of average quality for each representative sequence, inside its window (core/representative.cpp)
* New option (-E) to set the e-value for FineSegmenter D segment detection (vidjil.cpp)
* Safer e-value estimation for D segments (vidjil.cpp)
* Raised capacity to 2*10^9 reads (core/segment.cpp, core/fasta.cpp)
* Better filtering/debug options (-u/-uu/-uuu), keeping interesting reads (vidjil.cpp, core/windowExtractor.cpp)
* Better .vdj.fa header for irregular/incomplete recombinations, 1-based positions (core/segment.cpp)
* Bugs closed (-u files, similarity in .vidjil)
* New and updated unit and functional tests, checking again .should-vdj tests
2016-07-13 The Vidjil Team
* New json 2016b format, renamed fields 'seg.{5,4,3,evalue}' and 'diversity', 1-based positions (core/segment.cpp)
* New tool to display and debug alignments between a read and selected V/J genes (tools/vdj_assign.cpp)
* Better test structure and process. The should-vdj tests will soon be moved to their own repository.
* Bugs closed (build process, affine gaps (core/dynprog.cpp), JUNCTION/CDR3 detection (core/segment.cpp))
* New and updated functional tests
2016-03-04 The Vidjil Team
* Better JUNCTION/CDR3 detection (-3), based on positions of Cys104 and Phe118/Trp118 (core/segment.cpp)
* Bugs closed (.vidjil output of revcomp'd sequences, computation of Simpson index Ds)
* Streamlined structure of tests directory
* New and updated unit and functional tests
2016-02-08 The Vidjil Team
* New default threshold (-z 100), fully analyzing more clones
* New computation of diversity indices (Shannon, Simpson) (core/windows.cpp)
* Streamlined KmerSegmenter test (core/affectanalyser.cpp)
* Safer p-value estimation in KmerSegmenter, taking into account the central region (core/affectanalyser.cpp)
* New experimental VDDJ detection analysis (-d) (core/segment.cpp)
* Better FineSegmenter V(D)J designation for inverted genes in unexpected recombination analysis (core/segment.cpp)
The unexpected recombination analysis (-2) can now safely be used.
* Streamlined FineSegmenter handling of V, D and J segments (core/segment.cpp)
* Faster FineSegmenter (~30%), relying on the KmerSegmenter to select the correct strand (core/segment.cpp)
* Updated germline genes from IMGT/GENE-DB
* Bugs closed (.vidjil output for sequences beyond the -z threshold)
* New and updated unit and functional tests
2015-12-22 The Vidjil Team
* Better KmerSegmenter, rejecting reads with large alternating V/J zones (core/segment.cpp)
* Better unsegmentation causes, 'UNSEG only V/J' needs significant V/J fragments (core/segment.cpp)
* Renamed unsegmentation messages to "only V/5'" and "only J/3'" to avoid confusion (core/segment.h)
* Better unexpected recombination analysis (-2), with a FineSegmenter V(D)J assignation (core/segment.cpp)
* Faster FineSegmenter (~35%), computing roughly half of the dynamic programming matrix (core/segment.cpp)
* More flexible handling of badly formatted .fastq files (core/fasta.cpp)
* New filtering/debug option (-uu), sort unsegmented reads (core/windowExtractor.cpp)
* Streamlined json output, removing short codes (vidjil.cpp, core/segment.cpp)
* Updated and refactored help (doc/algo.org)
* Updated unit and functional tests
* Bugs closed (build process, -X with large numbers, -x/-X combined with -c segment, stdout messages)
2015-10-08 The Vidjil Team
* Bug closed (V(D)J assignation when V/D/J segments are close on the negative strand) (core/segment.cpp)
2015-10-05 The Vidjil Team
* Better FineSegmenter V(D)J assignation, especially when V/D/J segments are close (core/segment.cpp)
* Better e-value computation for FineSegmenter V(D)J assignation (core/segment.cpp)
* Renamed 'UNSEG too few J/V' to 'UNSEG only V/J' to better reflect the actual detection of one part
* Refactored tests for V(D)J assignation, allowing flexible patterns, new documentation (doc/should-vdj.org)
* New tests on sequences with manually curated V(D)J assignations (tests/should-vdj-tests)
* Bugs closed (no more spurious D in some VJ recombinations, corrected number of deletions around D regions)
2015-07-21 The Vidjil Team
* New flexible parameterization of analyzed recombinations through a json file (germline/germlines.data)
* New experimental unexpected recombination analysis (-4 -e 10) (core/segment.cpp)
* New threshold for FineSegmenter VJ assignation, with at least 10 matches (core/germline.cpp)
* Streamlined handling of segmentation methods (core/germline.h, core/segment.cpp)
* Updated distance matrix computation between all clones (core/similarityMatrix.cpp)
* New nlohmann json library to parse and write json files (lib/json.h)
* Updated build process, now requiring a C++11 compiler
* New draft developer documentation (doc/dev.org), updated help and user documentation
* New and updated unit and functional tests, bugs closed in shouldvdj tests
2015-06-05 The Vidjil Team
* New default trim option (-t 100), considering only the relevant ends of the germline genes
* Better segmentation heuristic when there are few k-mers (core/segment.cpp)
* Better TRA/TRD locus analysis (-i), including both Vd-(Dd)-Ja and Dd-Ja recombinations (core/germline.cpp)
* Better incomplete TRD and Dh/Jh analysis (-i), including up/downstream region of D genes (core/germline.cpp)
* Better computation of the reference length for the coverage information, considering all reads of each clone
* Better unexpected recombination analysis (-2), with information on the locus used (core/segment.cpp)
* Streamlined again unsegmentation causes and removed delta_{min,max} for the heuristic (core/segment.cpp)
* Refactored stats computation (core/read_storage.cpp)
* Updated build process
The next release will require a C++11 compiler. Static binaries will also be distributed.
* Updated help (unsegmentation causes, clustering options, -m option)
* Bugs closed (segmentation with k-mers on the negative strand, e-value computation, clone output on stdout)
* New and updated unit and functional tests, and faster functional tests
2015-05-08 The Vidjil Team
* New default e-value threshold (-e 1.0), improving the segmentation heuristic
* New default for window length (-w 50), even with -D or with -g, streamlining the window handling
* Better multi-germline analysis (-g), selecting the best locus on the e-value (core/segment.cpp)
* New experimental trim option (-t), considering only the relevant ends of the germline genes (core/kmerstore.h)
* Streamlined unsegmentation causes, including 'too short for w(indow)'
* Updated main and debug output
* Updated help (algo.org, and new locus.org)
* New option to keep only reads with labeled windows (-F), new combo (-FaW) to filter reads by window
* Bugs closed (non symmetrical seeds and revcomp)
* New and updated unit and functional tests
2015-04-09 The Vidjil Team
* New experimental e-value threshold (-e) (core/segment.cpp)
* New experimental unexpected recombination analysis (-2) (core/segment.cpp)
* New preview/debug options to stop after a given number of reads (-x), possibly sampled throughout the file (-X)
* New preview/debug option to output unsegmented reads as clones (-!)
* New progress bar during computation
* Better memory management for the reads taken into account for the representative (core/read_storage.cpp)
* Updated .json output, with k-mer affectation results
* Updated .json output, with the 'coverage' of the representative (core/representative.cpp)
* Removed unused code parts as well as some files
* Updated help, separating basic (-h) and advanced/experimental options (-H)
* Bugs closed (extended nucleotides and revcomp)
* New and updated unit and functional tests
2015-03-04 The Vidjil Team
* Better multi-germline analysis (-g), returning the best locus for each read (core/segment.cpp)
The incomplete rearrangement analysis (-i) can now safely be used.
The speed of this multi-germline analysis will be improved in a next release.
* Faster representative computation (core/read_chooser.cpp)
* Included tools (tools/*.py) to process .vidjil files
* New statistics on the number of clones (core/germline.cpp)
* New experimental CDR3 detection (-3). We still advise using IMGT/V-QUEST for better and complete results.
* Better debug option -K (.affect), especially in the case of multi-germline analysis
* Refactored dynamic programming computations (core/dynprog.cpp), experimental affine gaps
* Removed unused code parts as well as some files
* Streamlined flag processing in Makefiles
* New mechanism for some functional tests (make shouldvdj)
* New experimental tests from generated recombinations (make shouldvdj_generated)
* New and updated unit and functional tests
2015-01-31 The Vidjil Team
* Better TRG and TRD+ parameters (-g). The multi-germline analysis will again be improved in a next release.
* New experimental option (-I) to discard common kmers between different germlines (core/germline.cpp)
* Updated outputs for better traceability (version in .json, germlines on stdout)
* New mechanism to retrieve germline databases (germline/get-saved-germline)
2014-12-22 The Vidjil Team
* Better multi-germline analysis (-g). This will again be improved in a next release.
* New experimental incomplete rearrangement analysis (-i)
* New and updated unit and functional tests
* Bugs closed (-w 40 when no D germline)
2014-11-28 The Vidjil Team
* New input method, now accepts compressed fasta files with gzip (core/fasta.cpp, gzstream/zlib)
* Better multi-germline analysis (-g) and documentation. This analysis can now safely be used.
* Streamlined input. Option -d is removed, and a germline is required (-V/(-D)/-J, or -G, or -g)
* Removed unused code parts as well as some files
* New and updated unit and functional tests - now more than 80% code coverage
* New public continuous integration - travis, coveralls
* Bugs closed (-l, large -r)
2014-10-22 The Vidjil Team
* Streamlined filtering options (-r/-y/-z), better documented (doc/algo.org)
* Streamlined output files, option to fix their basename (-b)
* Updated .data .json output, now in the better documented 2014.10 format (doc/format-analysis.org)
* New experimental multi-germline analysis (-g). This will be improved and documented in a next release.
* Faster FineSegmenter with a better memory allocation (core/dynprog.cpp)
* Refactored main vidjil.cpp, objects storing germlines and statistics (core/germline.cpp, core/stats.cpp)
* Transferred clustering from clone output to information in .data, again simplifying vidjil.cpp
* Removed unused code parts as well as some files
* New and updated unit and functional tests
* Bugs closed
2014-09-23 The Vidjil Team
* Export cause of non-segmentation in the .data
* New option to output segmented reads (-U). By default, segmented reads are no longer output one by one
* Updated .data .json output (the format will change again in a next release)
* Updated tests
2014-07-28 The Vidjil Team
* Better heuristic, segment more reads (core/affectanalyser.h, core/segment.cpp)
This improved heuristic was designed to implement a multi-germline analysis in a next release.
* Improved computation of the heuristic affectation. Halves the time of -c windows (core/kmerstore.h)
* New command '-c germlines', discovering germlines (vidjil.cpp)
* New unit tests, updated some tests
* Updated .json output (experimental distance graph)
* Bugs closed
2014-03-27 The Vidjil Team
* Better default seed selection, depending on the germline, segments more reads (vidjil.cpp)
* Better selection of representative read (core/representative.cpp)
* New option to output all clones (-A), for testing purposes
* Updated debug option (-u) to display k-mer affectation (core/windowExtractor.cpp)
* New unit tests, updated some tests
* Improved management of dependencies (Makefile)
* Improved documentation and comments on main stdout
2014-02-20 The Vidjil Team
* Refactored main vidjil.cpp (core/windows.cpp, core/windowExtractor.cpp)
* Removed unused html output
* Better json output (core/json.cpp)
* Updated main stdout, with representative sequence for each clone
* Updated parameters for FineSegmenter (delta_max) and dynprog (substitution cost)
* Bugs closed
2013-10-07 The Vidjil Team
* Better heuristic, segments more reads (core/segment.cpp)
* Better and faster selection of representative read (vidjil.cpp, core/read_chooser.cpp)
* Better display of reason of non-segmenting reads
* New normalization against a standard (-Z) (core/labels.cpp)
* New experimental lazy_msa multiple aligner
* New .json output
* New unit tests
* Bugs closed
2013-07-03 The Vidjil Team
* New selection of representative read (core/read_chooser.cpp)
* Faster spaced seed computation (core/tools.cpp)
* New unit tests
* Bugs closed
2013-04-18 The Vidjil Team
* First public release