Releases: OCR-D/ocrd_tesserocr
v0.7.0
Added:
- segment-table: new processor that adds table cells as text regions, #104
- raw_lines option, #104
- interpret overwrite_regions more consistently, #104
- annotate @orientation (independent of dedicated deskewing processor) for vertical and @type for all other text blocks, #104
- no separators or noise regions in reading order, #104
Changed:
v0.6.0
v0.5.1
v0.4.1
- Adapt to feature selection/filtering mechanism for derived images in core
- Fixes for image-feature-related corner cases in crop and deskew
- Use explicit (second) output fileGrp when producing derived images
- Upgrade to upstream tesserocr 2.4.1
- Use OCR-D core >= stable 1.0.0
v0.5.0
v0.4.0
v0.3.0: implement AlternativeImage-based processing:
Changed:
- Use basename of input file for output name
- Use .xml filename extension for PAGE output
- Warn about existing border or regions in crop
- Use PSM.SPARSE_TEXT without tables in crop
- Filter unreliable regions in crop
- Add padding around border in crop
- Delete existing regions in segment_region
- Cover vertical text and tables in segment_region
- Add parameter find_tables in segment_region
- Add parameter crop_polygons in segment_region
- Add parameter overwrite_regions in segment_region
- Add parameter overwrite_lines in segment_line
- Add parameter overwrite_words in segment_word
- Add page/region-level processor deskew
- Add page/region/line-level processor binarize
- Respect AlternativeImage on all levels
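
As a rough illustration of how the new segment_region parameters might be passed, the sketch below writes a parameter file and assembles a processor call without running it. Only the parameter names come from the list above. The boolean values, the executable name ocrd-tesserocr-segment-region, the -I/-O/-p options and the fileGrp names are assumptions based on common OCR-D command-line conventions, so check them against the installed version.

```python
import json
import tempfile
from pathlib import Path

# Parameter names are taken from the v0.3.0 list; their boolean type
# and the values chosen here are assumptions for illustration only.
params = {
    "find_tables": True,
    "crop_polygons": False,
    "overwrite_regions": True,
}


def build_command(param_file, input_grp, output_grp):
    """Assemble (but do not run) a processor call as an argument list."""
    return [
        "ocrd-tesserocr-segment-region",  # assumed executable name
        "-I", input_grp,                  # assumed: input fileGrp option
        "-O", output_grp,                 # assumed: output fileGrp option
        "-p", str(param_file),            # assumed: JSON parameter file option
    ]


# Write the parameters to a JSON file in a scratch directory.
workdir = Path(tempfile.mkdtemp())
param_file = workdir / "segment_region.json"
param_file.write_text(json.dumps(params, indent=2))

# The fileGrp names below are hypothetical placeholders.
command = build_command(param_file, "OCR-D-IMG", "OCR-D-SEG-REGION")
print(" ".join(command))
```

Building the call as a list rather than a shell string keeps the sketch safe to hand to subprocess.run later, should the assumed options match the installed processor.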
v0.2.2
Changed:
- Add simple page cropping processor crop
- Respect border cropping in segment_word
- Add parameter overwrite_words in recognize
- Make higher TextEquivs consistent after recognize
Fixed:
- Remove invalid @externalRef from MetadataItem
- Retain pageId in output (i.e. link to structMap)