A simple pre-project in python with the handwritten text segmentation module in c++.
- Python 3.7
- openCV 3+
python main.py -c -p
or
python3 main.py -c -p
Specify an image
python main.py -c -p --image xxx.png
or
python3 main.py -c -p --image xxx.png
- Document Scanner
- Binarization with illumination compensation
- Line Segmentation with deslanting
- Word Segmentation
Process of detecting the predominant contour in the image and segment using a four-point transformation. [ref]
A technique for light compensation and sauvola binarization was applied, but others techniques was studied also.
- Implementation of the paper "Efficient Illumination Compensation Techniques for text images", Guillaume Lazzara and Thierry Géraud, 2014. [ref]
- Niblack, Sauvola and Wolf binarizations. [ref]
-
Implementation of the paper "A Statistical approach to line segmentation in handwritten documents", Manivannan Arivazhagan, Harish Srinivasan and Sargur Srihari, 2007. [ref]
-
Deslanting image. [ref]
- Implementation of the paper "Scale Space Technique for Word Segmentation in Handwritten Documents", R. Manmatha and N. Srimal, 1999. [ref]
Binary image
Image lines
First line/words segment