@article{korobacz2018preprocessing,
author = {Korobacz, Wojciech and Tabędzki, Marek},
title = {{Preprocessing photos of receipts for recognition}},
journal = {Advances in Computer Science Research},
number = {14},
year = {2018},
issn = {2300-715X},
doi = {10.24427/acsr-2018-vol14-0006}
}
Receipt detection | Receipt localization | Receipt normalization | Text line segmentation | Optical character recognition | Semantic analysis |
---|---|---|---|---|---|
❌ | ✔️ | ✔️ | ❌ | ❗ | ❌ |
- By text outline detection:
  - Grayscale conversion
  - Smoothing and histogram equalization
  - Pre-rotating the image:
    - First method (the one used here):
      - Binarization with the gradient method (mathematical morphology)
      - Hough transform, to make the text lines horizontal
    - Second method (also tested):
      - Binarization by adaptive thresholding
      - Denoising
      - Vertical histogram computed for rotations from -10 to 10 degrees
  - Detecting the entire text outline and marking it in the image:
    - One of:
      - Canny edge detection (gave better results)
      - High-pass filter based on the Sobel operator
    - Morphological erosion
  - Image cropping:
    - Find all contours in the input image
    - Filter the contours found
    - Find the rectangles circumscribed on the individual contours
    - Find the rectangle circumscribed on the whole set of contours
      - i.e. the minimum rectangle containing the set of rectangles
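
Two steps from the list above are small enough to sketch in plain Python: the histogram-based rotation search (the second pre-rotation method) and the final enclosing rectangle used for cropping. This is only an illustration, not the authors' code. The function names, the sum-of-squares sharpness score, and the `min_area` contour filter are assumptions. The notes do not say how contours were filtered, and "vertical histogram" is read here as a count of text pixels per image row.

```python
import math


def estimate_skew(pixels, angles=range(-10, 11)):
    """Return the candidate angle (in degrees) with the sharpest row histogram.

    pixels: list of (x, y) coordinates of text pixels in a binarized image.
    angles: candidate rotations to try; the notes mention -10 to 10 degrees.
    """
    best_angle, best_score = 0, -1
    for angle in angles:
        a = math.radians(angle)
        rows = {}
        for x, y in pixels:
            # Row index of the pixel after rotating the image by -angle.
            r = round(y * math.cos(a) - x * math.sin(a))
            rows[r] = rows.get(r, 0) + 1
        # Assumed score: horizontal text lines concentrate pixels in few rows,
        # which maximizes the sum of squared row counts.
        score = sum(c * c for c in rows.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def enclosing_rect(rects, min_area=50):
    """Return the minimum (x, y, w, h) rectangle containing all kept rectangles.

    rects: bounding rectangles of contours as (x, y, w, h) tuples.
    min_area: hypothetical filter that drops tiny (noise) contours.
    """
    kept = [(x, y, w, h) for x, y, w, h in rects if w * h >= min_area]
    if not kept:
        return None
    left = min(x for x, _, _, _ in kept)
    top = min(y for _, y, _, _ in kept)
    right = max(x + w for x, _, w, _ in kept)
    bottom = max(y + h for _, y, _, h in kept)
    return (left, top, right - left, bottom - top)
```

`estimate_skew` returns the tilt of the text lines in image coordinates (y pointing down), so rotating the image by the negative of that angle should level them. `enclosing_rect` gives the crop window once the per-contour rectangles are known.
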
- Thinning (K3M skeletonization algorithm): tested but not used, because it gave worse results with the stock OCR
- ABBYY FineReader
- The authors had mainly difficult cases in mind – photos taken freehand in unfavorable lighting conditions.
  - Inhomogeneous lighting conditions, cropping, different angles of images taken, non-linear distortions and sharpness of images
- The following characteristics of the samples were considered:
  - Cropping – whether the entire receipt is visible, how much background is in the picture
  - Lighting – it can be artificial or natural, strong or weak, shadows can be seen on the receipt
  - Sharpness – whether the photo is sharp or blurred
  - Angle of rotation – how much the photo deviates from the vertical position
  - Folds – the receipt may be curled or folded
- Binarization methods tested:
  - Otsu method
    - Otsu's global method copes well with clear, sharp images with good lighting.
  - Two adaptive methods:
    - In the first method, the threshold value T is the mean of the pixel intensities in the observation window.
    - In the second, it is a weighted sum (cross-correlation with a Gaussian window) of that neighborhood.
    - In both adaptive methods, the text becomes less readable as the observation window grows.
- Histogram equalization loses some information, which introduced disturbances and caused problems in binarization. Smoothing improved the result only for low sigma values.
- Comparing visually, the adaptive method with equal weights in the observation window gave the best results.
- Otsu method
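
The two families of thresholds compared above can be sketched in plain Python on a grayscale image stored as a list of rows. This is a minimal illustration, not the implementation used in the paper. The function names and the offset `c` subtracted from the local mean are assumptions; the notes only say that T is the mean of the observation window.

```python
def otsu_threshold(gray):
    """Global Otsu threshold for a 2D list of integers in 0..255.

    Returns t such that splitting pixels into <= t and > t maximizes
    the between-class variance.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class
    sum0 = 0    # sum of intensities in the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def adaptive_mean_binarize(gray, window=15, c=0):
    """Binarize with T = mean of the observation window minus c (c is assumed).

    Returns a 2D list where text pixels are 0 and background pixels are 255.
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    # Integral image so that every window sum costs four lookups.
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            area = (y1 - y0) * (x1 - x0)
            s = integ[y1][x1] - integ[y0][x1] - integ[y1][x0] + integ[y0][x0]
            # Same test as gray < mean - c, kept in integers.
            if gray[y][x] * area < s - c * area:
                out[y][x] = 0
    return out
```

The Gaussian-weighted variant would replace the plain window sum with a weighted one. Under uneven lighting a single global threshold cannot follow the brightness gradient, whereas the local mean does, which is consistent with the visual comparison reported above.
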