Preprocessing Photos of Receipts for Recognition

Wojciech Korobacz, Marek Tabędzki

@article{korobacz2018preprocessing,
	author = {Korobacz, Wojciech and Tabędzki, Marek},
	title = {{Preprocessing photos of receipts for recognition}},
	journal = {Advances in Computer Science Research},
	number = {14},
	year = {2018},
	issn = {2300-715X},
	doi = {10.24427/acsr-2018-vol14-0006}
}

Pipeline

| Receipt detection | Receipt localization | Receipt normalization | Text line segmentation | Optical character recognition | Semantic analysis |
| --- | --- | --- | --- | --- | --- |
| ❌ | ✔️ | ✔️ | ❌ | ❗ | ❌ |

Receipt localization

By text outline detection:
- Grayscale conversion
- Smoothing and histogram equalization
- Pre-rotating the image:
  - First method (the one used here):
    - Binarization with the gradient method (mathematical morphology)
    - Hough transform, to make the text lines horizontal
  - Second method (also tested):
    - Binarization by adaptive thresholding
    - Denoising
    - Vertical histogram computed for rotation angles from -10 to 10 degrees
- Detecting the entire text outline and marking it in the image:
  - One of:
    - Canny edge detection (gave better results)
    - High-pass filter based on the Sobel operator
  - Morphological erosion
- Image cropping:
  - Find all contours in the input image
  - Filter the found contours
  - Find the rectangle circumscribed on each remaining contour
  - Look for the rectangle circumscribed on the whole set of contours
  - Take it as the minimum rectangle containing all of the per-contour rectangles
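
The cropping step above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: rectangles are assumed to be `(x, y, width, height)` tuples with a top-left origin, the names `filter_rects`, `enclosing_rect` and `crop` are hypothetical, and the minimum-area filter is an assumed criterion, since the notes do not say how the contours were filtered.

```python
from typing import Iterable, List, Tuple

# Assumed layout: (x, y, width, height), origin in the top-left corner.
Rect = Tuple[int, int, int, int]


def filter_rects(rects: Iterable[Rect], min_area: int) -> List[Rect]:
    """Drop rectangles smaller than min_area (an assumed filter criterion)."""
    return [r for r in rects if r[2] * r[3] >= min_area]


def enclosing_rect(rects: Iterable[Rect]) -> Rect:
    """Return the smallest axis-aligned rectangle containing every input rectangle."""
    rects = list(rects)
    if not rects:
        raise ValueError("no rectangles to enclose")
    left = min(x for x, _, _, _ in rects)
    top = min(y for _, y, _, _ in rects)
    right = max(x + w for x, _, w, _ in rects)
    bottom = max(y + h for _, y, _, h in rects)
    return (left, top, right - left, bottom - top)


def crop(image: List[List[int]], rect: Rect) -> List[List[int]]:
    """Cut the rectangle out of an image stored as a list of pixel rows."""
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]


# Bounding rectangles of three text lines plus one small noise blob.
rects = [(12, 8, 40, 6), (10, 20, 55, 6), (30, 31, 2, 2), (14, 40, 30, 7)]
outline = enclosing_rect(filter_rects(rects, min_area=20))
print(outline)  # (10, 8, 55, 39)
```

Filtering before taking the union matters: a single stray contour far from the text would otherwise stretch the crop rectangle to include background.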

Receipt normalization

Thinning with the K3M skeletonization algorithm was tested but not used, because it gave worse results with a stock OCR engine.

Optical character recognition

ABBYY FineReader

Notes

The authors had mainly difficult cases in mind: photos taken freehand in unfavorable lighting conditions.
The difficulties include inhomogeneous lighting, cropping, varying camera angles, non-linear distortions, and poor sharpness.
The following characteristics of the samples were considered:
- Cropping – whether the entire receipt is visible, how much background is in the picture,
- Lighting – it can be artificial or natural, strong or weak, shadows can be seen on the receipt,
- Sharpness – whether the photo is sharp or blurred,
- Angle of rotation – how much the photo deviates from the vertical position,
- Folds – the receipt may be curled or folded.
Binarization methods tested:
- Otsu's method
  - Otsu's global method copes well with clear, sharp images with good lighting.
- Two adaptive methods:
  - In the first method, the threshold value T is the mean of the pixel intensities in the observation window.
  - In the second method, T is a weighted sum of this neighborhood (cross-correlation with a Gaussian window).
  - In the adaptive methods, the text becomes less readable as the observation window grows.
- Histogram equalization loses some information, which introduced disturbances and caused problems in binarization. Smoothing had a positive effect on the result only for low sigma values.
- In a visual comparison, the adaptive method with equal weights in the observation window (the mean) performed best.
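
The difference between the global and the adaptive approach can be illustrated with a minimal pure-Python sketch. It is not the authors' code: images are assumed to be lists of rows of 0..255 integers, output pixels are 1 for paper and 0 for ink, and the function names and the `offset` parameter are hypothetical. Only Otsu's method and the mean-based adaptive variant are shown; the Gaussian variant would replace the plain mean with a weighted one.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, values 0..255 (assumed layout)


def otsu_threshold(image: Image) -> int:
    """Global threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]           # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0         # pixels with value > t
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize_global(image: Image, t: int) -> Image:
    """One threshold for the whole image."""
    return [[1 if v > t else 0 for v in row] for row in image]


def binarize_adaptive_mean(image: Image, window: int, offset: float = 0) -> Image:
    """Per-pixel threshold T = mean of the window around the pixel, minus offset.

    The window is clipped at the image borders; offset is an assumed tuning knob.
    """
    h, w = len(image), len(image[0])
    r = window // 2
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            vals = [image[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            t = sum(vals) / len(vals) - offset
            out_row.append(1 if image[y][x] > t else 0)
        out.append(out_row)
    return out


# One image row whose background darkens from 200 to 110; ink is 60 levels darker.
row = [200, 140, 200, 170, 110, 170, 140, 80, 140, 110, 50, 110]
ink_mask = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1]  # 0 marks the ink pixels

print(binarize_adaptive_mean([row], window=3) == [ink_mask])
print(binarize_global([row], otsu_threshold([row])) == [ink_mask])
```

In this toy row the ink under strong light (140) has the same intensity as the paper under weaker light (140), so no single global threshold can separate them, while the local mean can.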
