This Python script performs Optical Character Recognition (OCR) on an image using the Tesseract OCR engine. The script uses the pytesseract
library along with the OpenCV and Pillow libraries for image processing and display.
- Python 3.x
- Tesseract OCR installed (Tesseract)
- Required Python packages: pytesseract, Pillow, opencv-python
- Install the required Python packages:
    pip install pillow pytesseract opencv-python
- Set the Tesseract executable path in the script:
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
  Replace the path with the correct location of your Tesseract executable.
- Adjust the filename variable to point to the image you want to process.
- Run the script:
    python image_ocr.py
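
The hard-coded Windows path in the setup step can be replaced by a lookup, so the same script also runs where Tesseract is already on the PATH. The following is a minimal sketch, not part of the original script: the helper name find_tesseract and the fallback order are assumptions made for illustration.

```python
import os
import shutil

# Default install location quoted in the setup step above (Windows only).
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(explicit_path=None):
    """Return the first existing Tesseract executable path, or None.

    Hypothetical helper: checks an explicitly given path first, then the
    system PATH, then the default Windows install location.
    """
    candidates = [explicit_path, shutil.which("tesseract"), DEFAULT_WINDOWS_PATH]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate):
            return candidate
    return None
```

A script could then set pytesseract.pytesseract.tesseract_cmd to the returned path when it is not None, and otherwise print a hint to install Tesseract.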
- Extracts text from an image using Tesseract OCR.
- Filters out recognized text with a confidence score below 40 (Tesseract reports confidence on a 0-100 scale).
- Displays the image with bounding boxes around recognized text.
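
The confidence filter and bounding boxes listed above can be sketched as follows. This is an illustrative sketch, not the script's actual code: the function names confident_words and annotate_image are hypothetical, and it assumes the word data comes from pytesseract.image_to_data with output_type=Output.DICT.

```python
def confident_words(data, min_conf=40):
    """Return (text, left, top, width, height) for words with conf >= min_conf.

    `data` is expected to be a dict of parallel lists with the keys
    "text", "conf", "left", "top", "width" and "height".
    """
    words = []
    for i, text in enumerate(data["text"]):
        # Depending on the pytesseract version, conf may be a str, int or float.
        conf = float(data["conf"][i])
        if conf < min_conf or not text.strip():
            continue  # drop low-confidence and empty entries
        words.append((text, data["left"][i], data["top"][i],
                      data["width"][i], data["height"][i]))
    return words


def annotate_image(filename, min_conf=40):
    """Run OCR on an image file and draw a box around each confident word."""
    # Imported here so confident_words stays usable without these packages.
    import cv2
    import pytesseract
    from pytesseract import Output

    image = cv2.imread(filename)
    if image is None:
        raise FileNotFoundError(filename)
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
    data = pytesseract.image_to_data(rgb, output_type=Output.DICT)
    for _text, x, y, w, h in confident_words(data, min_conf):
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    return image
```

The annotated image returned by annotate_image can be shown with cv2.imshow followed by cv2.waitKey(0).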

This Python script performs Optical Character Recognition (OCR) on a video file using the Tesseract OCR engine. The script uses the pytesseract,
opencv-python, and Pillow libraries for OCR, video processing, and display.
- Python 3.x
- Tesseract OCR installed (Download Tesseract)
- Required Python packages: pytesseract, Pillow, opencv-python
- Install the required Python packages:
    pip install pillow pytesseract opencv-python
- Set the Tesseract executable path in the script:
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
  Replace the path with the correct location of your Tesseract executable.
- Adjust the video_path variable to point to the video file you want to process.
- Run the script:
    python video_ocr.py
- Press 'q' to exit the video display.
- Processes a video file and performs OCR on selected frames.
- Skips frames to speed up processing (adjustable with skip_frames).
- Displays the video with annotated text using bounding boxes.
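
The frame-skipping behaviour can be sketched as follows. This is a sketch under stated assumptions rather than the script's actual loop: it treats skip_frames as the number of frames skipped between two OCR passes, reuses the most recent boxes on skipped frames, and the names should_ocr and play_with_ocr are hypothetical.

```python
def should_ocr(frame_index, skip_frames):
    """Return True if OCR should run on this frame.

    Assumption: skip_frames frames are skipped after every processed frame,
    so skip_frames=0 processes every frame.
    """
    return frame_index % (skip_frames + 1) == 0


def play_with_ocr(video_path, skip_frames=5):
    """Display a video, running OCR only on selected frames."""
    # Imported here so should_ocr stays usable without these packages.
    import cv2
    import pytesseract
    from pytesseract import Output

    capture = cv2.VideoCapture(video_path)
    boxes = []  # boxes from the most recent OCR pass
    frame_index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # end of video or read error
            if should_ocr(frame_index, skip_frames):
                rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                data = pytesseract.image_to_data(rgb, output_type=Output.DICT)
                boxes = [
                    (data["left"][i], data["top"][i],
                     data["width"][i], data["height"][i])
                    for i, text in enumerate(data["text"])
                    if text.strip()
                ]
            for x, y, w, h in boxes:
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.imshow("Video OCR", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break  # 'q' exits the display
            frame_index += 1
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

A larger skip_frames value means fewer OCR calls and smoother playback, at the cost of boxes that lag behind fast-changing text.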