Extractor

Features

Extract text from images or documents using Azure AI Computer Vision.
Translate the extracted English text into multiple languages with Azure AI Translator.
Supports JPEG and PNG images as well as PDF documents.
User-friendly interface for uploading images and documents.
Seamless integration with Streamlit for interactive usage.

Project Details

Azure Services Used

This project uses three Azure services:

Azure AI Services | Computer Vision
Azure AI Services | Translator
Azure App Service

Brief description of the services used:

Azure AI Services
- Azure AI Translator: translates the extracted text into multiple languages.
- Azure Computer Vision: performs optical character recognition (OCR) to extract text from images or documents.
Azure App Service: hosts the Streamlit app on Azure.

Azure AI Services | Computer Vision

In this project, Computer Vision performs text extraction on images (PNG, JPEG) and on PDF files. It is the component that turns scanned documents or images into text that is available for further processing within the application.
The application accepts documents or images written in English and sends them to the service, which extracts the text from them.
Inside Computer Vision Studio, under Optical Character Recognition, the Extract Text from Image feature is used to read even complicated written text from the different formats.
The image below shows an example of text extracted from a PDF file, in JSON format.
After the text is extracted, it is sent line by line, inside a for loop, to the Azure Translator API.
The image below shows the Computer Vision resource in the Azure portal:
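As an illustration of the JSON output mentioned above, here is a minimal sketch of how the text lines could be pulled out of a Read (OCR) result. It assumes the result follows the Read API v3.2 layout (`status`, then `analyzeResult.readResults[].lines[].text`); the function name `lines_from_read_result` and the sample data are hypothetical and are not taken from azure_ocr.py.

```python
from typing import Any


def lines_from_read_result(result: dict[str, Any]) -> list[str]:
    """Collect the recognized text lines, page by page, from a Read result.

    Assumes the Read API v3.2 JSON layout; adjust the keys if your
    service version returns a different shape.
    """
    if result.get("status") != "succeeded":
        raise ValueError(f"Read operation not finished: {result.get('status')!r}")
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


# Hypothetical sample, trimmed to the fields the function reads.
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [{"text": "Hello world"}, {"text": "Second line"}]},
            {"page": 2, "lines": [{"text": "Page two"}]},
        ]
    },
}

print(lines_from_read_result(sample))
```

The returned list keeps the page and line order, which is the order in which the lines would then be passed to the translation step.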

Azure AI Services | Translator

This service makes the application multilingual and accessible to a global audience. It translates the extracted English text into multiple languages, so users can read the content regardless of their language preference. This is especially valuable in applications where content needs to be translated or localized.
The extracted text is sent to the service and translated line by line from English into the language selected in the application.
The image below shows an example of the English text translated into Spanish; in the same way, the user can translate from English into any other available language. It is the translation of the previous example, in JSON format.
The image below shows the Translator resource in the Azure portal:
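The line-by-line translation described above could look roughly like the following sketch. It assumes the public Translator v3.0 REST endpoint and its standard subscription headers; the helper names `build_translate_request` and `translated_text`, the placeholder key, and the region are hypothetical and are not taken from azure_translator.py.

```python
from typing import Any

# Global Translator endpoint (assumption: the resource uses the global endpoint).
TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(text: str, to_lang: str, key: str, region: str) -> dict[str, Any]:
    """Build the keyword arguments for one English-to-`to_lang` request."""
    return {
        "url": TRANSLATOR_ENDPOINT,
        "params": {"api-version": "3.0", "from": "en", "to": to_lang},
        "headers": {
            "Ocp-Apim-Subscription-Key": key,
            "Ocp-Apim-Subscription-Region": region,
            "Content-Type": "application/json",
        },
        # The request body is a JSON array of text items.
        "json": [{"text": text}],
    }


def translated_text(response_json: list[dict[str, Any]]) -> str:
    """Return the first translation of the first text item in a response."""
    return response_json[0]["translations"][0]["text"]


# Placeholder credentials, for illustration only.
request = build_translate_request("Hello world", "es", key="<your-key>", region="<your-region>")
print(request["params"])

# Hypothetical response body in the v3.0 shape.
sample_response = [{"translations": [{"text": "Hola mundo", "to": "es"}]}]
print(translated_text(sample_response))
```

With the requests library, the request can be sent as `requests.post(**request).json()` and the result passed to `translated_text`. Calling this once per extracted line matches the for loop described earlier.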

Azure App Service

Azure App Service serves as the hosting platform for the application's user interface. With this service, I can deploy the application in a convenient and scalable manner and focus on development without managing the underlying infrastructure. This simplifies the deployment process and makes the Streamlit-based application easily accessible to users once it is deployed on Azure.
All of the code, from extraction and translation to the Streamlit web application, is pushed to GitHub.
After the code is pushed successfully, the GitHub project URL is given to Azure App Service and the deployment starts automatically.
Below is the deployment status of the website.
The image below shows the App Service resource in the Azure portal:

Python Package

The project uses the following Python libraries:

azure-cognitiveservices-vision-computervision: Python SDK for Azure Computer Vision.
requests: For making HTTP requests to the Azure Translator API.
streamlit: For creating the user interface and interactive web app.
opencv-python: For capturing and processing images from the webcam.
These packages are listed in requirements.txt.

Usage

Clone the repository:

git clone https://github.com/yourusername/your-repo.git
cd your-repo

Install the required dependencies:

pip install -r requirements.txt

Run the Streamlit app:

streamlit run main_script.py

Steps to Use

First, select an image or PDF to upload by clicking the Browse files button.

After the file is selected, Azure AI Computer Vision processes it and the extracted text is displayed below.

This is the extracted text from the above image:

Then scroll down and select the language you want to translate the text into.
Once a language is selected, the Azure AI Translator API translates the text and displays it below.
In this example I have chosen Arabic:
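The steps above (upload, extract, choose a language, translate) can be summarized in a small orchestration sketch. Everything here is an assumption made for illustration: the `LANGUAGES` mapping, the `run_pipeline` function, and the stand-in `extract` and `translate` callables are hypothetical and do not reproduce main_script.py, where the real calls go to Azure AI Computer Vision and Azure AI Translator.

```python
from typing import Callable

# Display name -> Translator language code (a small illustrative subset).
LANGUAGES = {"Arabic": "ar", "French": "fr", "Spanish": "es"}


def run_pipeline(
    file_bytes: bytes,
    language: str,
    extract: Callable[[bytes], list[str]],
    translate: Callable[[str, str], str],
) -> tuple[list[str], list[str]]:
    """Extract the text lines from an upload, then translate them one by one."""
    to_lang = LANGUAGES[language]
    lines = extract(file_bytes)
    translated = [translate(line, to_lang) for line in lines]
    return lines, translated


# Stand-ins for the OCR and translation calls, so the flow runs offline.
def fake_extract(data: bytes) -> list[str]:
    return data.decode("utf-8").splitlines()


def fake_translate(text: str, to_lang: str) -> str:
    return f"[{to_lang}] {text}"


original, translated = run_pipeline(b"Hello\nWorld", "Arabic", fake_extract, fake_translate)
print(original)
print(translated)
```

Passing the two service calls in as functions keeps the flow easy to test without Azure credentials; in the Streamlit app the uploaded file's bytes and the selected language would be supplied by the upload and selection widgets.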
