ℹ️ Project information
Project Name: LetMeSpeak
Short Project Description: This is an AI-backed, real-time, sign language (ASL) gesture recognition platform.
Team ID: HX018
Team Name: not-a-bot
Team Members:
Tuhin Sarkar
Vatsal Rathod
Sarvesh Shroff
Demo Link: YouTube Video
Repository Link: https://github.com/vatsal-rathod/letmespeak
Labels: Influence the Masses, Enhance the Social Norms, AI/DL, CNN, Image and Video Processing (openCV)
🔥 Your Pitch
According to the World Disability Report cited by the International Journal of Speech-Language Pathology, almost 3 million people in India and 30 million worldwide live with an acute speech or hearing disability. Their day-to-day life includes almost everything everyone else does, with one major exception: communication. The most widely used language among these individuals is American Sign Language (ASL), which uses hand gestures to convey letters, numbers, and whole words. This language, however, is understood by only a minuscule fraction of people without such disabilities, creating a constant communication barrier for the speech- and hearing-impaired.
Therefore, we decided to build a platform that takes in real-time video of ASL hand gestures and converts it into the word(s) it corresponds to. To build it, we focused on four main things:
**Data Creation:** We started by writing a Python script that uses OpenCV, among other libraries, to build a 'hand histogram', which forms a detection boundary around the perimeter of the palm and fingers and helps later with gesture capture. Another Python script takes video gestures as input and outputs a folder of 1,200 image frames per gesture, labeled beforehand. This makes the approach scalable: almost any ASL word can be added without extra effort, although we trained on only 44 words/letters for the scope of the event. A sketch of the histogram step is shown below.
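For illustration, here is a minimal sketch of how an OpenCV hand histogram plus back-projection can isolate the palm and fingers. The channels, bin counts, kernel size, and threshold below are assumptions for the example, not the exact values from our scripts.

```python
import cv2

def build_hand_histogram(roi_bgr):
    """Build a hue-saturation histogram from a small region that contains only skin."""
    hsv = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2HSV)
    # 2-D histogram over hue (0-179) and saturation (0-255)
    hist = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return hist

def segment_hand(frame_bgr, hand_hist):
    """Back-project the histogram onto a full frame to get a binary hand mask."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0, 1], hand_hist, [0, 180, 0, 256], 1)
    # Smooth the back-projection, then threshold it into a mask of the hand region
    disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
    back_proj = cv2.filter2D(back_proj, -1, disc)
    _, mask = cv2.threshold(back_proj, 100, 255, cv2.THRESH_BINARY)
    return mask
```

The mask produced this way is what the capture script can use to crop and save the 1,200 frames per gesture.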
**Data Preparation:** Since we have 1,200 labeled grayscale frames per gesture, we end up with almost a hundred thousand images in total. We augment the data by horizontally flipping each frame. This step is also necessary because the gestures are single-handed, and flipping covers the case of left-handed signers; a minimal example of the augmentation follows below.
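A minimal sketch of the flip augmentation, assuming the frames for one gesture sit in a single directory (the directory layout and filename prefix here are hypothetical):

```python
import os
import cv2

def augment_with_flips(src_dir, dst_dir):
    """Horizontally flip every grayscale frame so left-handed signers are covered too."""
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        img = cv2.imread(os.path.join(src_dir, name), cv2.IMREAD_GRAYSCALE)
        if img is None:
            continue  # skip non-image files
        flipped = cv2.flip(img, 1)  # 1 = flip around the vertical axis
        cv2.imwrite(os.path.join(dst_dir, "flipped_" + name), flipped)
```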
**Model Training:** We train a standard CNN on the data and save the model as the provided .h5 file. We achieved a training accuracy of 98.6% and a validation accuracy of 99%, with test accuracy also reaching around 98%. Note: we deliberately use a rather simple model on a moderately sized dataset to reduce overfitting; a rough outline of such a model is given below.
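The snippet below is only an indicative Keras sketch of a "standard CNN" of this kind; the layer sizes, the 50x50 grayscale input resolution, and the file name `cnn_model.h5` are assumptions for the example, not the exact architecture we trained.

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 44          # words/letters used for the event
IMG_H, IMG_W = 50, 50     # assumed frame size after resizing

def build_model():
    """A deliberately small CNN to keep overfitting in check."""
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(IMG_H, IMG_W, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.4),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# model = build_model()
# model.fit(x_train, y_train, validation_split=0.1, epochs=10)
# model.save("cnn_model.h5")   # HDF5 file reused later for live prediction
```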
**Prediction over live feed:** The last Python script takes in the video input, converts each frame to grayscale, and predicts the gesture with the trained model, printing the word(s) in real time. The loop looks roughly like the sketch below.
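A rough sketch of the live-feed loop; the model file name, the 50x50 input size, and the label list are placeholders and must match whatever was used during training.

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("cnn_model.h5")          # assumed filename from the training step
LABELS = ["hello", "yes", "no"]             # placeholder; the real list holds the 44 trained words/letters

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Convert to grayscale and resize to the network's input resolution
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    roi = cv2.resize(gray, (50, 50)).astype("float32") / 255.0
    probs = model.predict(roi.reshape(1, 50, 50, 1), verbose=0)
    print(LABELS[int(np.argmax(probs))])    # print the predicted word/letter in real time
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
```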
🔦 Any other specific thing you want to highlight?
We chose 'Influence the Masses' because the speech- and hearing-impaired community has long been held back by their disability. This not only changes how these people will communicate in the future but also fosters a more open and accepting perception of this community among the general public, helping them become more 'abled'.