This project is a machine learning solution for sentiment analysis of customer reviews. The goal is to build a model that classifies customer reviews as positive, negative, or neutral based on their text. We will use different NLP models to extract insights from the reviews. The project provides a clean, well-organized codebase, along with detailed documentation and explanations of the NLP techniques used.
- Aaron Antony Noronha 21BAI10296
- Anshu Kushwah 21BAI10141
- Abhit Yadav 21BAI10397
- Shambhavi Pandey 21BAI10453
- Simran Namdev 21BAI10472
Customer review analysis is crucial for any business that wants to understand the sentiment of its customers and improve its products or services. As the volume of customer reviews available online grows, it becomes important to have a tool that can automatically extract useful information from them. This project aims to develop such a tool for analysing a dataset of reviews. The tool will extract information such as customer sentiment, the most frequently mentioned features of the product or service, and areas for improvement. The extracted information will be visualized in an easy-to-understand format so that the business can quickly see where to improve. The tool will also classify each review as positive, negative, or neutral.
Customers leave large numbers of reviews, suggestions, and complaints on a business portal. Reading and understanding all of them takes considerable manual effort, time, and cost. Can we develop a platform that summarises relevant metrics for our business, such as the most recent reviews, overall rating, distribution of sentiments, and trending keywords?
The task is to perform sentiment analysis on customer reviews for a product. The goal is to use the text of the reviews to classify them as positive, negative, or neutral, and to extract insights about the customer sentiment towards the product or service. The performance of the model will be evaluated using standard metrics for text classification.
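The text above does not name the evaluation metrics. As a minimal sketch, assuming they are accuracy plus per-class precision, recall, and F1 (common choices for three-class text classification), the evaluation could look like the following. The function name `evaluate_predictions` and the label strings are hypothetical and not taken from the project code.

```python
from collections import Counter

# Assumed label names; the project may encode the classes differently.
LABELS = ("positive", "negative", "neutral")


def evaluate_predictions(y_true, y_pred, labels=LABELS):
    """Return accuracy plus per-class precision, recall, and F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives, and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    report = {"accuracy": sum(tp.values()) / len(y_true) if y_true else 0.0}
    for label in labels:
        predicted = tp[label] + fp[label]  # times the model predicted this class
        actual = tp[label] + fn[label]     # times this class was the true label
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


if __name__ == "__main__":
    truth = ["positive", "negative", "neutral", "positive"]
    preds = ["positive", "neutral", "neutral", "negative"]
    print(evaluate_predictions(truth, preds))
```

Per-class scores are reported separately because review datasets are often imbalanced, and overall accuracy alone can hide poor performance on the smaller classes.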
The dataset used in this project combines web-scraped data with an existing large dataset of customer reviews for a specific product. The web-scraped data was collected using web scraping techniques and pre-processed to remove irrelevant information. The combined dataset was then converted into a format that can be used directly for training and evaluating the model. The dataset is split into training and test sets in an 80:20 ratio.
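The 80:20 split can be illustrated with a short sketch. It assumes the reviews are held in a list and that a simple shuffled (non-stratified) split is acceptable; the function name `split_reviews` and the fixed seed are hypothetical and not taken from the project code.

```python
import random


def split_reviews(records, test_ratio=0.2, seed=42):
    """Shuffle the records and return (train, test) with the given test share."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    reviews = [f"review {i}" for i in range(100)]
    train, test = split_reviews(reviews)
    print(len(train), len(test))
```

If the sentiment classes are heavily imbalanced, a stratified split that preserves the class proportions in both sets may be a better choice than this plain shuffle.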
In this step we prepare the data for analysis by removing or modifying data that is irrelevant or duplicated.
The preprocessing steps are: cleaning, tokenization, stop word removal (English stop words), lemmatization, and stemming.
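The steps listed above can be sketched end to end using only the Python standard library. This is an illustration under stated assumptions, not the project's implementation: the stop word list, the lemma lookup table, and the suffix-stripping stemmer are tiny toy stand-ins for what an NLP library would provide, and every name here is hypothetical.

```python
import re

# Toy English stop word list (assumption); a real pipeline would use a full list.
STOP_WORDS = {"a", "an", "the", "is", "was", "are", "were", "it", "this",
              "and", "or", "to", "of", "i", "in", "on", "for"}

# Toy lemma lookup (assumption); a real lemmatizer uses a dictionary and POS tags.
LEMMAS = {"bought": "buy", "better": "good", "children": "child"}

# Suffixes removed by the naive stemmer, tried in this order.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def clean(text):
    """Lowercase, then strip HTML tags, URLs, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text on whitespace."""
    return text.split()


def remove_stop_words(tokens):
    return [tok for tok in tokens if tok not in STOP_WORDS]


def lemmatize(tokens):
    """Map each token to its lemma when the toy table knows it."""
    return [LEMMAS.get(tok, tok) for tok in tokens]


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run cleaning, tokenization, stop word removal, lemmatization, stemming."""
    tokens = tokenize(clean(text))
    tokens = remove_stop_words(tokens)
    tokens = lemmatize(tokens)
    return [stem(tok) for tok in tokens]


if __name__ == "__main__":
    sample = "The battery was AMAZING!!! I bought 2 chargers <br> http://example.com"
    print(preprocess(sample))  # ['battery', 'amaz', 'buy', 'charger']
```

Lemmatization and stemming are applied in the order listed above. In practice many pipelines pick only one of the two, since stemming a lemma can reduce it to a non-word (as with `amaz` here).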