Skip to content

jakubpernis/review_wordcloud

Repository files navigation

Trustpilot reviews wordcloud

Small utility for quick glance on what the Trustpilot reviews of some company are about.

Firstly, you need to scrape those reviews from the Trustpilot website (unless you are willing to pay for their business API). After running python scrape_reviews.py you will be prompted about which company's reviews would you like to check (specified by a url of that company on Trustpilot), how many pages of reviews you want to scrape and where to save the data.

Below is an example which I used to get the data which are then used in wordcloud.ipynb Jupyter notebook.

python scrape_reviews.py

Url of the first page of company reviews: https://www.trustpilot.com/review/kiwi.com
Number of pages to scrape: 500
Path where to store the result (csv): data/reviews.csv

In stored data there's a column review_text (among others) which contains the raw text of the reviews. Wordcloud can be constructed directly from the raw text, it's nicely described in this tutorial on Datacamp. However, there's also an option to construct wordcloud using tokens frequencies, which is the approach we chose to use because we can preprocess the raw text more flexibly and in any way we want. For that purpose we defined custom TextProcessor class.

Requires Python >= 3.8.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published