This project processes song lyrics by English-speaking artists.
Libraries: numpy, pandas, BeautifulSoup4, requests, re, nltk, sklearn, scipy, matplotlib
- Datasets
- Machine learning approaches
- The Trainer 1: Fill in the gaps
- The Trainer 2: Find the signal word
- The code collects the library of discographies from https://www.allthelyrics.com/.
- The first problem is clustering the songs of bands that featured Ronnie James Dio. It is solved with scikit-learn K-means clustering in the clustering_of_Dio_bands.ipynb notebook. The result is saved in the dio_clusters.csv file.
- The second problem is building a text-based recommender that suggests songs based on one or a few given songs. It is solved with scikit-learn cosine similarity and the Tf-idf vectorizer in the recommender_lyrics.ipynb notebook.
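The two approaches above can be sketched on a toy corpus. This is only a minimal illustration, not the code from the notebooks: the song titles, the lyric snippets, the `recommend` helper, and the choice of two clusters are all hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus; the real notebooks work on the scraped lyrics.
songs = {
    "Song A": "rainbow in the dark holy diver rainbow",
    "Song B": "holy diver rainbow in the night",
    "Song C": "love me tender love me sweet",
    "Song D": "sweet love of mine tender heart",
}
titles = list(songs)

# One Tf-idf row per song, with English stop words removed.
tfidf = TfidfVectorizer(stop_words="english").fit_transform(list(songs.values()))

# Problem 1: group the songs with K-means (the number of clusters is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(tfidf)


def recommend(title, top_n=2):
    """Problem 2: rank the other songs by cosine similarity to `title`."""
    idx = titles.index(title)
    sims = cosine_similarity(tfidf[idx], tfidf).ravel()
    ranked = sims.argsort()[::-1]  # most similar first
    return [(titles[i], float(sims[i])) for i in ranked if i != idx][:top_n]


for title, label in zip(titles, labels):
    print(title, "-> cluster", label)
print(recommend("Song A", top_n=1))
```

Both steps share the same Tf-idf matrix here only to keep the sketch short; the notebooks may vectorize the lyrics differently for each problem.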
- Clone the project:
git clone https://github.com/am-tropin/english-song-lyrics
- Go to the project directory:
cd english-song-lyrics
- Start the server:
uvicorn main:app --reload
- Enter an artist name from the list above in the terminal, for example:
florence_the_machine
- Copy a song name from the list above in the terminal, for example:
No Light No Light
- Open the following page in a web browser:
http://127.0.0.1:8000/docs/
and use the following box:
- Get Top Recommended Lyrics. Enter the artist name, the song name, and the number of songs to include in the top list.
Or
- Open the following link in a web browser and enter the parameters to get the same information:
http://127.0.0.1:8000/top/_
Or
- Open a link of the following form in a web browser to get the same information as a plain dictionary view:
http://127.0.0.1:8000/top_html/florence_the_machine_No Light No Light_8
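The song name in the link above contains spaces, which a browser encodes automatically. When calling the endpoint from a script, the path has to be percent-encoded by hand. The sketch below assumes the `artist_song_number` path layout shown in the example link; the helper names `build_top_html_url` and `fetch_top_html` are hypothetical and not part of the project.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # local uvicorn server from the steps above


def build_top_html_url(artist, song, top_n, base_url=BASE_URL):
    """Join the parameters with underscores and percent-encode the result."""
    path = f"{artist}_{song}_{top_n}"
    return f"{base_url}/top_html/{quote(path)}"


def fetch_top_html(artist, song, top_n):
    """Request the page; this only works while the server is running."""
    with urlopen(build_top_html_url(artist, song, top_n)) as response:
        return response.read().decode("utf-8")


url = build_top_html_url("florence_the_machine", "No Light No Light", 8)
print(url)
```

Calling `fetch_top_html("florence_the_machine", "No Light No Light", 8)` should then return the same page that the browser shows, provided the server has been started.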
- The listening trainer. Choose an artist, one of their songs, and how frequently random words are replaced with gaps. Open the lyrics with gaps (in a file with a name like "lyrics_with_gaps_###.txt") in a new window. While listening to the song, fill in the gaps in the trainer and get your score!
- The grammar trainer. It finds every use of a signal word in the lyrics of the song collection. Set the signal word in the trainer and see how it is used correctly!
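The core idea of both trainers can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's implementation: the function names `make_gaps`, `score`, and `find_signal_word`, the underscore gap format, and the case-insensitive scoring rule are all hypothetical.

```python
import random
import re


def make_gaps(lyrics, frequency=0.2, seed=None):
    """Hide each word with probability `frequency`; return (text, hidden words)."""
    rng = random.Random(seed)
    answers = []

    def maybe_hide(match):
        word = match.group(0)
        if rng.random() < frequency:
            answers.append(word)
            return "_" * len(word)  # the gap keeps the word length as a hint
        return word

    gapped = re.sub(r"[A-Za-z']+", maybe_hide, lyrics)
    return gapped, answers


def score(guesses, answers):
    """Count guesses that match the hidden words, ignoring case."""
    return sum(g.strip().lower() == a.lower() for g, a in zip(guesses, answers))


def find_signal_word(signal_word, songs):
    """Return (title, line) for every lyric line that contains the whole word."""
    pattern = re.compile(rf"\b{re.escape(signal_word)}\b", re.IGNORECASE)
    return [
        (title, line.strip())
        for title, text in songs.items()
        for line in text.splitlines()
        if pattern.search(line)
    ]


gapped, answers = make_gaps("no light no light in your bright blue eyes", 0.3, seed=1)
print(gapped)
print(find_signal_word("since", {"Demo": "Since you left\nI am sincere"}))
```

The word-boundary pattern keeps the grammar search from matching longer words, so "since" does not match "sincere".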