This repository contains code and resources for Natural Language Processing (NLP) analysis of Indian budget speeches delivered by various finance ministers. The project uses spaCy, Gensim, and Plotly to uncover insights from these historical speeches. An alternative approach using R has also been explored; you can find it in the second branch.
Topic Modeling (LDA) to identify recurring themes across years and finance ministers. Sentiment Analysis to analyze the emotional tone of speeches over time. Named Entity Recognition (NER) to extract important entities such as organizations, financial terms, and policies. Visualization Dashboards built with Dash to explore changing themes and sentiments over the years.
Preprocessing of speeches with spaCy (tokenization, lemmatization, custom stopwords). Topic Modeling using Latent Dirichlet Allocation (LDA). Sentiment Analysis with VADER and Transformer-based models. Interactive visualizations using Plotly and pyLDAvis. A web-based NLP dashboard built with Dash for real-time exploration of topics and sentiment trends.
Compare budget speech themes across finance ministers. Track the evolution of sentiment in budget speeches over time. Discover hidden topics and trends in government financial policies.
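One way to track sentiment over time, sketched here with only the Python standard library, is to average per-passage sentiment scores by year or by minister. The records, the minister labels, and the helper `mean_by` are hypothetical and purely illustrative; the scores are invented placeholders in the [-1, 1] range used by VADER's compound score, not results from this project.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records: (year, minister label, sentiment score in [-1, 1]).
# These values are invented for illustration only.
records = [
    (2001, "Minister A", 0.20),
    (2001, "Minister A", 0.40),
    (2002, "Minister B", -0.10),
    (2002, "Minister B", 0.30),
]


def mean_by(rows, key_index):
    """Average the score (third field) of rows grouped by the field at key_index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {key: mean(scores) for key, scores in sorted(groups.items())}


by_year = mean_by(records, 0)       # average sentiment per year
by_minister = mean_by(records, 1)   # average sentiment per minister

print(by_year)
print(by_minister)
```

The resulting dictionaries can be passed directly to a plotting library such as Plotly to draw a trend line or a per-minister comparison.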