Feature Request: Topic Modeling - Implement Latent Dirichlet Allocation (LDA) and NMF #317

HsiangNianian · 2024-11-17T14:52:31Z

Topic modeling is an unsupervised learning task that discovers abstract topics in a collection of documents. This issue proposes implementing two algorithms, Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), to extract topics from large text corpora.

Algorithm Choice: Should we implement both LDA and NMF for comparison, or focus on one?
Data Handling: How should the text be preprocessed (e.g., stop-word removal, TF-IDF weighting)?
Evaluation: How should topic coherence and interpretability be evaluated?
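
As a starting point for the data-handling and evaluation questions above, here is a minimal sketch using only the Python standard library. The function names (`tokenize`, `tfidf`, `umass_coherence`), the tiny stop-word list, and the sample documents are all hypothetical and not part of any existing API in this project. The TF-IDF uses a smoothed IDF, and coherence is computed with the UMass measure, which is only one of several possible coherence metrics.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list only (assumption); a real list would be far longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf(docs_tokens):
    """Return (vocab, rows): one row of smoothed TF-IDF weights per document."""
    vocab = sorted({t for doc in docs_tokens for t in doc})
    n_docs = len(docs_tokens)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in docs_tokens for t in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for doc in docs_tokens:
        tf = Counter(doc)
        rows.append([tf[t] * idf[t] for t in vocab])
    return vocab, rows


def umass_coherence(top_words, docs_tokens):
    """UMass coherence: sum over word pairs of log((D(w_i, w_j) + 1) / D(w_j)).

    top_words should be ordered from most to least probable, and every word
    must occur in at least one document (otherwise D(w_j) would be zero).
    """
    doc_sets = [set(doc) for doc in docs_tokens]

    def doc_count(*words):
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            co_occurrences = doc_count(top_words[i], top_words[j])
            score += math.log((co_occurrences + 1) / doc_count(top_words[j]))
    return score


# Hypothetical sample corpus for illustration.
docs = [
    "The cat sat on the mat",
    "A cat and a dog",
    "Stocks fell in the market",
    "The market is for stocks",
]
tokens = [tokenize(d) for d in docs]
vocab, rows = tfidf(tokens)
print(vocab)
print(umass_coherence(["market", "stocks"], tokens))
```

Higher (less negative) UMass scores indicate that a topic's top words tend to appear in the same documents, so the score can be used to compare topics produced by LDA and NMF on the same corpus.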

Expected Outcome

Working implementations of LDA and NMF that can extract topics from a collection of documents.
Examples and usage guidelines for analyzing text corpora.
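
To make the expected outcome more concrete, the following is a rough sketch of both algorithms in dependency-free Python. It is not the project's implementation: the names `nmf`, `lda_gibbs`, and `frobenius_error`, the default hyperparameters, and the toy inputs are assumptions for illustration. NMF here uses the Lee-Seung multiplicative updates for the squared-error objective, and LDA uses collapsed Gibbs sampling; a production version would more likely rely on a numerical library and might choose variational inference instead.

```python
import random


def _matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def _transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative docs-by-terms matrix V into W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = _transpose(W)
        num, den = _matmul(Wt, V), _matmul(_matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = _transpose(H)
        num, den = _matmul(V, Ht), _matmul(W, _matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return W, H


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = _matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))


def lda_gibbs(docs, k, vocab_size, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of integer word ids.

    Returns (ndk, nkw): document-topic counts and topic-word counts.
    """
    rng = random.Random(seed)
    ndk = [[0] * k for _ in docs]
    nkw = [[0] * vocab_size for _ in range(k)]
    nk = [0] * k
    z = []  # topic assignment for every token
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            t = rng.randrange(k)
            assignments.append(t)
            ndk[d][t] += 1
            nkw[t][w] += 1
            nk[t] += 1
        z.append(assignments)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                t = z[d][i]
                ndk[d][t] -= 1
                nkw[t][w] -= 1
                nk[t] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (ndk[d][j] + alpha) * (nkw[j][w] + beta) / (nk[j] + vocab_size * beta)
                    for j in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                z[d][i] = t
                ndk[d][t] += 1
                nkw[t][w] += 1
                nk[t] += 1
    return ndk, nkw
```

For NMF, each row of `H` can be read as a topic (the largest entries mark its top terms) and each row of `W` as a document's topic weights. For LDA, normalizing the rows of `nkw` and `ndk` (after adding `beta` and `alpha` respectively) gives estimates of the topic-word and document-topic distributions.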

HsiangNianian added nlp feature topic-modeling labels Nov 17, 2024

HsiangNianian self-assigned this Nov 17, 2024

HsiangNianian added this to Development Nov 17, 2024
