Conversation Summarization Model with Google Pegasus

[Figure: model pipeline]

Description

This Jupyter notebook showcases a conversation summarization model built with Google Pegasus from the Hugging Face Transformers library. The model is fine-tuned on the SAMSum dataset, which pairs chat dialogues with human-written summaries. The goal is to generate concise summaries of chat conversations.
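
At its core, the notebook loads a Pegasus checkpoint and the SAMSum dataset. A minimal sketch of that setup follows; google/pegasus-cnn_dailymail is an assumed starting checkpoint, and the notebook may use a different one:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from datasets import load_dataset

model_ckpt = "google/pegasus-cnn_dailymail"   # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
model = AutoModelForSeq2SeqLM.from_pretrained(model_ckpt)

samsum = load_dataset("samsum")         # may additionally require: pip install py7zr
print(samsum["train"][0]["dialogue"])   # a raw chat transcript
print(samsum["train"][0]["summary"])    # its reference summary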

Requirements

To run this notebook, you need the following dependencies:

  • Python 3.x
  • Hugging Face Transformers
  • Datasets library
  • Matplotlib
  • Pandas
  • NLTK
  • Tqdm

You can install the required packages using pip:

pip install transformers datasets matplotlib pandas nltk tqdm

How to Use

  1. Install the required dependencies as described above.
  2. Clone this repository to your local machine:

git clone https://github.com/your-username/your-repo.git
cd your-repo

  3. Open the Jupyter notebook Conversation_Summarization_Model_Google_Pegasus.ipynb in your Jupyter environment.
  4. Make sure you have access to a GPU if you want to leverage GPU acceleration during model training and inference.
  5. Run the notebook cells in sequential order. The notebook walks you through the following steps (sketched in code after this list):

  • Loading the SAMSum dataset.
  • Setting up the Google Pegasus model and tokenizer.
  • Preprocessing the data and converting it into input encodings.
  • Fine-tuning the model using the Trainer class from the Transformers library.
  • Evaluating the model on the validation dataset and tracking the training progress.
  • Generating summaries for sample dialogues and displaying the results.
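
The preprocessing step tokenizes dialogues as model inputs and summaries as labels. A hedged sketch, assuming the tokenizer and samsum objects from the setup above; the function name and length limits are illustrative, not necessarily what the notebook uses:

def preprocess(batch):
    # Tokenize dialogues as encoder inputs.
    model_inputs = tokenizer(batch["dialogue"], max_length=1024, truncation=True)
    # Tokenize summaries as labels (text_target= needs transformers >= 4.22).
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

samsum_pt = samsum.map(preprocess, batched=True,
                       remove_columns=["id", "dialogue", "summary"])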
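
Fine-tuning with the Trainer class might then look like this; every hyperparameter below is an assumption, not the notebook's exact setting:

from transformers import Trainer, TrainingArguments, DataCollatorForSeq2Seq

data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

training_args = TrainingArguments(
    output_dir="pegasus-samsum",        # assumed output directory
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,     # effective batch size of 16
    evaluation_strategy="steps",        # newer transformers versions: eval_strategy
    eval_steps=500,
    logging_steps=10,
    warmup_steps=500,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=samsum_pt["train"],
    eval_dataset=samsum_pt["validation"],
)
trainer.train()
trainer.save_model("pegasus-samsum")    # persist the fine-tuned weights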
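
Finally, generating a summary for a sample dialogue; the beam-search settings are illustrative:

sample = samsum["test"][0]["dialogue"]
inputs = tokenizer(sample, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    num_beams=8,
    length_penalty=0.8,    # favour somewhat shorter outputs
    max_length=128,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))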

Dataset and Model

The SAMSum dataset contains chat conversations and their corresponding summaries, making it suitable for training a conversation summarization model. We used the Google Pegasus model, a state-of-the-art transformer-based model for sequence-to-sequence tasks, and fine-tuned it on the SAMSum dataset.
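
For orientation, the splits and columns of the dataset can be inspected directly, assuming samsum was loaded as in the earlier sketch:

print(samsum)                        # DatasetDict with train/validation/test splits
print(samsum["train"].column_names)  # ['id', 'dialogue', 'summary']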

Performance Metrics

During training, the Trainer's evaluation strategy was used to compute the validation loss at regular intervals. Additionally, the ROUGE metric was used to compare the generated summaries against the ground-truth summaries on the test set.
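
A sketch of the ROUGE computation using the Hugging Face evaluate library, which is an extra dependency (pip install evaluate rouge_score); the notebook may compute ROUGE differently:

import evaluate

rouge = evaluate.load("rouge")
predictions = ["Amanda baked cookies and brings Jerry some tomorrow."]
references = ["Amanda baked cookies and will bring Jerry some tomorrow."]
print(rouge.compute(predictions=predictions, references=references))
# -> dict with rouge1, rouge2, rougeL, rougeLsum scores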

Results

After training, the model should be capable of generating informative and concise summaries for chat conversations. The notebook will demonstrate the model's performance on sample dialogues and their generated summaries.
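
As a quick check, the fine-tuned checkpoint can be exercised through the summarization pipeline; "pegasus-samsum" is the assumed save directory from the training sketch above:

from transformers import pipeline

summarizer = pipeline("summarization", model="pegasus-samsum", tokenizer=tokenizer)
dialogue = samsum["test"][0]["dialogue"]
print(summarizer(dialogue, num_beams=8, max_length=128)[0]["summary_text"])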

Acknowledgments

I would like to thank Hugging Face for providing the Transformers library, which made it easy to access and fine-tune state-of-the-art language models like Google Pegasus.
