TransEmoC: A Transformer-Based Unified Toolkit for Emotion Classification in a Low Resource Language

This git repo is a part of the Project Emotion Classification in a Resource Constrained Language. The proposed model described in the corresponding paper is wrapped up in this git project for reuse by developers and researchers.

Find the reproducible capsule here.

Usage

Clone the github repository

git clone https://github.com/avishek-018/TransEmoC.git

Change directory

cd TransEmoC

If you are using colab ignore steps above and proceed with the following code snippet

!git clone https://github.com/avishek-018/TransEmoC.git
!mv  -v TransEmoC/* /content
!rm -rf TransEmoC

Install the requirements

pip install -r requirements.txt

Run train.py with you data and specify the hyperparameter(optional)

python train.py  --train <"path to train data(xlsx/csv)"> --val <"path to val data(xlsx/csv)"> --text_col <"text data column name"> --model <"model name"> --label_col  <"class/label column name"> --epochs <num of epochs> --batch_size <batch size> --lr <learning rate> --maxlen <max text length>

Required arguments:

--train
--val
--text_col
--label_col

And the optional arguments are (otherwise proposed hyperpameters are used)

--model
--epochs
--batch_size
--lr
--maxlen

Exmaple for this repository:

python train.py  --train "data/train.xlsx" --val "data/val.xlsx" --text_col "cleaned" --label_col "classes" --model "xlm-roberta-base" --epochs 20 --batch_size 12 --lr 2e-5 --maxlen 70

Run evaluate.py to evaluate your model with test data

python evaluate.py --test <"path to test data(xlsx/csv)"> --text_col <"text data column name"> --label_col "classes" --model <"/path/to/model">

Required arguments:

--test
--text_col
--label_col

And the optional arguments are (otherwise default model location used)

--model

Exmaple for this repository:

python evaluate.py --test "data/test.xlsx" --text_col "cleaned" --label_col "classes" --model "model"

Output

A result.txt file is generated

Run predict.py to get the prediction for a text

python predict.py --sentence "your text goes here" --model <"/path/to/model">

Exmaple for this repository:

python predict.py --sentence "ছেলেটা তিন দিন ধরে না খেয়ে আছে" --model "model"

Output:

Cite this work

If you find this repository helpful in your work please cite the following

@inproceedings{das-etal-2021-emotion,
    title = "Emotion Classification in a Resource Constrained Language Using Transformer-based Approach",
    author = "Das, Avishek  and
      Sharif, Omar  and
      Hoque, Mohammed Moshiul  and
      Sarker, Iqbal H.",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-srw.19",
    pages = "150--158",
    abstract = "Although research on emotion classification has significantly progressed in high-resource languages, it is still infancy for resource-constrained languages like Bengali. However, unavailability of necessary language processing tools and deficiency of benchmark corpora makes the emotion classification task in Bengali more challenging and complicated. This work proposes a transformer-based technique to classify the Bengali text into one of the six basic emotions: anger, fear, disgust, sadness, joy, and surprise. A Bengali emotion corpus consists of 6243 texts is developed for the classification task. Experimentation carried out using various machine learning (LR, RF, MNB, SVM), deep neural networks (CNN, BiLSTM, CNN+BiLSTM) and transformer (Bangla-BERT, m-BERT, XLM-R) based approaches. Experimental outcomes indicate that XLM-R outdoes all other techniques by achieving the highest weighted f{\_}1-score of 69.73{\%} on the test data.",
}

Note

If you find any anomaly or have any query/suggestion feel free to ping. Pull requests are welcome.

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
data		data
images		images
model		model
.gitattributes		.gitattributes
LICENSE		LICENSE
README.md		README.md
evaluate.py		evaluate.py
predict.py		predict.py
requirements.txt		requirements.txt
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TransEmoC: A Transformer-Based Unified Toolkit for Emotion Classification in a Low Resource Language

Usage

If you are using colab ignore steps above and proceed with the following code snippet

Cite this work

Note

About

Releases 1

Packages

Languages

License

avishek-018/TransEmoC

Folders and files

Latest commit

History

Repository files navigation

TransEmoC: A Transformer-Based Unified Toolkit for Emotion Classification in a Low Resource Language

Usage

If you are using colab ignore steps above and proceed with the following code snippet

Cite this work

Note

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages