Text classification with a range of advanced modules and recent models, such as LEAM, hard attention, and multi-head attention.
Includes RNN and CNN encoders, soft attention, multi-head attention, and multi-head pooling, which can be combined across different encoders, attention mechanisms, and aggregators. BERT lives in a separate repository, yyht/BERT, covering applications such as QA, MRC, SST, classification, knowledge distillation, and LM adaptation plus fine-tuning.
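
As a rough illustration of what a soft-attention aggregator does on top of an encoder, here is a minimal sketch in plain Python. It is not taken from this repository: the function names `softmax` and `soft_attention_pool`, the dot-product scoring, and the fixed `query` vector are all assumptions made for the example (in a real model the query would be learned).

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention_pool(hidden_states, query):
    """Pool a sequence of token vectors into a single vector (hypothetical helper).

    hidden_states: list of token vectors, e.g. RNN or CNN encoder outputs.
    query: context vector of the same dimension (assumed fixed here).
    """
    # Score each token by its dot product with the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Weighted sum of the token vectors, one dimension at a time.
    dim = len(query)
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights

# Toy input: three tokens with 2-dimensional encoder outputs.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
pooled, weights = soft_attention_pool(hidden_states, query)
```

The pooled vector would then be fed to a classifier layer. A multi-head variant would, under the same assumptions, repeat this with several query vectors and concatenate the pooled results.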