Questions about SkipBERT #1

Open
jonsaadfalcon opened this issue Aug 3, 2022 · 2 comments

@jonsaadfalcon

Hello, I was interested in experimenting with the SkipBERT architecture. I was wondering if you could help with a few questions about implementation:

  • In the example in the README, it seems like an input needs to be run at inference and then run again before we see the speed-up. Assuming the model checkpoint is a SkipBERT model that has already been fine-tuned on the training set, could we still see the same improvement in inference time without running that specific input twice?
  • Is there any script or notebook for training our own SkipBERT models?
  • The second model checkpoint on the GitHub page, SkipBERT6+4, does not seem to be working. I was wondering if the checkpoint was available for use.

Thank you for the help!

@LorrinWWW
Owner

Thanks for your interest in our work!

> In the example in the README, it seems like an input needs to be run at inference and then run again before we see the speed-up. Assuming the model checkpoint is a SkipBERT model that has already been fine-tuned on the training set, could we still see the same improvement in inference time without running that specific input twice?

A: The input tri-grams need to be cached before you see the speed-up when config.plot_mode = 'plot_passive'.
If we allow some tri-grams to be OOV, we can set config.plot_mode = 'plot_only', in which case we see a constant speed-up (though accuracy will suffer if the OOV rate is too high).

Regarding 'plot_mode':

  • force_compute: computes tri-grams on demand (usually used for training).
  • update_all: computes tri-grams, bi-grams, and uni-grams and writes them to PLOT (usually used to update PLOT).
  • plot_passive: uses PLOT if there is no OOV; otherwise it uses the GPU to compute tri-grams and writes them to PLOT.
  • plot_only: uses PLOT only. Lookup order: trigram -> bigram -> unigram -> 0 (see the sketch below).
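To make the lookup order concrete, here is a minimal sketch; the dictionary-based plot store, the bi-gram/uni-gram key layout, and the hidden size 768 are assumptions for illustration, not the repo's actual PLOT implementation:

```python
import torch

HIDDEN_SIZE = 768  # assumed hidden size, for illustration only

def lookup_plot_only(plot: dict, left: int, center: int, right: int) -> torch.Tensor:
    """Fallback lookup for 'plot_only' mode: trigram -> bigram -> unigram -> 0.

    `plot` is a hypothetical store mapping n-gram token-id tuples to
    precomputed hidden states; the repo's actual PLOT format may differ.
    """
    for key in (
        (left, center, right),  # full tri-gram
        (center, right),        # bi-gram fallback (assumed key layout)
        (center,),              # uni-gram fallback
    ):
        if key in plot:
            return plot[key]
    return torch.zeros(HIDDEN_SIZE)  # OOV at every level: fall back to zeros
```

In 'plot_passive' mode, the final fallback would instead compute the missing tri-gram on the GPU and write the result back to PLOT.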

> Is there any script or notebook for training our own SkipBERT models?

A: The code for training SkipBERT is under general_distillation.

We use distillation to train SkipBERT, but it should also be feasible to train with an MLM objective or another pretraining scheme.
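For reference, a generic two-part distillation objective (MSE on hidden states plus temperature-scaled KL on logits, in the TinyBERT style) looks roughly like the sketch below; this is an illustrative assumption, not necessarily the exact loss used in general_distillation:

```python
import torch.nn.functional as F

def distill_loss(student_hiddens, teacher_hiddens,
                 student_logits, teacher_logits, T: float = 1.0):
    """Generic distillation loss sketch: hidden-state MSE plus
    temperature-scaled soft cross-entropy on the logits.

    Assumes the student/teacher hidden states are already
    dimension-matched (e.g. via a learned projection, omitted here).
    """
    hidden_loss = sum(F.mse_loss(s, t)
                      for s, t in zip(student_hiddens, teacher_hiddens))
    logit_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction='batchmean',
    ) * (T * T)
    return hidden_loss + logit_loss
```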

> The second model checkpoint on the GitHub page, SkipBERT6+4, does not seem to be working. I was wondering if the checkpoint was available for use.

A: We are sorry for the mistake. It was supposed to have been released earlier. We will upload it again soon.

@jonsaadfalcon
Author

Thank you for the information! Regarding training our own SkipBERT models, I'm assuming we are supposed to use the run_train.sh script. I tried configuring it to run training, but I'm having some issues with the dependencies and setup on a single-GPU system. I was wondering if there are any example instructions for getting started.

Additionally, is it possible to calculate the OOV rate at inference and see what percentage of the tri-grams encountered are OOV?
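For concreteness, I'm imagining a counter along these lines, where plot_keys is a hypothetical handle to the set of cached tri-gram keys:

```python
def trigram_oov_rate(token_ids: list, plot_keys: set) -> float:
    """Fraction of input tri-grams missing from the PLOT cache.

    `plot_keys` is a hypothetical set of cached tri-gram id tuples;
    I don't know how the cache keys are actually exposed in the repo.
    """
    trigrams = [tuple(token_ids[i:i + 3]) for i in range(len(token_ids) - 2)]
    if not trigrams:
        return 0.0
    misses = sum(1 for tri in trigrams if tri not in plot_keys)
    return misses / len(trigrams)
```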
