hi, there was no problem when I trained tta with the bert-base config (for English).
did you try training tta for English with the bert-base config and get the same problem?
anyway, for a different language with a different vocabulary, you need to modify one line in modeling.py.
at line 161 of that file, the hard-coded 4 is the dummy token id of [MASK].
so if you change the vocabulary, you should replace this number with the id of "[MASK]" in your vocabulary, so that dummy_ids is built from the right token.
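for illustration, the edit amounts to something like this (a minimal sketch only; `input_ids` is a made-up example here, and the actual variable names around line 161 of modeling.py may differ):

```python
import tensorflow as tf

MASK_TOKEN_ID = 4  # id of "[MASK]" in the original vocab; change this to match your own vocab

input_ids = tf.constant([[2, 1037, 2051, 3]])       # made-up batch of token ids, for illustration
dummy_ids = tf.ones_like(input_ids) * MASK_TOKEN_ID  # dummy [MASK] ids, same shape as input_ids
```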
lastly, in my experience, going over the data (pre)processing once more often helps.
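for example, a quick check like this can catch vocabulary mismatches (the ids are made up, and it assumes the BERT-style tokenization.py that the code builds on is available):

```python
from tokenization import FullTokenizer  # BERT-style tokenizer, assumed to ship with the repo

tokenizer = FullTokenizer(vocab_file="vocab.txt")

# look up the id of [MASK] to plug into modeling.py (see the snippet above)
print(tokenizer.convert_tokens_to_ids(["[MASK]"]))

# decode a few ids taken from one of your pre-training records and eyeball the pieces
example_ids = [2, 4, 872, 3]  # made-up ids, for illustration
print(tokenizer.convert_ids_to_tokens(example_ids))
```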
if you have any further problems, please feel free to ask me again!
thanks for your help. I had made some mistakes with the hyperparameters; after fixing them, training runs normally.
Since you have done some experiments with the bert-base config, I am wondering whether tta could achieve better results on English data such as GLUE, and on sentence reranking for NMT and ASR?
unfortunately, I haven't tested it on any specific tasks yet. due to the lack of computing resources in my lab, I had to use a much smaller batch size (less than 10, I think) for pre-training tta with the bert-base config. so I tried training tta with that config, but never completed it.
hi, I am trying to train tta with the bert-base config for Chinese, but the loss goes to NaN after about 1000 optimization steps. I am wondering if you could give me some advice.