When training tta with bert-base config and sequence length 512, got NaN #5

Open
yyht opened this issue Jan 13, 2021 · 3 comments
yyht commented Jan 13, 2021

hi, I am trying to train bert-base using tta for Chinese, and it got NaN after 1000 optimization steps. I am wondering if you could give me some advice.

joongbo (Owner) commented Jan 14, 2021

hi, there was no problem when I tried to train tta with the bert-base config (for English).
did you try to train tta for English with the bert-base config and get the same problem?

anyway, for a different language with a different vocabulary, you should modify one line in modeling.py.
at line 161 in that file, 4 is the dummy token id of [MASK].
so if you change the vocabulary, you should match this number with the id of "[MASK]" in your vocabulary for dummy_ids.
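for example, a rough sketch of how you might look up that id from your vocab file (the `find_mask_id` helper and the `vocab.txt` path are just placeholders, not code from this repository):

```python
# rough sketch: find the id of "[MASK]" in a WordPiece vocab.txt
# (one token per line, line number = token id), so the hard-coded 4
# used for dummy_ids at modeling.py line 161 can be replaced when
# switching to a different vocabulary.
def find_mask_id(vocab_file, mask_token="[MASK]"):
    with open(vocab_file, encoding="utf-8") as f:
        for token_id, line in enumerate(f):
            if line.rstrip("\n") == mask_token:
                return token_id
    raise ValueError(f"{mask_token} not found in {vocab_file}")

mask_id = find_mask_id("vocab.txt")
print(mask_id)  # e.g. 103 in Google's Chinese BERT vocab
```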

lastly, in my experience, examining data (pre)processing again would be helpful.

if you have any further problems, please feel free to ask me again!

thanks :)

yyht (Author) commented Jan 14, 2021

thanks for your help. I made some mistakes with the hyperparameters, and it now runs normally.
Since you have done some experiments with the bert-base config, I am wondering whether tta could achieve better results on English data such as GLUE, and on sentence reranking for NMT and ASR?

joongbo (Owner) commented Jan 14, 2021

unfortunately, it has not been tested on any specific tasks yet. due to the lack of computing resources in my lab, I had to use a much smaller batch size (less than 10, I think) for pre-training tta with the bert-base config. so I tried, but did not complete, training tta with that config.
