What is affected by this issue:
Because of this piece of code, the model will keep training on step 0 if `AdamWeightDecayOptimizer` is used with a TPU.
Made changes to the code here.
This was done because, when the code reaches the said line, the optimizer is not necessarily an `AdamWeightDecayOptimizer` object; it can be a `CrossShardOptimizer` object.
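For context, here is a minimal sketch of the pattern being described. It is reconstructed from the description above (TF1-style TPU training, as in the repo's `model_utils.py`), not copied from the repo, and names like `clipped_grads`, `variables`, and `learning_rate` are placeholders:

```python
import tensorflow as tf  # TF 1.x

# Optimizer selection is keyed off the weight-decay flag.
if FLAGS.weight_decay == 0:
  optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
else:
  optimizer = AdamWeightDecayOptimizer(
      learning_rate=learning_rate,
      weight_decay_rate=FLAGS.weight_decay)

if FLAGS.use_tpu:
  # Wrapping changes the object's type: `optimizer` is now a
  # CrossShardOptimizer regardless of what it wraps.
  optimizer = tf.contrib.tpu.CrossShardOptimizer(optimizer)

train_op = optimizer.apply_gradients(
    zip(clipped_grads, variables), global_step=global_step)

# AdamWeightDecayOptimizer does not increment `global_step` itself, so it
# is bumped manually -- but on TPU this isinstance check is always False,
# and `global_step` stays at 0.
if isinstance(optimizer, AdamWeightDecayOptimizer):
  new_global_step = global_step + 1
  train_op = tf.group(train_op, [global_step.assign(new_global_step)])
```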
Thus, by instead checking whether the `weight_decay` flag is non-zero, we can conclusively tell whether the optimizer uses Adam weight decay.
Note: `if FLAGS.weight_decay != 0` could be replaced with a truthiness check (`if FLAGS.weight_decay:`), but that would misbehave if someone passed `None` by mistake, since `None != 0` is `True` while `None` is falsy.
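Hypothetically, the fixed branch could look like this; this is a sketch of the idea, not necessarily the exact diff in the PR:

```python
# Key the manual global_step increment off the flag rather than the
# (possibly wrapped) optimizer's type; the flag survives the TPU wrapping.
if FLAGS.weight_decay != 0:
  new_global_step = global_step + 1
  train_op = tf.group(train_op, [global_step.assign(new_global_step)])
```

This keys the behavior on configuration rather than on the runtime type, which stays correct whether or not the optimizer has been wrapped.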
Please let me know if any other changes are required.
@zihangdai @kimiyoung
I have raised a PR for this: #146