Hi! I am training a language model similar to the one in the Sparse Text Generation project, with a custom input format. When training starts, the entmax loss cannot be computed.
My inputs and labels both have shape (batch_size, seq_len) before they go to the loss; inside the loss they are reshaped to (batch_size*seq_len, vocab_size) and (batch_size*seq_len,) respectively. I mask positions in the labels with -1, and even though I set ignore_index=-1, my log is:
Traceback (most recent call last):
  File "run_lm_finetuning.py", line 782, in <module>
    main()
  File "run_lm_finetuning.py", line 736, in main
    global_step, tr_loss = train(args, train_dataset, model, tokenizer, gen_func)
  File "run_lm_finetuning.py", line 300, in train
    outputs = model(inputs, labels=labels)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 880, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/app/src/pytorch_transformers/modeling_gpt2.py", line 607, in forward
    loss = self.loss(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 880, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/entmax/losses.py", line 17, in forward
    loss = self.loss(X, target)
  File "/usr/local/lib/python3.6/dist-packages/entmax/losses.py", line 278, in loss
    return entmax_bisect_loss(X, target, self.alpha, self.n_iter)
  File "/usr/local/lib/python3.6/dist-packages/entmax/losses.py", line 242, in entmax_bisect_loss
    return EntmaxBisectLossFunction.apply(X, target, alpha, n_iter)
  File "/usr/local/lib/python3.6/dist-packages/entmax/losses.py", line 129, in forward
    ctx, X, target, alpha, proj_args=dict(n_iter=n_iter)
  File "/usr/local/lib/python3.6/dist-packages/entmax/losses.py", line 45, in forward
    p_star.scatter_add_(1, target.unsqueeze(1), torch.full_like(p_star, -1))
RuntimeError: index -1 is out of bounds for dimension 1 with size 50257
How to fix this?
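If it helps, the failure seems reproducible in isolation with just the scatter_add_ call shown in the last frame of the traceback (a minimal sketch, assuming that call is where the -1 label ends up being used directly as an index):

```python
import torch

# Minimal sketch reproducing the error outside the model, assuming the
# scatter_add_ call from entmax/losses.py line 45 in the traceback is the
# failing one. A -1 in target is used directly as an index into the vocab
# dimension, which is out of bounds.
vocab_size = 50257
p_star = torch.zeros(2, vocab_size)
target = torch.tensor([5, -1])  # second position is "masked" with -1

p_star.scatter_add_(1, target.unsqueeze(1), torch.full_like(p_star, -1.0))
# RuntimeError: index -1 is out of bounds for dimension 1 with size 50257
```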
UPD:
I realized that the problem is not connected with ignore_index, but with a shape mismatch between target and p_star in the forward method of the _GenericLossFunction class. I still don't know how to fix this bug, so please help if anybody knows how :)
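In case it is useful to anyone hitting the same thing: the workaround I am trying for now is to drop the masked positions myself before the loss is called, so -1 never reaches the internal scatter_add_. This is just a sketch under my assumptions, not a confirmed fix; masked_entmax_loss is a helper name I made up, and I assume the entmax loss objects are called as loss_fn(X, target):

```python
import torch

def masked_entmax_loss(loss_fn, logits, labels, ignore_index=-1):
    """Hypothetical helper (my own name, not part of entmax): filter out
    positions whose label equals ignore_index before calling the loss,
    so no negative index ever reaches scatter_add_."""
    keep = labels != ignore_index          # boolean mask of real targets
    return loss_fn(logits[keep], labels[keep])

# Shape check with dummy tensors (loss_fn would be the configured entmax loss):
logits = torch.randn(4, 50257)             # (batch_size*seq_len, vocab_size)
labels = torch.tensor([5, -1, 42, -1])     # -1 marks masked positions
keep = labels != -1                        # same filtering the helper does
assert logits[keep].shape == (2, 50257) and labels[keep].shape == (2,)
```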