I would like to adapt this library to work with user-contributed multilingual models from the `transformers` library. I tried to add another model class in a fork to handle `AutoModelWithLMHead` models here: https://github.com/smeylan/lm-scorer/blob/master/lm_scorer/models/automodel.py, just substituting the transformer model class (`GPT2LMHeadModel` -> `AutoModelWithLMHead`), roughly along the lines of the sketch below.
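For context, the substitution amounts to something like the following (a minimal sketch, not the fork itself; `some-user/french-gpt2` is a placeholder checkpoint name, and the pad-token handling mirrors what the library's GPT-2 wrapper does):

```python
from transformers import AutoModelWithLMHead, AutoTokenizer

# Hypothetical checkpoint name; any user-contributed causal LM from the
# model hub would take its place.
model_name = "some-user/french-gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
# Like the GPT-2 wrapper in this library, add a dedicated pad token...
tokenizer.add_special_tokens({"pad_token": "<|pad|>"})

model = AutoModelWithLMHead.from_pretrained(model_name)
# ...and grow the embedding/output layers so the new pad id is in range.
model.resize_token_embeddings(len(tokenizer))
model.eval()
```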
I am running into two (possibly related) issues with this approach.

First, it errors out on this line: `sent_logits[:, self.tokenizer.pad_token_id] = float("-inf")`, with what seems to be an off-by-one indexing error:
```
/content/drive/MyDrive/Repos/lm-scorer/lm_scorer/models/automodel.py in _tokens_log_prob_for_batch(self, text)
     66     # logits.shape = [len(text[sent_index]) + 1, vocab_size]
     67     sent_logits = logits[sent_index, sent_nopad_mask][:-1, :]
---> 68     sent_logits[:, self.tokenizer.pad_token_id] = float("-inf")
     69     # ids_scores.shape = [seq_len + 1]
     70     sent_ids_scores = sent_logits.gather(1, sent_ids.unsqueeze(1)).squeeze(1)

IndexError: index 52001 is out of bounds for dimension 1 with size 52001
```
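In case it helps with diagnosis, the sizes involved can be checked directly. A sketch, assuming `tokenizer` and `model` were built as in the snippet above; the `get_vocab()` check is a guess at where an off-by-one could come from, not something the fork does:

```python
# All three of these should agree after resize_token_embeddings();
# if pad_token_id >= the logits' vocab dimension, the masking line
# in the traceback raises exactly this IndexError.
print("len(tokenizer):", len(tokenizer))
print("pad_token_id:  ", tokenizer.pad_token_id)
print("model vocab:   ", model.config.vocab_size)

# If the pretrained vocab has gaps, the largest id in use can exceed
# len(tokenizer) - 1, making resize_token_embeddings(len(tokenizer))
# one row too small for the added pad token.
print("max token id:  ", max(tokenizer.get_vocab().values()))
```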
If I comment out the masking line and let it continue, I get back probabilities, but they seem odd: the probabilities of the first token and the `<|endoftext|>` token are both very low compared to the English model on a matched sentence, for example comparing French vs. English output for the same sentence.
The same also holds for German (i.e. it follows the pattern of French), so I don't think it's a model-specific problem.
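For reference, the comparisons above were produced with the scorer's token-level API, roughly like this (a sketch: model names and sentences are illustrative, and the non-English checkpoint assumes the fork's `AutoModelWithLMHead` class is registered with the auto scorer):

```python
from lm_scorer.models.auto import AutoLMScorer

pairs = [
    ("gpt2", "The cat sleeps on the couch."),                  # English baseline
    ("some-user/french-gpt2", "Le chat dort sur le canapé."),  # hypothetical
]

for name, sentence in pairs:
    scorer = AutoLMScorer.from_pretrained(name)
    # tokens_score returns per-token probabilities, ids, and token strings.
    scores, ids, tokens = scorer.tokens_score(sentence)
    print(name)
    for token, score in zip(tokens, scores):
        print(f"  {token!r}: {score:.6g}")
```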
Any help appreciated figuring out how `AutoModelWithLMHead` might differ from `GPT2LMHeadModel`!