Thanks to you! And prompting bug #15

6nl · 2023-10-23T12:17:46Z

6nl
Oct 23, 2023

Hi, thanks so much for his repo Niranjan! I am doing a project on top of it, and so I really appreciate what you've done here.

My project involves translation, not transcription. So I need to set prompts. I have been struggling to get the tflite model to generate any useful output at all when I include prompts. It generates tokens of value zero. I think this might be an issue you have had too, judging by the issue you raised here: huggingface/transformers#19691 (comment)

After days of tracing, I have a fix, if it's still a problem for you.

nyadla-sys · 2023-10-23T15:36:20Z

nyadla-sys
Oct 23, 2023
Maintainer

@6nl Could you please share the colab notebook that is working for this issue.
if you want to keep it private, please share it to my email [email protected]

0 replies

6nl · 2023-10-23T19:42:08Z

6nl
Oct 23, 2023
Author

Here is the monkey patch that got it working for me. I can set forced_decoder_ids in the generate call now.

The issue was that in the huggingface transformers code, to force a token they set the scores for all tokens to -tf.float(inf) and then update the score for the forced token to 0, so it gets chosen. But in tflite the -inf value sometimes gets rounded and ends up as a NaN. Its the closest value to a overflow error that you can get. So I just replaced it with -1 as it is still less than 0.

Include the cell below before your generate call.

import tensorflow as tf
import numpy as np
from transformers import TFForceTokensLogitsProcessor, TFLogitsProcessor
from typing import List, Optional, Union, Any

# Patching methods of class TFForceTokensLogitsProcessor(TFLogitsProcessor):

def my__init__(self, force_token_map: List[List[int]]):
    force_token_map = dict(force_token_map)
    # Converts the dictionary of format {index: token} containing the tokens to be forced to an array, where the
    # index of the array corresponds to the index of the token to be forced, for XLA compatibility.
    # Indexes without forced tokens will have an negative value.
    force_token_array = np.ones((max(force_token_map.keys()) + 1), dtype=np.int32) * -1
    for index, token in force_token_map.items():
        if token is not None:
            force_token_array[index] = token
    self.force_token_array = tf.convert_to_tensor(force_token_array, dtype=tf.int32)

def my__call__(self, input_ids: tf.Tensor, scores: tf.Tensor, cur_len: int) -> tf.Tensor:
    def _force_token(generation_idx):
        batch_size = scores.shape[0]
        current_token = self.force_token_array[generation_idx]

        # Original code below generates NaN values when the model is exported to tflite
        # it just needs to be a negative number so that the forced token's value of 0 is the largest
        # so it will get chosen
        #new_scores = tf.ones_like(scores, dtype=scores.dtype) * -float("inf")
        new_scores = tf.ones_like(scores, dtype=scores.dtype) * -float(1)
        indices = tf.stack((tf.range(batch_size), tf.tile([current_token], [batch_size])), axis=1)
        updates = tf.zeros((batch_size,), dtype=scores.dtype)
        new_scores = tf.tensor_scatter_nd_update(new_scores, indices, updates)
        return new_scores

    scores = tf.cond(
        tf.greater_equal(cur_len, tf.shape(self.force_token_array)[0]),
        # If the current length is geq than the length of force_token_array, the processor does nothing.
        lambda: tf.identity(scores),
        # Otherwise, it may force a certain token.
        lambda: tf.cond(
            tf.greater_equal(self.force_token_array[cur_len], 0),
            # Only valid (positive) tokens are forced
            lambda: _force_token(cur_len),
            # Otherwise, the processor does nothing.
            lambda: scores,
        ),
    )
    return scores

TFForceTokensLogitsProcessor.__init__ = my__init__
TFForceTokensLogitsProcessor.__call__ = my__call__

5 replies

nyadla-sys Oct 23, 2023
Maintainer

Great! and its working as expected

6nl Oct 25, 2023
Author

great, pleased to hear it's working for you. I did a collab to demonstrate it before I saw your reply! Here it is: https://colab.research.google.com/drive/1YHaidFIq9Wv-89qT_bHTVEkaBDE-bjiV?usp=sharing

6nl Oct 25, 2023
Author

Do you think we should submit a bug / PR on the HF repo? Really they shouldn't use -inf values in the way they do. It would be a simple fix to declare a constant that is close to the most negative float value and use that. However they didn't seem very interested in tflite support in the thread you created....... and it works fine in tf.

nyadla-sys Oct 25, 2023
Maintainer

Yes,please raise #pr with huggingface and tag couple of ppl from hf refer my hf issue to whom to tag

KihongK Jun 20, 2024

It's not working right now :<

But openai-whisper small model is work well...

small O
tiny, base X

Solved by changing the transformer version

!pip install transformers==4.33.0
!pip install tensorflow==2.14.0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Thanks to you! And prompting bug #15

{{title}}

Replies: 2 comments 5 replies

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Select a reply

Thanks to you! And prompting bug #15

6nl Oct 23, 2023

Replies: 2 comments · 5 replies

nyadla-sys Oct 23, 2023 Maintainer

6nl Oct 23, 2023 Author

nyadla-sys Oct 23, 2023 Maintainer

6nl Oct 25, 2023 Author

6nl Oct 25, 2023 Author

nyadla-sys Oct 25, 2023 Maintainer

KihongK Jun 20, 2024

6nl
Oct 23, 2023

Replies: 2 comments 5 replies

nyadla-sys
Oct 23, 2023
Maintainer

6nl
Oct 23, 2023
Author

nyadla-sys Oct 23, 2023
Maintainer

6nl Oct 25, 2023
Author

6nl Oct 25, 2023
Author

nyadla-sys Oct 25, 2023
Maintainer