OpenAI Client format + chat template for a single call #2644

Open
1 of 4 tasks
vitalyshalumov opened this issue Oct 14, 2024 · 1 comment
Comments


System Info

latest docker

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Hello,
Can you please tell me how to implement the following pieces of functionality, combined:

  1. I'm interested in the OpenAI Client format:

    prompt = [
        {"role": "system", "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986."},
        {"role": "user", "content": "Hey, can you tell me any fun things to do in New York?"},
    ]

  2. I want to make sure that the chat template of the served model is applied.

  3. I don't want a chat: I want each call with a prompt to start from a clean history, to avoid token overflow.
    Thank you!

Expected behavior

An answer for each prompt, independent of the previous answer, but with the OpenAI client API.


Johnno1011 commented Oct 18, 2024

You could still use the OpenAI client's chat.completions.create but reset the chat history each time? For example:

from openai import OpenAI
from openai.types.chat import ChatCompletion

# Point the client at TGI's OpenAI-compatible endpoint
client = OpenAI(base_url="TGI_URL", api_key="TGI")

def generate(prompt: str) -> ChatCompletion:
    # Rebuild the message list on every call so no history carries over
    messages = [
        {
            "role": "system",
            "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986.",
        },
        {"role": "user", "content": prompt},
    ]
    return client.chat.completions.create(messages=messages, model="TGI")
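
Since the message list is rebuilt on every call, each request starts from a clean history. A quick usage sketch (the prompts are just placeholders):

# Two independent calls; nothing from the first answer leaks into the second
first = generate("Any fun things to do in New York?")
second = generate("What about Los Angeles?")
print(first.choices[0].message.content)
print(second.choices[0].message.content)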

Alternatively, you could apply the chat template outside of TGI like this:

from openai import OpenAI
from openai.types import Completion
from transformers import AutoTokenizer

client = OpenAI(base_url="TGI_URL", api_key="TGI")

def generate(prompt: str) -> Completion:
    messages = [
        {"role": "system", "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986."},
        {"role": "user", "content": prompt},
    ]

    # Load the tokenizer locally and apply the chat template outside of TGI
    tokenizer = AutoTokenizer.from_pretrained("your_model_of_choice")
    templated = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

    # call the standard completions route with the already-templated prompt
    return client.completions.create(prompt=templated, model="TGI")
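
If you want to sanity-check what the template produces, you can print the templated string before sending it; a minimal sketch, where "your_model_of_choice" stands in for whatever model TGI is serving:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your_model_of_choice")
messages = [
    {"role": "system", "content": "You are a sassy, wise-cracking robot."},
    {"role": "user", "content": "Any fun things to do in New York?"},
]
# Print the exact string the completions route would receive
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))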

Using the tokenizer locally is pretty fast computationally, so I wouldn't worry about the overhead.
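
One small tweak if you end up calling generate() in a loop: hoist the tokenizer load out of the function so from_pretrained runs once rather than on every call, e.g.:

# Load once at module scope and reuse it across generate() calls
tokenizer = AutoTokenizer.from_pretrained("your_model_of_choice")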

If these examples don't help, could you share some more details about what you're trying to achieve and I'll try to help :)
