Llama2-7b output text has irrelevant \n # characters for QA #83

Open
DeekshithaDPrakash opened this issue Nov 7, 2024 · 0 comments

I fine-tuned llama2-ko-7b with LoRA to answer questions based on a given context.

My training data was a JSONL file in which each line holds one text record:

  • Example: { "text": "### Instruction:\n{question}\n\n### Input:\n{context}\n\n### Response:\n{answer}." }
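For reference, a record in this format could be built as in the sketch below. The helper name `build_record` and the `eos_token` argument are hypothetical, not from the original setup. Appending an end-of-sequence token is an assumption: the issue does not say whether the training texts ended with one, but when they do not, a fine-tuned model often has no learned signal for where the answer stops. The string `</s>` is the usual Llama 2 EOS token (id 2, matching the `end_id` used in the request); confirm it against the actual tokenizer.

```python
import json

# Template copied from the training example in this issue,
# including the trailing period after {answer}.
PROMPT_TEMPLATE = (
    "### Instruction:\n{question}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n{answer}."
)


def build_record(question, context, answer, eos_token="</s>"):
    """Return one JSONL line whose text ends with the EOS token.

    eos_token is an assumption (the usual Llama 2 value); pass "" to
    reproduce the template exactly as shown in the issue.
    """
    text = PROMPT_TEMPLATE.format(
        question=question, context=context, answer=answer
    ) + eos_token
    # ensure_ascii=False keeps Korean text readable in the file.
    return json.dumps({"text": text}, ensure_ascii=False)


if __name__ == "__main__":
    print(build_record("질문", "문맥", "답변"))
```

If the existing data already ends each record with EOS, this step changes nothing and the cause lies elsewhere.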

The model was trained for 20 epochs, and I am now running inference on a Triton server.

I am running into a problem with the output text:

After the first sentence, the output always continues with \n\n\n, ###, or Input.

  • I tried the following request parameters:
    "max_tokens": 30,
    "bad_words": ["\n\n###", "###"],
    "stop_words": ["\n\n###", ".", "!"],
    "pad_id": 2,
    "end_id": 2,
    "streaming": 1,
    "early_stopping": true,
    "temperature": 1.0,
    "top_k": 50,
    "top_p": 0.92,
    "no_repeat_ngram_size": 3,
    "eos_token_id": 2,
    "num_beams": 1,
    "do_sample": true
    }'

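  • Example: "text_output":"경관계획은 실시설계를 완료하기 전에 수립해야 합니다. \n\n##\n\n \t\n\n \t\n\n \t"
    (The Korean sentence means: "The landscape plan must be established before the detailed design is completed.")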
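As a client-side workaround, the returned text can be cut at the first marker that signals the model has left the answer. This is a minimal sketch, not a fix for the underlying generation behavior; the function name `truncate_at_stop` and the default marker list are hypothetical choices based on the output shown above.

```python
def truncate_at_stop(text, stop_sequences=("\n\n", "###")):
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    # Drop the trailing space or tab left before the cut point.
    return text[:cut].rstrip()


if __name__ == "__main__":
    raw = "경관계획은 실시설계를 완료하기 전에 수립해야 합니다. \n\n##\n\n \t\n\n \t\n\n \t"
    print(truncate_at_stop(raw))
    # prints: 경관계획은 실시설계를 완료하기 전에 수립해야 합니다.
```

With streaming enabled, the same check would need to run on the accumulated text rather than on each chunk, because a marker can be split across chunks.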

Q: How can I prevent this issue during inference?
