Llama2-7b output text has irrelevant \n # characters for QA #83

Open

DeekshithaDPrakash opened this issue Nov 7, 2024 · 0 comments

I fine-tuned llama2-ko-7b with LoRA to answer questions based on a given context.

My training data was a JSONL file containing multiple text records:

Example: { "text": "~~### Instruction:\n{question}\n\n### Input:\n{context}\n\n### Response:\n{answer}.~~" }
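For reference, here is a minimal sketch of how one such record could be produced. The helper name `make_record` and the sample values are hypothetical, and the sketch assumes the `~~` markers around the text are not part of the training data, since it is unclear from the example whether they are.

```python
import json

# Template copied from the example record above.
# Assumption: the surrounding "~~" markers are left out on purpose.
PROMPT_TEMPLATE = (
    "### Instruction:\n{question}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n{answer}."
)


def make_record(question: str, context: str, answer: str) -> str:
    """Return one JSONL line holding a single formatted training text."""
    text = PROMPT_TEMPLATE.format(
        question=question, context=context, answer=answer
    )
    # ensure_ascii=False keeps Korean text readable instead of \uXXXX escapes.
    return json.dumps({"text": text}, ensure_ascii=False)


# Hypothetical sample values, only to show the shape of one line.
line = make_record("Q?", "Some context", "An answer")
print(line)
```

Each call yields one line of the JSONL file; newlines inside the text are escaped by `json.dumps`, so a record never spans more than one physical line.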

The model was trained for 20 epochs, and I am now trying to run inference on a Triton server.

I am facing an issue with the output text! ⬇️

After the first sentence, the output always continues with \n\n\n, ###, or Input.

I tried the following request parameters:
"max_tokens": 30,
"bad_words": ["\n\n###", "###"],
"stop_words": ["\n\n###", ".", "!"],
"pad_id": 2,
"end_id": 2,
"streaming": 1,
"early_stopping": true,
"temperature": 1.0,
"top_k": 50,
"top_p": 0.92,
"no_repeat_ngram_size": 3,
"eos_token_id": 2,
"num_beams": 1,
"do_sample": true
}'
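Example output (the Korean sentence means "The landscape plan must be established before the detailed design is completed."): "text_output":"경관계획은 실시설계를 완료하기 전에 수립해야 합니다. \n\n##\n\n \t\n\n \t\n\n \t"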

Q: How can I prevent this issue during inference?
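Until the cause on the server side is found, one possible client-side workaround is to trim the returned text at the first trailing marker. The sketch below is only that, a workaround under assumptions: the function name `trim_output` is hypothetical, and the marker pattern (a blank line, a run of two or more `#`, or a tab) is taken from the prompt template and the example output above. It does not change what the model generates.

```python
import re

# Markers inferred from the prompt template and the example output:
# a blank line, a run of two or more '#', or a tab character.
_TRAILER = re.compile(r"\n\s*\n|#{2,}|\t")


def trim_output(text_output: str) -> str:
    """Cut the generated text at the first trailing marker, if any."""
    match = _TRAILER.search(text_output)
    if match:
        text_output = text_output[: match.start()]
    return text_output.strip()


raw = "경관계획은 실시설계를 완료하기 전에 수립해야 합니다. \n\n##\n\n \t\n\n \t\n\n \t"
print(trim_output(raw))
```

Note that this would also cut off a legitimate answer that spans several paragraphs, so it only fits single-paragraph responses like the one shown.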
