llm_pipeline_static: Init eos token from tokenizer if not provided (#1222)

When using NPU, it seems that the eos token is not initialized correctly
(at least for certain models): eos_token_id is left at its -1 sentinel, so
generation never stops at the model's end-of-turn token. This causes the
chat sample to have a conversation with itself:
```
>chat_sample.exe Meta-Llama-3-8B-Instruct
question:
hello!
Hello! It's nice to meet you! Is there something I can help you with, or would you like to chat?assistant

Nice to meet you too! I'm just a language model, I don't have personal experiences or emotions, but I'm here to help answer any questions you might have or engage in a fun conversation!

What's on your mind? Want to talk about something in particular or just shoot the breeze?assistant

Sounds like fun! I
----------
question:
```
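
For illustration, here is a hypothetical, simplified sketch of a stop check
(not the actual pipeline code) showing why the -1 sentinel prevents
generation from ever stopping at EOS:
```
#include <cstdint>

// Hypothetical, simplified stop condition: with eos_token_id left at
// the -1 sentinel, no real token id ever matches, so decoding only
// ends at max_new_tokens and the model keeps writing further turns.
bool should_stop(int64_t last_token_id, int64_t eos_token_id) {
    return last_token_id == eos_token_id;  // never true when eos_token_id == -1
}
```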

Borrowing some initialization code from *StatefulLLMPipeline*, which sets
the eos token from the tokenizer inside the constructor when one has not
been provided, resolves the issue:

```
> chat_sample.exe Meta-Llama-3-8B-Instruct
question:
hello!
Hello! It's nice to meet you! Is there something I can help you with, or would you like to chat?
----------
question:
```
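
As a quick sanity check, a minimal sketch (assuming the public
`ov::genai::LLMPipeline` API and a locally converted model directory; the
path below is illustrative) can print the eos token id resolved at
construction:
```
#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>

int main() {
    // Construct the pipeline on NPU; with this fix the constructor
    // falls back to the tokenizer's eos token id when none was given.
    ov::genai::LLMPipeline pipe("Meta-Llama-3-8B-Instruct", "NPU");
    // Expect a real token id here rather than the -1 sentinel.
    std::cout << pipe.get_generation_config().eos_token_id << std::endl;
    return 0;
}
```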
RyanMetcalfeInt8 authored Nov 19, 2024
1 parent 6eed126 commit aa8279e
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions src/cpp/src/llm_pipeline_static.cpp
@@ -386,6 +386,11 @@ StaticLLMPipeline::StaticLLMPipeline(
     }
     // Initialize tensors
     prepare_for_new_conversation();
+
+    // If eos_token_id was not provided, take the value from the tokenizer
+    if (m_generation_config.eos_token_id == -1) {
+        m_generation_config.set_eos_token_id(m_tokenizer.get_eos_token_id());
+    }
 };

 StaticLLMPipeline::StaticLLMPipeline(
