It seems this tokenizer only supports one special token, "<|endoftext|>".
Does it support other additional special tokens? For instance, the ones we added in special_tokens_map.json,
like "<|user|>", "<|assistant|>", "<s>", "</s>" and "<unk>"?
Thanks!
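For context, this is roughly how to check which special tokens a Hugging Face checkpoint actually carries. It is a minimal sketch assuming the `transformers` package; the directory name `./my-model` is a placeholder for whichever checkpoint holds the special_tokens_map.json mentioned above.

```python
# Minimal sketch, assuming `transformers` is installed and "./my-model" is a
# local checkpoint directory containing special_tokens_map.json (placeholder name).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./my-model")

# Mirrors the entries of special_tokens_map.json (e.g. "<s>", "</s>", "<unk>").
print(tok.special_tokens_map)

# Extra tokens such as "<|user|>" / "<|assistant|>" are usually registered as
# added tokens; they still resolve to fixed ids even when absent from vocab.json.
print(tok.convert_tokens_to_ids(["<|user|>", "<|assistant|>"]))
```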
Hi there~
I would also like to ask about adding special tokens.
For some models, such as Qwen1.5, the special tokens are not in vocab.json or merges.txt to begin with.
They seem to be added later in the Hugging Face Rust tokenizer implementation.
Does this repo also support this feature of adding special tokens? Thank you.
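For reference, here is roughly how the Hugging Face tokenizers library layers special tokens on top of a plain BPE vocabulary, which is why they never need to appear in vocab.json or merges.txt. This is a sketch using the Python `tokenizers` package; the Qwen-style token strings below are assumptions based on its chat format, not something taken from this repo.

```python
# Sketch of how special tokens are added on top of a plain BPE vocabulary.
# Assumes the Python `tokenizers` package and local vocab.json / merges.txt files.
from tokenizers import Tokenizer
from tokenizers.models import BPE

tokenizer = Tokenizer(BPE.from_file("vocab.json", "merges.txt"))

# Special tokens are appended after the base vocabulary and matched before the
# BPE model runs, so they do not need entries in merges.txt.
tokenizer.add_special_tokens(["<|im_start|>", "<|im_end|>", "<|endoftext|>"])

# Each added token now has a fixed id just past the BPE vocab.
print(tokenizer.token_to_id("<|im_start|>"))
```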
Hello @wangkuiyi ,
> It seems this tokenizer only supports one special token, "<|endoftext|>".
> Does it support other additional special tokens? For instance, the ones we added in special_tokens_map.json, like "<|user|>", "<|assistant|>", "<s>", "</s>" and "<unk>"?
> Thanks!