Modify perplexity script #108

Closed
wants to merge 2 commits into from

Member

@SunMarc SunMarc commented Mar 5, 2024

What does this PR do?

This PR changes the script that calculates perplexity. The new calculation is compatible with the one in llama.cpp, so the results can be compared with those of ggml models. See the following thread for more information. I have used it extensively to calculate the perplexity of quantized models such as AWQ and GPTQ.

With this script, we get the correct perplexity for Gemma and Mistral. cc @younesbelkada
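
For readers unfamiliar with the calculation, here is a minimal sketch of a chunked perplexity computation. It is not the code from this PR. It assumes a llama.cpp-style scheme in which the token stream is split into non-overlapping chunks of `n_ctx` tokens and only the second half of each chunk is scored; that detail is an assumption and is not confirmed anywhere in this conversation. The names `chunked_perplexity`, `LogProbFn`, `log_prob_fn`, `n_ctx`, and `uniform_model` are all hypothetical, and a toy uniform model stands in for a real one.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical interface: given a chunk of token ids, return one log-probability
# per token, where log_probs[i] scores chunk[i] given chunk[:i].
LogProbFn = Callable[[Sequence[int]], List[float]]


def chunked_perplexity(
    tokens: Sequence[int], log_prob_fn: LogProbFn, n_ctx: int = 512
) -> float:
    """Perplexity over non-overlapping chunks, scoring the second half of each."""
    first = n_ctx // 2  # tokens before this index only serve as context
    total_nll = 0.0
    count = 0
    # Only full chunks are scored; a trailing partial chunk is dropped.
    for start in range(0, len(tokens) - n_ctx + 1, n_ctx):
        chunk = tokens[start:start + n_ctx]
        log_probs = log_prob_fn(chunk)
        for i in range(first, n_ctx):
            total_nll -= log_probs[i]
            count += 1
    if count == 0:
        raise ValueError("need at least n_ctx tokens")
    # Perplexity is the exponential of the mean negative log-likelihood.
    return math.exp(total_nll / count)


def uniform_model(vocab_size: int) -> LogProbFn:
    """Toy stand-in for a real model: every token is equally likely."""
    return lambda chunk: [-math.log(vocab_size)] * len(chunk)
```

A model that assigns equal probability to each of `V` tokens should come out with a perplexity of `V`, which makes a convenient sanity check before plugging in a real model.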

@SunMarc SunMarc requested a review from dacorvo March 5, 2024 21:42
@@ -64,7 +275,7 @@ def perplexity(
     stride: int = 512,
 ):
     dtype = torch.float32 if device.type == "cpu" else torch.float16
-    tokenizer = AutoTokenizer.from_pretrained(model_id)
+    tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
Member Author

@SunMarc SunMarc Mar 5, 2024


I set it to False since the slow tokenizer is better at processing long strings. It only takes a few seconds to process the dataset this way, compared to the fast tokenizer (the default).
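
One way to check this on a given dataset is to time both tokenizers on the same long string. The helper below is a small sketch and is not part of the PR; `time_tokenizer` is a hypothetical name, and the usage shown in the trailing comments assumes the `transformers` package is installed and that `model_id` and `text` are supplied by the reader.

```python
import time
from typing import Callable, Sequence, Tuple


def time_tokenizer(
    encode: Callable[[str], Sequence[int]], text: str
) -> Tuple[int, float]:
    """Return (number of tokens, elapsed seconds) for one call to encode(text)."""
    start = time.perf_counter()
    ids = encode(text)
    elapsed = time.perf_counter() - start
    return len(ids), elapsed


# Example usage (assumption: model_id and text are defined by the reader):
#
#   from transformers import AutoTokenizer
#   slow = AutoTokenizer.from_pretrained(model_id, use_fast=False)
#   fast = AutoTokenizer.from_pretrained(model_id, use_fast=True)
#   print(time_tokenizer(lambda t: slow(t)["input_ids"], text))
#   print(time_tokenizer(lambda t: fast(t)["input_ids"], text))
```

Timings depend on the model, the tokenizer implementation, and the length of the input, so the result for one dataset does not necessarily carry over to another.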

@dacorvo
Collaborator

dacorvo commented Mar 6, 2024

Merged as #110, which also removes the old code.

@dacorvo dacorvo closed this Mar 6, 2024
2 participants