Unknown quantization type, got fp8 #35471
Comments
I got the same issue today while trying. 👎🏼 Did you find any fix?
No, I was hoping one of the maintainers could help.
DeepSeek is not supported directly by transformers, only through custom code. The issue here is that they added a quantization_config to config.json, which triggers some checks, and we don't support their fp8 method yet (it must be used in vLLM). One thing you can try is to remove that attribute from config.json! Also, I'm a bit surprised that it works with 4.37.2, can you double-check?
4.37.2 did not work. I got further along with it, but it did not work in the end. Are there any potential fixes I can try besides removing that attribute in config.json? I will see if I can work out a compatibility fix and open a PR.
We can potentially just skip the quantization step and trigger a …
Hi, I removed that attribute from config.json, and then I get this error: Some weights of the model checkpoint at /root/DeepSeek-V3 were not used when initializing DeepseekV3ForCausalLM: ['model.layers.0.mlp.down_proj.weight_scale_inv', 'model.layers.0.mlp.gate_proj.weight_scale_inv', …
@SunMarc I got the same error when I tried this too.
Since this is custom code, you will have a better chance of fixing this issue by reaching out to the authors in the community section of their model.
I edited the config.json file by removing or adjusting the parameters related to quantization and custom weight scaling. This allowed the DeepSeek model to load correctly. You can try editing the config.json file directly to resolve the issue.
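For anyone asking what that edit looks like in practice: below is a minimal sketch (not code from this thread) that strips the unsupported quantization block from a locally downloaded checkpoint before loading. The checkpoint path is a placeholder taken from the error message above; adjust it to your own directory.

```python
import json
from pathlib import Path

# Placeholder path to a locally downloaded DeepSeek checkpoint.
config_path = Path("/root/DeepSeek-V3/config.json")

config = json.loads(config_path.read_text())

# Remove the fp8 quantization block that this transformers version does not recognize.
config.pop("quantization_config", None)

config_path.write_text(json.dumps(config, indent=2))
```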
Is this something we can PR to the main branch?
I spent the whole day trying to make it work, even going as far as replacing this parameter in config.json.
#35926 should be supported soon! |
Will it support Deepseek-R1? |
@AbyssGaze how did you modify the config.json file to fix this issue? The code you posted doesn't seem to be for the config.json file. I would appreciate it if you could clarify your fix.
Also having the same issue with Deepseek-R1 |
The following PR should do that. Instead of raising an error, we will just ignore the quantization config. LMK if this helps.
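To illustrate the idea (a rough sketch, not the actual code from the PR; the helper name and config layout are assumptions): instead of raising on an unrecognized quant_method, the loader can emit a warning and drop the quantization_config.

```python
import warnings

# Methods listed in the ValueError at the bottom of this issue; anything else (e.g. "fp8") is unknown.
SUPPORTED_QUANT_METHODS = {
    "awq", "bitsandbytes_4bit", "bitsandbytes_8bit", "gptq", "aqlm", "quanto",
    "eetq", "hqq", "compressed-tensors", "fbgemm_fp8", "torchao", "bitnet",
}

def drop_unknown_quantization(config: dict) -> dict:
    """Warn and remove an unsupported quantization_config instead of erroring."""
    quant = config.get("quantization_config") or {}
    method = quant.get("quant_method")
    if method and method not in SUPPORTED_QUANT_METHODS:
        warnings.warn(
            f"Ignoring unsupported quantization method {method!r}; "
            "the model will be loaded without quantization."
        )
        config.pop("quantization_config")
    return config
```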
I think there is an issue with different transformers versions. I tried 4.44/4.48 and it didn't work. Then I tried 4.46.1 and it works well without any issues, and I can download deepseek-ai/DeepSeek-R1.
@SunMarc I simply removed the quantization part in the config.json file, and then it works! The tricky thing is that downloading the R1 model is quite slow; I suppose it will take a couple of hours.
System Info
transformers version: 4.47.1

Who can help?
@SunMarc @MekkCyber

Information

Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
The issue arises when using AutoModelForCausalLM.from_pretrained(). The model used is "deepseek-ai/DeepSeek-V3".
File "/Users/ruidazeng/Demo/chatbot.py", line 13, in init
self.model = AutoModelForCausalLM.from_pretrained(
File "/opt/anaconda3/envs/gaming-bot/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
return model_class.from_pretrained(
File "/opt/anaconda3/envs/gaming-bot/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3659, in from_pretrained
config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
File "/opt/anaconda3/envs/gaming-bot/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 173, in merge_quantization_configs
quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
File "/opt/anaconda3/envs/gaming-bot/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 97, in from_dict
raise ValueError(
ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet']
Expected behavior
To be able to run Deepseek-R1