
Added Support for Custom Quantization #35915

Open · wants to merge 7 commits into main
Conversation

@keetrap commented on Jan 27, 2025

This PR adds a new feature to support custom quantization in the Transformers library.

Closes #35814

@SunMarc @MekkCyber
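For context, here is a minimal usage sketch of how the registration hook could be used from user code. The `register_quantization_config` decorator and `QuantizationConfigMixin` base class appear in the diff excerpts below; the import paths and the idea of passing the config instance straight to `from_pretrained` are assumptions about the final API rather than details confirmed in this PR.

```python
from typing import Any, Dict

from transformers import AutoModelForCausalLM
# Assumed import location for the decorator added by this PR.
from transformers.quantizers import register_quantization_config
from transformers.utils.quantization_config import QuantizationConfigMixin


@register_quantization_config("custom")  # register the config under the name "custom"
class CustomConfig(QuantizationConfigMixin):
    def __init__(self):
        self.quant_method = "custom"
        self.bits = 8

    def to_dict(self) -> Dict[str, Any]:
        return {"quant_method": self.quant_method, "num_bits": self.bits}


# Once a matching quantizer is registered as well, the custom config should be
# usable like any built-in quantization config.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m", quantization_config=CustomConfig()
)
```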

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@MekkCyber (Contributor) commented

Hi @keetrap, thanks for the PR! It looks great and it's a very handy feature for the community! I just left some very small nits.

Comment on lines 11 to 22

```python
@register_quantization_config("custom")
class CustomConfig(QuantizationConfigMixin):
    def __init__(self):
        self.quant_method = "custom"
        self.bits = 8

    def to_dict(self) -> Dict[str, Any]:
        output = {
            "num_bits": self.bits,
        }
        return output
```

Contributor:

I am not sure if the example should be here; maybe we can add some doc about it instead. wdyt @SunMarc?

Member:

Let's put it in the transformers/examples/quantization folder. Also, can you try to do a complete example, e.g. implementing an 8-bit quantization?
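For reference, a rough sketch of what such a complete 8-bit example could look like: a quantizer registered alongside the config that round-trips `nn.Linear` weights through int8 after loading. The `register_quantizer` decorator and the exact `HfQuantizer` hook names are assumptions for illustration, not taken from this PR's diff.

```python
import torch
from torch import nn

# Assumed imports: `register_quantizer` is presumed to be the counterpart of
# `register_quantization_config` introduced by this PR.
from transformers.quantizers import HfQuantizer, register_quantizer


@register_quantizer("custom")
class CustomInt8Quantizer(HfQuantizer):
    """Toy weight-only 8-bit quantizer: symmetric per-tensor round-trip on Linear weights."""

    requires_calibration = False

    def _process_model_before_weight_loading(self, model, **kwargs):
        # A real backend would swap nn.Linear for int8 modules here, before the
        # checkpoint weights are loaded into them.
        return model

    def _process_model_after_weight_loading(self, model, **kwargs):
        # Toy version: quantize each Linear weight to int8 and dequantize in place,
        # which keeps the model runnable without custom kernels.
        for module in model.modules():
            if isinstance(module, nn.Linear):
                w = module.weight.data
                scale = w.abs().max().clamp(min=1e-8) / 127.0
                q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
                module.weight.data = q.to(w.dtype) * scale
        return model

    @property
    def is_trainable(self):
        return False

    def is_serializable(self, safe_serialization=None):
        # Which of these hooks are abstract depends on the HfQuantizer base class version.
        return False
```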

Member:

Can you also add a doc about this new feature in the quantization docs? Maybe add a quick description in the overview docs and potentially update the following doc, or create a new section called Custom Quantization.

Author:

Hey, I tried to add a complete example for 8-bit quantization, and it seems to be working fine as far as I know. However, since I'm still learning, it might be better if someone with more experience could add the example.
It would be easier to add the documentation if there's a complete working example available, as we could reference it in the docs. However, if you'd prefer me to continue with my example and create the documentation based on that, I will do it. Just let me know how you'd like to proceed.

Comment on lines 3644 to 3648

```python
try:
    user_agent["quant"] = hf_quantizer.quantization_config.quant_method.value
except Exception:
    user_agent["quant"] = hf_quantizer.quantization_config.quant_method
# Force-set to `True` for more mem efficiency
```
Contributor:

Maybe it's better to use an if/else statement to avoid capturing unintended exceptions, wdyt?

@SunMarc (Member) left a comment

That's a nice solution, thanks for adding this. I left a couple of comments. cc @ice-tong if you also want to have a look.

Comment on lines 3644 to 3647 (the same try/except snippet shown above)
Member:

Instead of a try/except, you can just check whether the value attribute exists or not.
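Something along these lines, for instance (a sketch of the suggestion, with variable names mirroring the diff excerpt above; whether the final commit uses `hasattr`, `getattr`, or `isinstance` is not shown here):

```python
quant_method = hf_quantizer.quantization_config.quant_method
# Built-in methods store quant_method as an Enum (which has .value), while custom
# configs may use a plain string, so fall back instead of catching a broad Exception.
user_agent["quant"] = getattr(quant_method, "value", quant_method)
```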

Author:

I will update this.


Successfully merging this pull request may close these issues.

[Feature Request] Support register customize quantization method out-of-tree
4 participants