
Added Support for Custom Quantization #35915

Open · wants to merge 7 commits into main
Conversation

@keetrap commented on Jan 27, 2025

This PR adds a new feature to support custom quantization in the Transformers library.

Closes #35814

@SunMarc @MekkCyber
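For context, here is a minimal usage sketch of how the registration hook could be used from user code. The `register_quantization_config` decorator and `QuantizationConfigMixin` base class appear in the diff excerpts below; the import paths and the idea of passing the config instance straight to `from_pretrained` are assumptions about the final API rather than details confirmed in this PR.

```python
from typing import Any, Dict

from transformers import AutoModelForCausalLM
# Assumed import location for the decorator added by this PR.
from transformers.quantizers import register_quantization_config
from transformers.utils.quantization_config import QuantizationConfigMixin


@register_quantization_config("custom")  # register the config under the name "custom"
class CustomConfig(QuantizationConfigMixin):
    def __init__(self):
        self.quant_method = "custom"
        self.bits = 8

    def to_dict(self) -> Dict[str, Any]:
        return {"quant_method": self.quant_method, "num_bits": self.bits}


# Once a matching quantizer is registered as well, the custom config should be
# usable like any built-in quantization config.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m", quantization_config=CustomConfig()
)
```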

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@MekkCyber (Contributor) commented

Hi @keetrap, thanks for the PR! It looks great and it's a very handy feature for the community! I just left some very small nits.

Comment on lines 11 to 22

```python
@register_quantization_config("custom")
class CustomConfig(QuantizationConfigMixin):
    def __init__(self):
        self.quant_method = "custom"
        self.bits = 8

    def to_dict(self) -> Dict[str, Any]:
        output = {
            "num_bits": self.bits,
        }
        return output
```

Contributor:

I am not sure if the example should be here; maybe we can add some doc about it instead. wdyt @SunMarc?

Member:

Let's put it in the transformers/examples/quantization folder. Also, can you try to do a complete example, e.g. implementing an 8-bit quantization?
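For reference, a rough sketch of what such a complete 8-bit example could look like: a quantizer registered alongside the config that round-trips `nn.Linear` weights through int8 after loading. The `register_quantizer` decorator and the exact `HfQuantizer` hook names are assumptions for illustration, not taken from this PR's diff.

```python
import torch
from torch import nn

# Assumed imports: `register_quantizer` is presumed to be the counterpart of
# `register_quantization_config` introduced by this PR.
from transformers.quantizers import HfQuantizer, register_quantizer


@register_quantizer("custom")
class CustomInt8Quantizer(HfQuantizer):
    """Toy weight-only 8-bit quantizer: symmetric per-tensor round-trip on Linear weights."""

    requires_calibration = False

    def _process_model_before_weight_loading(self, model, **kwargs):
        # A real backend would swap nn.Linear for int8 modules here, before the
        # checkpoint weights are loaded into them.
        return model

    def _process_model_after_weight_loading(self, model, **kwargs):
        # Toy version: quantize each Linear weight to int8 and dequantize in place,
        # which keeps the model runnable without custom kernels.
        for module in model.modules():
            if isinstance(module, nn.Linear):
                w = module.weight.data
                scale = w.abs().max().clamp(min=1e-8) / 127.0
                q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
                module.weight.data = q.to(w.dtype) * scale
        return model

    @property
    def is_trainable(self):
        return False

    def is_serializable(self, safe_serialization=None):
        # Which of these hooks are abstract depends on the HfQuantizer base class version.
        return False
```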

Member:

Can you also add a doc about this new feature in the quantization docs? Maybe add a quick description in the overview docs and potentially update the following doc, or create a new section called Custom Quantization.

Author:

Hey, I tried to add a complete example for 8-bit quantization, and it seems to be working fine as far as I know. However, since I'm still learning, it might be better if someone with more experience could add the example.
It would be easier to add the documentation if there's a complete working example available, as we could reference it in the docs. However, if you'd prefer me to continue with my example and create the documentation based on that, I will do it. Just let me know how you'd like to proceed.

Comment on lines 3644 to 3648

```python
try:
    user_agent["quant"] = hf_quantizer.quantization_config.quant_method.value
except Exception:
    user_agent["quant"] = hf_quantizer.quantization_config.quant_method
# Force-set to `True` for more mem efficiency
```
Contributor:

Maybe it's better to use an if/else statement to avoid capturing unintended exceptions, wdyt?

@SunMarc (Member) left a comment

That's a nice solution, thanks for adding this. I left a couple of comments. cc @ice-tong if you also want to have a look.

Comment on lines 3644 to 3647 (the same try/except snippet shown above)
Member:

Instead of a try/except, you can just check whether the value attribute exists or not.
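Something along these lines, for instance (a sketch of the suggestion, with variable names mirroring the diff excerpt above; whether the final commit uses `hasattr`, `getattr`, or `isinstance` is not shown here):

```python
quant_method = hf_quantizer.quantization_config.quant_method
# Built-in methods store quant_method as an Enum (which has .value), while custom
# configs may use a plain string, so fall back instead of catching a broad Exception.
user_agent["quant"] = getattr(quant_method, "value", quant_method)
```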

Author:

I will update this.


Successfully merging this pull request may close these issues.

[Feature Request] Support register customize quantization method out-of-tree
4 participants