[TorchFX] SmoothQuant algorithm implementation #2875
Conversation
Commits: Swin transformer conformance test · FXSQMultiply · References update
```python
import torch


class FXSQMultiply(torch.nn.Module):
    """Multiplies the input activation by a fixed SmoothQuant scale."""

    def __init__(self, scale: torch.Tensor):
        super().__init__()
        self._scale_value = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.mul(x, self._scale_value)
```
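For context, a quick usage sketch (the tensor shapes and scale values here are hypothetical, not taken from the PR): the module captures a per-channel scale at construction time and broadcasts it over the batch dimension on every call.

```python
import torch

# Hypothetical per-channel scales for a layer with 4 input channels.
scale = torch.tensor([0.5, 1.0, 2.0, 0.25])
sq_mul = FXSQMultiply(scale)

x = torch.randn(8, 4)   # (batch, channels) activations
y = sq_mul(x)           # broadcasts: y[i, j] == x[i, j] * scale[j]

assert torch.allclose(y, x * scale)
```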
Why can't we just pass a value into the model transformer? From my side, the algo backend should only be a bridge between the algorithm and backend specifics, not the place for critical structures that affect other parts of the pipeline.
This is done for the sake of flexibility and is adopted from the Torch backend: we can reuse a single module_insertion_transformation_builder to insert multiplies for SQ, biases for channel alignment, and convert nodes for the weight compression algorithm.
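For readers outside the thread, here is a rough sketch of what such a builder can look like in torch.fx terms; the names and signature are illustrative and do not match NNCF's actual API:

```python
from typing import Callable, List

import torch
import torch.fx


# Illustrative only: a builder that closes over an arbitrary module and
# returns a transformation inserting a call to it before each target node.
def module_insertion_transformation_builder(
    module: torch.nn.Module,
    target_node_names: List[str],
    module_attr_name: str,
) -> Callable[[torch.fx.GraphModule], None]:
    def transformation(model: torch.fx.GraphModule) -> None:
        model.add_submodule(module_attr_name, module)
        for node in list(model.graph.nodes):
            if node.name not in target_node_names:
                continue
            source = node.args[0]  # rewire the node's first input
            with model.graph.inserting_before(node):
                new_node = model.graph.call_module(module_attr_name, args=(source,))
            node.replace_input_with(source, new_node)
        model.graph.lint()
        model.recompile()

    return transformation
```

Because the inserted module is arbitrary, the same builder can serve SQ multiplies, channel-alignment biases, and weight-compression convert nodes.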
I see. But in this case, model transformers become dependent on the algorithms and their internal classes, not only on transformer commands. That is the main issue for me.
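For illustration, the decoupled arrangement the reviewer describes might look like the following (all names hypothetical): the transformer consumes only a generic command, and algorithm-specific classes such as FXSQMultiply stay inside the algorithm backend that builds the command.

```python
from dataclasses import dataclass
from typing import List

import torch


# Hypothetical generic command: the model transformer depends only on this
# shape, never on FXSQMultiply or other algorithm-internal classes.
@dataclass
class FXModuleInsertionCommand:
    target_node_names: List[str]
    module: torch.nn.Module
    module_attr_name: str

# The SmoothQuant backend would then build, e.g.:
#   FXModuleInsertionCommand(["linear_1"], FXSQMultiply(scale), "sq_mul_0")
# and hand it to the transformer, keeping the dependency one-directional.
```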
### Changes
TorchFX SmoothQuant backend implementation:
* module_insertion_transformation_builder is introduced
* The transformation requires names for new modules and nodes
* vit_b_16 is introduced in the conformance tests

### Reason for changes
To improve metrics of the quantized swin_v2_s and vit_b_16 models:
* To insert SQ multiply nodes into the graph
* To make node names human-readable and consistent
* To check the SQ algorithm end to end

### Related tickets
openvinotoolkit#2766

### Tests
* The smooth quant test template is implemented for the TorchFX backend
* Conformance test post_training_quantization/446/ is successful
* Test models check SQ multiplies for the swin_v2_s and vit_b_16 models
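For background, the inserted multiplies implement the core SmoothQuant computation (Xiao et al.): per-input-channel scales migrate quantization difficulty from activations to weights, balanced by an alpha parameter. A minimal sketch of the scale computation, not NNCF's actual code:

```python
import torch

def smooth_quant_scales(
    activations: torch.Tensor,  # (num_samples, in_channels) calibration stats
    weight: torch.Tensor,       # (out_channels, in_channels)
    alpha: float = 0.5,         # 0.5 is a common default
) -> torch.Tensor:
    a_max = activations.abs().amax(dim=0)  # per-channel activation max
    w_max = weight.abs().amax(dim=0)       # per-channel weight max
    scales = a_max.pow(alpha) / w_max.pow(1.0 - alpha)
    return scales.clamp(min=1e-5)

# The graph is then rewired as x -> FXSQMultiply(1 / scales) -> linear with
# weight * scales, so the output is unchanged but activations are smoother
# and therefore easier to quantize.
```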