[TorchFX] SmoothQuant algorithm implementation #2875
Conversation
Commits: Swin transformer conformance test · FXSQMultiply · References update
```python
import torch


class FXSQMultiply(torch.nn.Module):
    """Multiplies the input activation by a fixed SmoothQuant scale."""

    def __init__(self, scale: torch.Tensor):
        super().__init__()
        self._scale_value = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.mul(x, self._scale_value)
```
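For context, a quick usage sketch (the tensor shapes and scale values here are hypothetical, not taken from the PR): the module captures a per-channel scale at construction time and broadcasts it over the batch dimension on every call.

```python
import torch

# Hypothetical per-channel scales for a layer with 4 input channels.
scale = torch.tensor([0.5, 1.0, 2.0, 0.25])
sq_mul = FXSQMultiply(scale)

x = torch.randn(8, 4)   # (batch, channels) activations
y = sq_mul(x)           # broadcasts: y[i, j] == x[i, j] * scale[j]

assert torch.allclose(y, x * scale)
```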
Why can't we just pass a value into the model transformer? From my side, the algo backend should only be a bridge between the algorithm and backend specifics, not the place for critical structures that affect other parts of the pipeline.
This is done for the sake of flexibility and is adopted from the Torch backend: we can reuse a single module_insertion_transformation_builder to insert multiplies for SQ, biases for channel alignment, and convert nodes for the weight compression algorithm.
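For readers outside the thread, here is a rough sketch of what such a builder can look like in torch.fx terms; the names and signature are illustrative and do not match NNCF's actual API:

```python
from typing import Callable, List

import torch
import torch.fx


# Illustrative only: a builder that closes over an arbitrary module and
# returns a transformation inserting a call to it before each target node.
def module_insertion_transformation_builder(
    module: torch.nn.Module,
    target_node_names: List[str],
    module_attr_name: str,
) -> Callable[[torch.fx.GraphModule], None]:
    def transformation(model: torch.fx.GraphModule) -> None:
        model.add_submodule(module_attr_name, module)
        for node in list(model.graph.nodes):
            if node.name not in target_node_names:
                continue
            source = node.args[0]  # rewire the node's first input
            with model.graph.inserting_before(node):
                new_node = model.graph.call_module(module_attr_name, args=(source,))
            node.replace_input_with(source, new_node)
        model.graph.lint()
        model.recompile()

    return transformation
```

Because the inserted module is arbitrary, the same builder can serve SQ multiplies, channel-alignment biases, and weight-compression convert nodes.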
I see. But in this case, model transformers become dependent on the algorithms and their internal classes, not only on transformer commands. That is the main issue for me.
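For illustration, the decoupled arrangement the reviewer describes might look like the following (all names hypothetical): the transformer consumes only a generic command, and algorithm-specific classes such as FXSQMultiply stay inside the algorithm backend that builds the command.

```python
from dataclasses import dataclass
from typing import List

import torch


# Hypothetical generic command: the model transformer depends only on this
# shape, never on FXSQMultiply or other algorithm-internal classes.
@dataclass
class FXModuleInsertionCommand:
    target_node_names: List[str]
    module: torch.nn.Module
    module_attr_name: str

# The SmoothQuant backend would then build, e.g.:
#   FXModuleInsertionCommand(["linear_1"], FXSQMultiply(scale), "sq_mul_0")
# and hand it to the transformer, keeping the dependency one-directional.
```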
### Changes
TorchFX SmoothQuant backend implementation:
* module_insertion_transformation_builder is introduced
* The transformation requires names for new modules and nodes
* vit_b_16 is introduced in the conformance tests

### Reason for changes
To improve metrics of the quantized swin_v2_s and vit_b_16 models:
* To insert SQ multiply nodes into the graph
* To make node names human-readable and consistent
* To check the SQ algorithm end to end

### Related tickets
openvinotoolkit#2766

### Tests
* The smooth quant test template is implemented for the TorchFX backend
* Conformance test post_training_quantization/446/ is successful
* Test models check SQ multiplies for the swin_v2_s and vit_b_16 models
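For background, the inserted multiplies implement the core SmoothQuant computation (Xiao et al.): per-input-channel scales migrate quantization difficulty from activations to weights, balanced by an alpha parameter. A minimal sketch of the scale computation, not NNCF's actual code:

```python
import torch

def smooth_quant_scales(
    activations: torch.Tensor,  # (num_samples, in_channels) calibration stats
    weight: torch.Tensor,       # (out_channels, in_channels)
    alpha: float = 0.5,         # 0.5 is a common default
) -> torch.Tensor:
    a_max = activations.abs().amax(dim=0)  # per-channel activation max
    w_max = weight.abs().amax(dim=0)       # per-channel weight max
    scales = a_max.pow(alpha) / w_max.pow(1.0 - alpha)
    return scales.clamp(min=1e-5)

# The graph is then rewired as x -> FXSQMultiply(1 / scales) -> linear with
# weight * scales, so the output is unchanged but activations are smoother
# and therefore easier to quantize.
```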