[WC, PT] Store compression scale in f16 #2596

nikita-savelyevv · 2024-03-22T15:38:59Z

Changes

Store compression scale in FP16
Add type conversion to original data type after decompression

Below are the compression subgraphs for the first conv2d in mobilenet_v2 after conversion to OV; this is similar to the table presented in #2537.

Compared to the OV case, there is an additional Multiply node after the scale Multiply node. It seems to come from the Batch Norm applied to the convolution. With PT weight compression it does not get merged into the weight, as it does in the OV case.

Reason for changes

Weight compression for the PT backend fails when applied to a model in half precision. The reason is that the scale is always stored in FP32, so the decompression result is also FP32, which conflicts with the FP16 input type.
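The dtype mismatch can be illustrated with a minimal sketch. The helper names `decompress_asymmetric` and `decompress_to` are hypothetical and are not taken from the NNCF code base; the sketch only assumes asymmetric uint8 weights with a per-channel scale and zero point, and relies on standard PyTorch type promotion.

```python
import torch


def decompress_asymmetric(weight_u8, scale, zero_point):
    # Hypothetical helper: cast the integer data to the scale's dtype,
    # then dequantize. The result dtype therefore follows the scale.
    w = weight_u8.to(scale.dtype)
    zp = zero_point.to(scale.dtype)
    return (w - zp) * scale


def decompress_to(weight_u8, scale, zero_point, result_dtype):
    # Hypothetical helper: dequantize, then convert back to the dtype
    # the layer originally used (the "type conversion after decompression").
    return decompress_asymmetric(weight_u8, scale, zero_point).to(result_dtype)


weight_u8 = torch.randint(0, 256, (4, 8), dtype=torch.uint8)
zero_point = torch.full((4, 1), 128, dtype=torch.uint8)
scale_fp32 = torch.full((4, 1), 0.01, dtype=torch.float32)
scale_fp16 = scale_fp32.to(torch.float16)

# FP32 scale: the decompressed weight is FP32 even if the model runs in FP16.
before = decompress_asymmetric(weight_u8, scale_fp32, zero_point)

# FP16 scale plus an explicit cast: the result matches the model's dtype.
after_half = decompress_to(weight_u8, scale_fp16, zero_point, torch.float16)
after_full = decompress_to(weight_u8, scale_fp16, zero_point, torch.float32)

print(before.dtype, after_half.dtype, after_full.dtype)
```

With the scale stored in FP16 and the final cast in place, the same stored scale serves both half- and full-precision models; only the target dtype of the cast differs.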

Related tickets

134063

Tests

Added tests for half- and full-precision cases. Also added cases for different devices, since the device might influence tracing in half precision.

codecov · 2024-03-22T15:41:20Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.91%. Comparing base (b7ba5ad) to head (3b172ab).

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2596      +/-   ##
===========================================
- Coverage    91.16%   84.91%   -6.26%     
===========================================
  Files          494      494              
  Lines        45350    45373      +23     
===========================================
- Hits         41342    38527    -2815     
- Misses        4008     6846    +2838

Files	Coverage Δ
...ion/algorithms/weight_compression/torch_backend.py	`84.10% <100.00%> (+0.10%)`	⬆️
nncf/torch/quantization/layers.py	`95.85% <100.00%> (+0.02%)`	⬆️

... and 59 files with indirect coverage changes

Flag	Coverage Δ
COMMON	`44.14% <0.00%> (-0.01%)`	⬇️
ONNX	`34.66% <14.28%> (+<0.01%)`	⬆️
OPENVINO	`∅ <ø> (∅)`
TENSORFLOW	`30.10% <0.00%> (-0.02%)`	⬇️
TORCH	`65.93% <100.00%> (-0.03%)`	⬇️

Flags with carried forward coverage won't be shown.

Components	Coverage Δ
common	`93.14% <ø> (-0.65%)`	⬇️
torch	`93.48% <100.00%> (+<0.01%)`	⬆️
tensorflow	`93.74% <ø> (ø)`
onnx	`93.12% <ø> (+0.09%)`	⬆️
openvino	`25.75% <ø> (-68.39%)`	⬇️
ptq	`69.92% <100.00%> (-20.21%)`	⬇️

nncf/torch/quantization/layers.py

alexsu52

LGTM


- For weight compression, align quantization scales to always be saved in FP16 precision regardless of the input model's weight precision. (openvinotoolkit#2596, openvinotoolkit#2508)
- (UX) Expose the `OverflowFix`, `AdvancedSmoothQuantParameters` and `AdvancedBiasCorrectionParameters` classes for import as `nncf.OverflowFix`, `nncf.AdvancedSmoothQuantParameters` and `nncf.AdvancedBiasCorrectionParameters`. (openvinotoolkit#2608, openvinotoolkit#2624)

nikita-savelyevv requested a review from a team as a code owner March 22, 2024 15:39

nikita-savelyevv requested a review from alexsu52 March 22, 2024 15:39

github-actions bot added the NNCF PT and NNCF PTQ labels Mar 22, 2024

Make scale always be in f16; add type conversion after decompression

3b172ab

alexsu52 reviewed Mar 24, 2024

nncf/torch/quantization/layers.py (resolved)

alexsu52 approved these changes Mar 26, 2024


alexsu52 merged commit c79111b into openvinotoolkit:develop Mar 26, 2024
11 checks passed

nikita-savelyevv mentioned this pull request Mar 27, 2024

Added torch cuda test skipping in case it is run on a CPU-only machine #2605

Merged

alexsu52 pushed a commit that referenced this pull request Mar 27, 2024

Added torch cuda test skipping in case it is run on a CPU-only machine (#2605)

0696166

Reason for changes: regression after #2596.

nikita-savelyevv mentioned this pull request Apr 19, 2024

Release notes 2.10 #2640

Merged
