
[WC, PT] Store compression scale in f16 #2596

Conversation

@nikita-savelyevv (Collaborator) commented Mar 22, 2024

Changes

  • Store compression scale in FP16
  • Add type conversion to original data type after decompression

Below are the compression subgraphs for the first conv2d in mobilenet_v2 after conversion to OV; this is similar to the table presented in #2537.
[image: compression subgraphs after conversion to OV]
Compared to the OV case, there is an additional Multiply node after the scale Multiply node. It appears to come from the Batch Norm applied to the convolution: in the PT weight compression case it does not get merged into the weight as it does in the OV case.

Reason for changes

Weight compression for the PT backend fails when applied to a model in half precision. The reason is that the scale is always stored in FP32, so the decompression result is also in FP32, which conflicts with the FP16 input type.
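The failure mode and the fix can be illustrated with a minimal sketch (hypothetical code, not the actual NNCF implementation; the function and tensor names here are made up for illustration):

```python
import torch

def decompress(quantized_weight: torch.Tensor,
               scale: torch.Tensor,
               original_dtype: torch.dtype) -> torch.Tensor:
    # Before the fix: the scale constant stayed in FP32, so the
    # int8 * fp32 multiply promoted the decompressed weight to FP32,
    # which then clashed with FP16 inputs in a half-precision model.
    # After the fix: store the scale in FP16 and cast the decompression
    # result back to the model's original weight dtype.
    scale_fp16 = scale.to(torch.float16)
    return (quantized_weight * scale_fp16).to(original_dtype)

# Example: an INT8-compressed weight with a per-output-channel scale.
w_q = torch.randint(-128, 127, (8, 4), dtype=torch.int8)
scale = torch.rand(8, 1)  # computed during compression, originally FP32

w_half = decompress(w_q, scale, torch.float16)  # matches an FP16 model
w_full = decompress(w_q, scale, torch.float32)  # FP32 models still work
```

The final cast is what "type conversion to original data type after decompression" refers to: it keeps the decompressed weight's dtype consistent with whatever precision the rest of the model runs in.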

Related tickets

134063

Tests

Added tests for the half/full precision cases. Also added cases for different devices, since it was suspected that the device may influence tracing in half precision.

@nikita-savelyevv requested a review from a team as a code owner March 22, 2024 15:39
@github-actions bot added the NNCF PT (Pull requests that update NNCF PyTorch) and NNCF PTQ (Pull requests that update NNCF PTQ) labels Mar 22, 2024

codecov bot commented Mar 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.91%. Comparing base (b7ba5ad) to head (3b172ab).

Additional details and impacted files


@@             Coverage Diff             @@
##           develop    #2596      +/-   ##
===========================================
- Coverage    91.16%   84.91%   -6.26%     
===========================================
  Files          494      494              
  Lines        45350    45373      +23     
===========================================
- Hits         41342    38527    -2815     
- Misses        4008     6846    +2838     
Files Coverage Δ
...ion/algorithms/weight_compression/torch_backend.py 84.10% <100.00%> (+0.10%) ⬆️
nncf/torch/quantization/layers.py 95.85% <100.00%> (+0.02%) ⬆️

... and 59 files with indirect coverage changes

Flag Coverage Δ
COMMON 44.14% <0.00%> (-0.01%) ⬇️
ONNX 34.66% <14.28%> (+<0.01%) ⬆️
OPENVINO ∅ <ø> (∅)
TENSORFLOW 30.10% <0.00%> (-0.02%) ⬇️
TORCH 65.93% <100.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown.

Components Coverage Δ
common 93.14% <ø> (-0.65%) ⬇️
torch 93.48% <100.00%> (+<0.01%) ⬆️
tensorflow 93.74% <ø> (ø)
onnx 93.12% <ø> (+0.09%) ⬆️
openvino 25.75% <ø> (-68.39%) ⬇️
ptq 69.92% <100.00%> (-20.21%) ⬇️

@alexsu52 (Contributor) left a comment:
LGTM

@alexsu52 merged commit c79111b into openvinotoolkit:develop Mar 26, 2024
11 checks passed
alexsu52 pushed a commit that referenced this pull request Mar 27, 2024
nikita-savelyevv added a commit to KodiaqQ/nncf that referenced this pull request Apr 19, 2024
- For weight compression, align quantization scales to always be saved in FP16 precision regardless of the input model's weight precision. (openvinotoolkit#2596, openvinotoolkit#2508)
- (UX) Expose `OverflowFix`, `AdvancedSmoothQuantParameters` and `AdvancedBiasCorrectionParameters` classes to be available for import as `nncf.OverflowFix`, `nncf.AdvancedSmoothQuantParameters`, `nncf.AdvancedBiasCorrectionParameters`. (openvinotoolkit#2608, openvinotoolkit#2624)