[HWConfig] narrow_range parameter is introduced in hardware config #3196

Open: daniil-lyakhov wants to merge 6 commits into develop from dl/narrow_range_to_qconfig

Conversation

daniil-lyakhov
Collaborator

@daniil-lyakhov daniil-lyakhov commented Jan 17, 2025

On top of #3232

Changes

  • Narrow range parameter is moved to hardware config
  • Embedding / embedding bag nodes propagate the narrow_range attribute of the activation quantizer up to the weight quantizer

Reason for changes

  • To extend the set of possible quantization configurations (such as weights + symmetric + narrow_range=False) required by some runtimes (e.g. XNNPACK)
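
To make the narrow_range option concrete, here is a minimal sketch of how the integer level range of a signed symmetric quantizer could be derived. The helper name `signed_level_range` is hypothetical and not an NNCF API; it assumes the common convention that narrow range drops the lowest level so the grid becomes symmetric around zero.

```python
from typing import Tuple


def signed_level_range(num_bits: int, narrow_range: bool) -> Tuple[int, int]:
    """Return (level_low, level_high) for a signed integer quantizer.

    Hypothetical helper for illustration only, not part of NNCF.
    """
    level_low = -(2 ** (num_bits - 1))
    level_high = 2 ** (num_bits - 1) - 1
    if narrow_range:
        # Assumption: narrow range removes the lowest level,
        # leaving 2**num_bits - 1 quants instead of 2**num_bits.
        level_low += 1
    return level_low, level_high


print(signed_level_range(8, narrow_range=False))  # (-128, 127)
print(signed_level_range(8, narrow_range=True))   # (-127, 127)
```

Under this assumption, the weights + symmetric + narrow_range=False combination corresponds to the full [-128, 127] grid.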

Related tickets

Tests

  • tests/cross_fw/test_templates/test_calculate_quantizer_parameters.py is updated to check all possible combinations of existing qconfigs with narrow_range True/False
  • tests/common/quantization/test_quantizer_propagation_solver.py is updated to check the requantization rule with narrow_range and the rule for merging configurations with different narrow_range parameters
  • tests/common/quantization/test_quantizer_propagation_graph.py is updated to check that subsequent quantizers with different narrow_range values do not merge
  • Reference quantizer scales are updated to reflect the fix of embedding weights quantization
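
The "do not merge" rule checked by the propagation graph test can be pictured with a small sketch. `QuantizerConfigSketch` and `can_merge` are hypothetical names introduced for illustration; the real propagation solver in NNCF is more involved, and this only shows that narrow_range takes part in the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantizerConfigSketch:
    """Hypothetical stand-in for a quantizer configuration (not NNCF's class)."""
    num_bits: int = 8
    mode: str = "symmetric"
    per_channel: bool = False
    narrow_range: bool = False


def can_merge(first: QuantizerConfigSketch, second: QuantizerConfigSketch) -> bool:
    # Assumption: two subsequent quantizers are merged only when every field,
    # including narrow_range, is identical.
    return first == second


same = can_merge(QuantizerConfigSketch(), QuantizerConfigSketch())
differs = can_merge(
    QuantizerConfigSketch(narrow_range=True),
    QuantizerConfigSketch(narrow_range=False),
)
print(same, differs)  # True False
```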

post_training_quantization/586/ - Passed
job/weekly/job/ubuntu20_eval/245/ (+ job/ubuntu20_eval/246/) Passed
eval_tf/461/ - Passed
torch_weekly/100/ - Passed
job/nightly/job/torch_nightly/444/ - Passed

@github-actions github-actions bot added the NNCF PT, NNCF Common, and NNCF PTQ labels Jan 17, 2025
@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch 4 times, most recently from 07a6520 to a9a0c19 Compare January 17, 2025 18:53
@github-actions github-actions bot added the NNCF ONNX label Jan 20, 2025
@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from 7134fcb to c8b1761 Compare January 20, 2025 17:15
@github-actions github-actions bot added the NNCF OpenVINO label Jan 21, 2025
@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch 4 times, most recently from 0b064a7 to 0733ac3 Compare January 21, 2025 15:56
@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from 0733ac3 to 10267b0 Compare January 22, 2025 14:50
@daniil-lyakhov daniil-lyakhov marked this pull request as ready for review January 27, 2025 16:06
@daniil-lyakhov daniil-lyakhov requested a review from a team as a code owner January 27, 2025 16:06
@@ -288,7 +293,7 @@
{
"type": "Embedding",
"quantization": {
- "weights": ["q8_w_sym", "q8_w_asym"]
+ "weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_sym", "q8_a_ch"]
Contributor

I'm not sure that LPT supports q8_a_ch for embedding. Please double-check.

As far as I understand, this is applicable only to an embedding -> depthwise convolution sub-graph. However, I have not encountered such a sub-graph in the real world.

Suggested change
- "weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_sym", "q8_a_ch"]
+ "weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_ch"]

Collaborator

I'm not sure that even a per-tensor quantization scheme is applicable here for the Embedding layer since it contains sensitive weights. We need to verify that this scheme does not introduce performance/accuracy regressions on the known cases.

This comment was marked as outdated.

Collaborator Author

versa model validation after the rebase:

| versa INT8 | roc_auc_score | Embedding weights config | narrow range | Throughput, FPS | Gather execType | Convs execType |
|---|---|---|---|---|---|---|
| develop | 91.54 | 'B:8 M:S SGN:S PC:N' | TRUE | 642.18 | jit_avx512_i8 | brgconv_avx512_i8 |
| daniil-lyakhov:dl/narrow_range_to_qconfig | 91.54 | 'B:8 M:S SGN:S PC:N NR:N' | FALSE | 641.19 | jit_avx512_i8 | brgconv_avx512_i8 |

/*
* Narrow range: should NNCF use 2**num_bits quants or 2**num_bits - 1
*/
"narrow_range": false
},
"q8_sym_tnr_-128_127": { // Alias name for set of hyperparameters
"bits": 8, // Number of quantization bits
"mode": "symmetric", // Quantization mode
"granularity": "pertensor", // Granularity: one scale for output tensor
"level_low": -128, // Low quantization level
Contributor

Does NNCF support level_low and level_high?

Contributor

@ljaljushkin ljaljushkin Jan 28, 2025

Isn't it redundant, since other params define how to calculate them?

@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from a2e42d5 to 7257b2e Compare January 30, 2025 13:13
@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch 2 times, most recently from 4b0dc46 to 4d3a892 Compare January 31, 2025 13:38
@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch 2 times, most recently from 987c51d to ea5d0fd Compare January 31, 2025 14:52
Comment on lines -185 to 188
- "input_low": -0.9350724220275879,
+ "input_low": -0.9424352049827576,
  "input_high": 0.9350724220275879,
- "output_low": -0.9350724220275879,
+ "output_low": -0.9424352049827576,
  "output_high": 0.9350724220275879
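
The updated reference values are consistent with a symmetric signed 8-bit quantizer switching from the [-127, 127] grid to the [-128, 127] grid. The sketch below is an assumption about how such values can be derived, not the code used to produce the references; `symmetric_input_low` is a hypothetical helper.

```python
def symmetric_input_low(input_high: float, level_low: int, level_high: int) -> float:
    """Derive input_low from input_high for a symmetric quantizer.

    Hypothetical helper: the step size is input_high / level_high, and
    input_low is that step multiplied by the (negative) lowest level.
    """
    return input_high * level_low / level_high


input_high = 0.9350724220275879

# narrow_range=True: levels [-127, 127], so input_low mirrors input_high.
narrow = symmetric_input_low(input_high, -127, 127)

# narrow_range=False: levels [-128, 127], so input_low is 128/127 times larger.
full = symmetric_input_low(input_high, -128, 127)

print(narrow, full)
```

The full-range result matches the new reference value -0.9424352049827576 to within float32 precision.
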
Collaborator Author

Screenshot 2025-01-31 155440

@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from ea5d0fd to a85f7b9 Compare January 31, 2025 15:14
@daniil-lyakhov daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from e7a1453 to cfedaf7 Compare January 31, 2025 15:24
Labels
NNCF Common, NNCF ONNX, NNCF OpenVINO, NNCF PT, NNCF PTQ

4 participants