[HWConfig] narrow_range parameter is introduced in hardware config #3196

daniil-lyakhov · 2025-01-17T18:38:49Z

On top of #3232

Changes

The narrow_range parameter is moved to the hardware config
Embedding / embedding bag nodes propagate the narrow_range attribute of the activation quantizer up to the weight quantizer

Reason for changes

To extend the set of possible quantization configurations (e.g. weights + symmetric + narrow_range=False) required by some runtimes (e.g. XNNPACK)

Related tickets

Tests

tests/cross_fw/test_templates/test_calculate_quantizer_parameters.py is updated to check all possible combinations of existing qconfigs with narrow_range True/False
tests/common/quantization/test_quantizer_propagation_solver.py is updated to check the requantization rule with narrow_range and the rule for merging configurations with different narrow_range parameters
tests/common/quantization/test_quantizer_propagation_graph.py is updated to check that subsequent quantizers with different narrow_range values do not merge
Reference quantizer scales are updated to reflect the fix of embedding weights quantization
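
The "do not merge" rule checked by the propagation tests can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyQConfig` and `can_merge` are hypothetical names invented for illustration, and NNCF's propagation solver applies considerably richer rules than a plain equality check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyQConfig:
    """Hypothetical stand-in for a quantizer configuration (not an NNCF class)."""

    num_bits: int = 8
    mode: str = "symmetric"
    per_channel: bool = False
    narrow_range: bool = False


def can_merge(first: ToyQConfig, second: ToyQConfig) -> bool:
    """Toy rule: two subsequent quantizers merge only if every parameter matches."""
    if first.narrow_range != second.narrow_range:
        # Different narrow_range means different integer grids, so keep both quantizers.
        return False
    return first == second


full_range = ToyQConfig(narrow_range=False)
narrow = ToyQConfig(narrow_range=True)
print(can_merge(full_range, full_range))
print(can_merge(full_range, narrow))
```

In this toy model the first call reports that identical configurations may merge, while the second reports that configurations differing only in narrow_range may not.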

post_training_quantization/586/ - Passed
job/weekly/job/ubuntu20_eval/245/ (+ job/ubuntu20_eval/246/) Passed
eval_tf/461/ - Passed
torch_weekly/100/ - Passed
job/nightly/job/torch_nightly/444/ - Passed

alexsu52 · 2025-01-28T12:33:35Z

nncf/common/hardware/configs/cpu.json

@@ -288,7 +293,7 @@
        {
            "type": "Embedding",
            "quantization": {
-                "weights": ["q8_w_sym", "q8_w_asym"]
+                "weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_sym", "q8_a_ch"]


I'm not sure that LPT supports q8_a_ch for embedding. Please double-check.

As far as I understand, this is applicable only to an embedding -> depthwise convolution sub-graph. However, I have not met such a sub-graph in the real world.

Suggested change

- "weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_sym", "q8_a_ch"]
+ "weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_ch"]

I'm not sure that even a per-tensor quantization scheme is applicable here for the Embedding layer since it contains sensitive weights. We need to verify that this scheme does not introduce performance/accuracy regressions on the known cases.

versa model validation after the rebase:

| versa INT8 | roc_auc_score | Embedding weights config | narrow range | Throughput, FPS | Gather execType | Convs execType |
|---|---|---|---|---|---|---|
| develop | 91.54 | 'B:8 M:S SGN:S PC:N' | TRUE | 642.18 | jit_avx512_i8 | brgconv_avx512_i8 |
| daniil-lyakhov:dl/narrow_range_to_qconfig | 91.54 | 'B:8 M:S SGN:S PC:N NR:N' | FALSE | 641.19 | jit_avx512_i8 | brgconv_avx512_i8 |

alexsu52 · 2025-01-28T12:34:39Z

nncf/common/hardware/configs/template.json

+                /*
+                 * Narrow range: should NNCF use 2**num_bits quants or 2**num_bits - 1
+                 */
+                "narrow_range": false
            },
            "q8_sym_tnr_-128_127": { // Alias name for set of hyperparameters
                "bits": 8, // Number of quantization bits
                "mode": "symmetric", // Quantization mode
                "granularity": "pertensor", // Granularity: one scale for output tensor
                "level_low": -128, // Low quantization level


Does NNCF support level_low and level_high?

Isn't it redundant, since other params define how to calculate them?

https://github.com/openvinotoolkit/nncf/blob/develop/nncf/common/hardware/config.py#L160
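
To illustrate the redundancy question, here is a minimal sketch of how the integer levels could be derived from the bit width, signedness, and narrow_range alone. The function name `level_range` is hypothetical and is not NNCF's API; the sketch assumes the common convention that narrow range drops the lowest level of the grid.

```python
from typing import Tuple


def level_range(num_bits: int, signed: bool, narrow_range: bool) -> Tuple[int, int, int]:
    """Return (level_low, level_high, levels) for an integer quantization grid."""
    levels = 2**num_bits
    if signed:
        level_low, level_high = -(levels // 2), levels // 2 - 1
    else:
        level_low, level_high = 0, levels - 1
    if narrow_range:
        # Assumed convention: narrow range removes the lowest level.
        level_low += 1
        levels -= 1
    return level_low, level_high, levels


print(level_range(8, signed=True, narrow_range=False))  # (-128, 127, 256)
print(level_range(8, signed=True, narrow_range=True))   # (-127, 127, 255)
```

Under this convention the `-128` / `127` pair in the `q8_sym_tnr_-128_127` alias follows from `bits: 8`, signed symmetric mode, and `narrow_range: false`.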

nncf/common/hardware/configs/npu.json

nncf/common/quantization/quantizer_setup.py

nncf/config/schemata/defaults.py

nncf/common/quantization/quantizer_propagation/graph.py

daniil-lyakhov · 2025-01-31T14:56:35Z

tests/openvino/native/data/2024.1/reference_scales/IntegerModel_performance.json

-        "input_low": -0.9350724220275879,
+        "input_low": -0.9424352049827576,
        "input_high": 0.9350724220275879,
-        "output_low": -0.9350724220275879,
+        "output_low": -0.9424352049827576,
        "output_high": 0.9350724220275879
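
The new lower bound in this reference is consistent with the weight quantizer moving from a 255-level to a 256-level symmetric grid while keeping `input_high` fixed. The sketch below reproduces that arithmetic; `symmetric_input_low` is a hypothetical helper, and the relation `input_low = input_high * level_low / level_high` is an assumption about how the bounds relate, not a quote of the NNCF implementation.

```python
def symmetric_input_low(input_high: float, num_bits: int, narrow_range: bool) -> float:
    """Lower bound of a signed symmetric range whose upper bound maps to level_high."""
    level_high = 2 ** (num_bits - 1) - 1
    # Narrow range uses [-level_high, level_high]; full range adds one more negative level.
    level_low = -level_high if narrow_range else -(level_high + 1)
    return input_high * level_low / level_high


input_high = 0.9350724220275879
print(symmetric_input_low(input_high, 8, narrow_range=True))   # about -0.93507 (old reference)
print(symmetric_input_low(input_high, 8, narrow_range=False))  # about -0.94244 (new reference)
```

The ratio between the two lower bounds is 128/127, which matches the change in the diff up to float32 rounding.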


github-actions bot added the NNCF PT, NNCF Common, and NNCF PTQ labels Jan 17, 2025

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch 4 times, most recently from 07a6520 to a9a0c19 January 17, 2025 18:53

github-actions bot added the NNCF ONNX label Jan 20, 2025

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from 7134fcb to c8b1761 January 20, 2025 17:15

github-actions bot added the NNCF OpenVINO label Jan 21, 2025

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch 4 times, most recently from 0b064a7 to 0733ac3 January 21, 2025 15:56

daniil-lyakhov requested review from AlexanderDokuchaev and KodiaqQ January 22, 2025 12:26

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from 0733ac3 to 10267b0 January 22, 2025 14:50

daniil-lyakhov requested review from ljaljushkin and alexsu52 January 23, 2025 16:31

daniil-lyakhov marked this pull request as ready for review January 27, 2025 16:06

daniil-lyakhov requested a review from a team as a code owner January 27, 2025 16:06

alexsu52 assigned KodiaqQ Jan 28, 2025

alexsu52 reviewed Jan 28, 2025

KodiaqQ reviewed Jan 28, 2025

nncf/common/hardware/configs/npu.json (outdated, resolved)

KodiaqQ reviewed Jan 28, 2025

nncf/common/quantization/quantizer_setup.py (resolved)

KodiaqQ reviewed Jan 28, 2025

nncf/config/schemata/defaults.py (resolved)

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from a2e42d5 to 7257b2e January 30, 2025 13:13

daniil-lyakhov commented Jan 30, 2025

nncf/common/quantization/quantizer_propagation/graph.py (outdated, resolved)

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch 2 times, most recently from 4b0dc46 to 4d3a892 January 31, 2025 13:38

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch 2 times, most recently from 987c51d to ea5d0fd January 31, 2025 14:52

daniil-lyakhov commented Jan 31, 2025

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from ea5d0fd to a85f7b9 January 31, 2025 15:14

daniil-lyakhov added 4 commits January 31, 2025 16:19

[HWConfig] narrow_range parameter is introduced in hardware config

e437d12

Embedding qconfig list is extended for CPU devices / tests fixes

eecc2cf

References update

ba113c9

More references

a85f7b9

daniil-lyakhov force-pushed the dl/narrow_range_to_qconfig branch from e7a1453 to cfedaf7 January 31, 2025 15:24

Default value is reused across nncf

cfedaf7

daniil-lyakhov requested review from KodiaqQ and alexsu52 January 31, 2025 15:42

mark EbmerddingBag with q8_w_sym_any_nr

403fb59
