[PTQ] Add support of arbitrary batch size for PTQ (#2197)
### Changes

Add a new advanced boolean option for quantization, `batchwise_statistics`:

- `True`: statistics for the supported algorithms (see below) are collected on the assumption that axis 0 of a tensor is the batch axis.
- `False`: statistics are collected on the assumption that the tensor has no batch axis.
- `None`: the statistics collection logic adapts to the batch size of the provided dataset.

These adjustments to statistics computation apply specifically to the MinMax and ChannelAlignment algorithms.

Validating the changes on a wide range of models revealed a limitation: if a model contains operations whose output no longer carries batch meaning on the batch axis, the statistics collected after such operations are imprecise. Such cases are now detected and reported with a warning that recommends either using batch size 1 for that model or setting `batchwise_statistics` to `False`.

The Torch sample for mobilenet_v2 was updated to use `batch_size=128` with a recalculated `subset_size`. The conformance test was updated with new options `batch_size` and `dynamic_batch_shape`. Calibrate.py was updated with a new option `batch_size`.

Algorithm support for batch_size > 1:

Algorithm | Do results depend on batch_size? | Comments
-- | -- | --
MinMax | Relatively | "Relatively" means the results depend on whether the assumption that the batch lies on axis 0 is correct. Overcoming this requires a batch axis determination algorithm.
FastBiasCorrection | Yes | Statistics are calculated incorrectly because the aggregator does not take the batch axis into account. Requires a batch axis determination algorithm.
BiasCorrection | Yes | Statistics are calculated incorrectly because the aggregator does not take the batch axis into account. Requires a batch axis determination algorithm.
ChannelAlignment | No | Checked on models from the conformance test: **mobilenet_v2, mobilenet_v3**
SmoothQuant | No | Checked on models from the conformance test: **levit_128, visformer_small**
PostTrainingQuantization | Yes | Requires a batch axis determination algorithm.

### Reason for changes

Speed up statistics collection. Speedup on the mobilenet_v2 sample (local measurements):

Backend | bs=1 (sec) | bs=16 (sec) | bs=128 (sec)
-- | -- | -- | --
Torch | 24 | 4 | 4
Torch CUDA | 20 | 1 | 1
OpenVINO | 9 | 4 | 5
ONNX | 17 | 11 | 12

Extend usage scenarios.

### Related tickets

121650

### Tests

Old tests were updated accordingly. New tests added:

- test_tensor_collector_batch_size
- test_min_max
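
To illustrate why the `batchwise_statistics` option described under Changes matters, here is a minimal pure-Python sketch. It is not the NNCF implementation: the helper names (`per_sample_min`, `whole_tensor_min`, `collect_mean_min`, `make_batches`) and the choice of a "mean of minima" aggregation are assumptions made only for this example. It shows that treating axis 0 as the batch axis reproduces the batch-size-1 result, while ignoring the batch axis changes the collected statistic.

```python
from statistics import fmean


def per_sample_min(batch):
    """Minimum of each sample, treating axis 0 as the batch axis."""
    return [min(sample) for sample in batch]


def whole_tensor_min(batch):
    """Single minimum over the whole tensor, ignoring any batch axis."""
    return [min(min(sample) for sample in batch)]


def collect_mean_min(batches, batchwise_statistics):
    """Average the collected minima over all calibration batches.

    Hypothetical stand-in for a statistics collector; the flag name mirrors
    the option from the commit message, the logic is illustrative only.
    """
    reducer = per_sample_min if batchwise_statistics else whole_tensor_min
    collected = []
    for batch in batches:
        collected.extend(reducer(batch))
    return fmean(collected)


def make_batches(samples, batch_size):
    """Split a list of samples into consecutive batches."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


# Four toy "activation" samples with two values each.
samples = [[-1.0, 2.0], [-3.0, 0.5], [-0.5, 4.0], [-2.0, 1.0]]

# Batch size 1 is the reference behaviour.
reference = collect_mean_min(make_batches(samples, 1), batchwise_statistics=True)
# Batch size 2 with axis 0 treated as the batch axis matches the reference.
batched_true = collect_mean_min(make_batches(samples, 2), batchwise_statistics=True)
# Batch size 2 with no batch axis assumed yields a different statistic.
batched_false = collect_mean_min(make_batches(samples, 2), batchwise_statistics=False)

print(reference, batched_true, batched_false)
```

In this toy setup the batchwise result is independent of the batch size only because axis 0 really is the batch axis. That is the assumption the table above flags for MinMax, and it is the reason the warning recommends batch size 1 or `batchwise_statistics=False` when a model breaks it.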
Showing 72 changed files with 1,395 additions and 724 deletions.