From 9200a22fad8c79f68919f40a21c54179d7354504 Mon Sep 17 00:00:00 2001
From: Nikita Malinin
Date: Thu, 13 Jun 2024 09:05:53 +0200
Subject: [PATCH] Release notes v2.11 (#2710)

### Changes

Added v2.10.0 template;

### Reason for changes

Upcoming release;

### Related tickets

142565;

#### For the contributors:

Please add your changes (as a commit to the branch) to the list according to the template and the previous notes;
Do not add tests-related notes;
Provide the list of the PRs (for all your notes) in the comment for the discussion;

---------

Co-authored-by: Nikita Savelyev
Co-authored-by: Liubov Talamanova
Co-authored-by: Alexander Dokuchaev
Co-authored-by: Daniil Lyakhov
Co-authored-by: andreyanufr
Co-authored-by: Aleksei Kashapov
Co-authored-by: Alexander Suslov
Co-authored-by: Lyalyushkin Nikolay
---
 ReleaseNotes.md | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/ReleaseNotes.md b/ReleaseNotes.md
index 1255d71eece..bc6108d349c 100644
--- a/ReleaseNotes.md
+++ b/ReleaseNotes.md
@@ -1,5 +1,54 @@
 # Release Notes
 
+## New in Release 2.11.0
+
+Post-training Quantization:
+
+- Features:
+  - (OpenVINO) Added the Scale Estimation algorithm for 4-bit data-aware weights compression. The optional `scale_estimation` parameter was introduced to `nncf.compress_weights()` and can be used to minimize accuracy degradation of compressed models (note that this algorithm increases the compression time); see the usage sketch after this section.
+  - (OpenVINO) Added the GPTQ algorithm for 8/4-bit data-aware weights compression, supporting INT8, INT4, and NF4 data types. The optional `gptq` parameter was introduced to `nncf.compress_weights()` to enable the [GPTQ](https://arxiv.org/abs/2210.17323) algorithm.
+  - (OpenVINO) Added support for models with BF16 weights in the weights compression method, `nncf.compress_weights()`.
+  - (PyTorch) Added support for quantization and weight compression of custom modules.
+- Fixes:
+  - (OpenVINO) Fixed incorrect determination of nodes with bias in the Fast-/BiasCorrection and ChannelAlignment algorithms.
+  - (OpenVINO, PyTorch) Fixed incorrect behaviour of `nncf.compress_weights()` when an already compressed model is passed as input.
+  - (OpenVINO, PyTorch) Fixed the SmoothQuant algorithm to handle Split ports correctly.
+- Improvements:
+  - (OpenVINO) Aligned the compression subgraphs produced by `nncf.compress_weights()` across different floating-point precisions.
+  - Aligned the 8-bit quantization scheme for the NPU target device with the CPU one.
+- Examples:
+  - (OpenVINO, ONNX) Updated the ignored scope for the YOLOv8 examples to use a subgraph approach.
+- Tutorials:
+  - [Post-Training Optimization of Stable Video Diffusion Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/stable-video-diffusion/stable-video-diffusion.ipynb)
+  - [Post-Training Optimization of YOLOv10 Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/yolov10-optimization/yolov10-optimization.ipynb)
+  - [Post-Training Optimization of LLaVA Next Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/nano-llava-multimodal-chatbot/nano-llava-multimodal-chatbot.ipynb)
+  - [Post-Training Optimization of S3D MIL-NCE Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/s3d-mil-nce-text-to-video-retrieval/s3d-mil-nce-text-to-video-retrieval.ipynb)
+  - [Post-Training Optimization of Stable Cascade Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/stable-cascade-image-generation/stable-cascade-image-generation.ipynb)
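+
+A minimal, illustrative sketch of the new 4-bit data-aware compression options (the model file and calibration objects below are placeholders, not part of the release):
+
+```python
+import nncf
+import openvino as ov
+
+# Placeholders: substitute a real model and calibration samples.
+model = ov.Core().read_model("model.xml")
+calibration_samples = [...]  # an iterable of calibration data items
+calibration_dataset = nncf.Dataset(calibration_samples, lambda sample: sample)
+
+compressed_model = nncf.compress_weights(
+    model,
+    mode=nncf.CompressWeightsMode.INT4_SYM,  # 4-bit weight compression
+    dataset=calibration_dataset,             # required for data-aware algorithms
+    scale_estimation=True,  # new optional parameter: Scale Estimation
+    # gptq=True,            # alternatively, enable the GPTQ algorithm
+)
+```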
+
+Compression-aware training:
+
+- Features:
+  - (PyTorch) The `nncf.quantize` method is now the recommended path for quantization initialization in Quantization-Aware Training.
+  - (PyTorch) The placement of compression modules in the model can now be serialized and restored with the new API functions: `compressed_model.nncf.get_config()` and `nncf.torch.load_from_config`. The [documentation](/docs/usage/training_time_compression/quantization_aware_training/Usage.md#saving-and-loading-compressed-models) for saving/loading quantized models is available, and the ResNet-18 [example](examples/quantization_aware_training/torch/resnet18) was updated to use the new API.
+- Fixes:
+  - (PyTorch) Fixed compatibility with `torch.compile`.
+- Improvements:
+  - (PyTorch) Extended the base parameters of the EvolutionOptimizer (part of the LeGR algorithm).
+  - (PyTorch) Improved wrapping of parameters that are not tensors.
+- Examples:
+  - (PyTorch) Added [an example](examples/quantization_aware_training/torch/anomalib) for the STFPM model from Anomalib.
+- Tutorials:
+  - [Quantization-Sparsity Aware Training of PyTorch ResNet-50 Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/pytorch-quantization-sparsity-aware-training/pytorch-quantization-sparsity-aware-training.ipynb)
+
+Deprecations/Removals:
+
+- Removed the extra dependencies for installing backends from setup.py (such as `[torch]`, `[tf]`, `[onnx]`, and `[openvino]`).
+- Removed the `openvino-dev` dependency.
+
+Requirements:
+
+- Updated PyTorch (2.2.1) and Torchvision (0.18.0) versions.
+
 ## New in Release 2.10.0
 
 Post-training Quantization: