Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release notes v2.11 #2710

Merged
Merged
Changes from 18 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
49 changes: 49 additions & 0 deletions ReleaseNotes.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,54 @@
# Release Notes

## New in Release 2.11.0

Post-training Quantization:

- Features:
- (OpenVINO) Added Scale Estimation algorithm for 4-bit data-aware weights compression. The optional `scale_estimation` parameter was introduced to `nncf.compress_weights()` and can be used to minimize accuracy degradation of compressed models (note that this algorithm increases the compression time).
- (OpenVINO) Added GPTQ algorithm for 8/4-bit data-aware weights compression, supporting INT8, INT4, and NF4 data types. The optional `gptq` parameter was introduced to `nncf.compress_weights()` to enable the [GPTQ](https://arxiv.org/abs/2210.17323) algorithm.
- (OpenVINO) Added support for models with BF16 weights in the weights compression method, `nncf.compress_weights()`.
- (PyTorch) Added support of the custom modules with traced parameters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- (PyTorch) Added support of the custom modules with traced parameters.
- (PyTorch) Added support for quantization and weight compression of the custom modules.

- Fixes:
- (OpenVINO) Fixed incorrect node with bias determination in Fast-/BiasCorrection and ChannelAlighnment algorithms.
- (OpenVINO, PyTorch) Fixed incorrect behaviour of `nncf.compress_weights()` in case of compressed model as input.
- (OpenVINO, PyTorch) Fixed SmoothQuant algorithm to work with Split ports correctly.
- Improvements:
- (OpenVINO) Aligned resulting compression subgraphs for the `nncf.compress_weights()` in different FP precisions.
- Aligned 8-bit scheme for NPU target device with the CPU.
- Examples:
- (OpenVINO, ONNX) Updated ignored scope for YOLOv8 examples utilizing a subgraphs approach.
- Tutorials:
- [Post-Training Optimization of Stable Video Diffusion Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/stable-video-diffusion/stable-video-diffusion.ipynb)
- [Post-Training Optimization of YOLOv10 Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/yolov10-optimization/yolov10-optimization.ipynb)
- [Post-Training Optimization of LLaVA Next Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/nano-llava-multimodal-chatbot/nano-llava-multimodal-chatbot.ipynb)
- [Post-Training Optimization of S3D MIL-NCE Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/s3d-mil-nce-text-to-video-retrieval/s3d-mil-nce-text-to-video-retrieval.ipynb)
- [Post-Training Optimization of Stable Cascade Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/stable-cascade-image-generation/stable-cascade-image-generation.ipynb)

Compression-aware training:

- Features:
- (PyTorch) `nncf.quantize` method is now the recommended path for the quantization initialization for Quantization-Aware Training.
- (PyTorch) Compression modules placement in the model now can be serialized and restored with new API functions: `compressed_model.nncf.get_config()` and `nncf.torch.load_from_config`. The [documentation](/docs/usage/training_time_compression/quantization_aware_training/Usage.md#saving-and-loading-compressed-models) for the saving/loading of a quantized model is available, and Resnet18 [example](examples/quantization_aware_training/torch/resnet18) was updated to use the new API.
- Fixes:
- (PyTorch) Fixed compatibility with `torch.compile`.
- Improvements:
- (PyTorch) Base parameters were extended for the EvolutionOptimizer (LeGR algorithm part).
- (PyTorch) Improved wrapping for parameters which are not tensors.
- Examples:
- (PyTorch) Added [an example](examples/quantization_aware_training/torch/anomalib) for STFPM model from Anomalib.
- Tutorials:
- [Quantization-Sparsity Aware Training of PyTorch ResNet-50 Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/pytorch-quantization-sparsity-aware-training/pytorch-quantization-sparsity-aware-training.ipynb)

Deprecations/Removals:

- Removed extra dependencies to install backends from setup.py (like `[torch]` are `[tf]`, `[onnx]` and `[openvino]`).
- Removed `openvino-dev` dependency.

Requirements:

- Updated PyTorch (2.2.1) and Torchvision (0.18.0) versions.

## New in Release 2.10.0

Post-training Quantization:
Expand Down
Loading