diff --git a/ReleaseNotes.md b/ReleaseNotes.md
index c84ba2df157..5807de5a794 100644
--- a/ReleaseNotes.md
+++ b/ReleaseNotes.md
@@ -1,5 +1,65 @@
 # Release Notes
 
+## New in Release 2.15.0
+
+Post-training Quantization:
+
+- Breaking changes:
+  - ...
+- General:
+  - ...
+- Features:
+  - (TorchFX, Experimental) Preview support for the new `quantize_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer` and the `X86InductorQuantizer` quantizers. The `quantize_pt2e` API uses `MinMax` algorithm statistic collectors, as well as the `SmoothQuant`, `BiasCorrection` and `FastBiasCorrection` Post-Training Quantization algorithms.
+  - (TensorFlow) The `nncf.quantize()` method is now the recommended way to initialize quantization for Quantization-Aware Training. Please refer to the [example](examples/quantization_aware_training/tensorflow/mobilenet_v2) for more details on how to use the new approach.
+  - (TensorFlow) The placement of compression layers in the model can now be serialized and restored with new API functions: `nncf.tensorflow.get_config()` and `nncf.tensorflow.load_from_config()`. Please see the [documentation](/docs/usage/training_time_compression/quantization_aware_training/Usage.md#saving-and-loading-compressed-models) for more details on saving and loading a quantized model.
+- Fixes:
+  - ...
+- Improvements:
+  - Significantly faster data-free weight compression for OpenVINO models: INT4 compression is now up to 10x faster, while INT8 compression is up to 3x faster. The larger the model, the greater the time reduction.
+  - AWQ weight compression is now up to 2x faster, improving overall runtime efficiency.
+  - Peak memory usage during INT4 data-free weight compression in the OpenVINO backend is reduced by up to 50% for certain models.
+- Deprecations/Removals:
+  - (TensorFlow) The `nncf.tensorflow.create_compressed_model()` method is now marked as deprecated. Please use the `nncf.quantize()` method for the quantization initialization.
+- Tutorials:
+  - [Post-Training Optimization of GLM-Edge-V Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/glm-edge-v/glm-edge-v.ipynb)
+  - [Post-Training Optimization of OmniGen Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/omnigen/omnigen.ipynb)
+  - [Post-Training Optimization of Sana Models](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sana-image-generation/sana-image-generation.ipynb)
+  - [Post-Training Optimization of BGE Models](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-rag-langchain/llm-rag-langchain-genai.ipynb)
+  - [Post-Training Optimization of Stable Diffusion Inpainting Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/inpainting-genai/inpainting-genai.ipynb)
+  - [Post-Training Optimization of LTX Video Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/ltx-video/ltx-video.ipynb)
+  - [Post-Training Optimization of DeepSeek-R1-Distill Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/llm-chatbot-generate-api.ipynb)
+  - [Post-Training Optimization of Janus DeepSeek-LLM-1.3b Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/janus-multimodal-generation/janus-multimodal-generation.ipynb)
+- Known issues:
+  - ...
+
+Compression-aware training:
+
+- Breaking changes:
+  - ...
+- General:
+  - ...
+- Features:
+  - ...
+- Fixes:
+  - ...
+- Improvements:
+  - ...
+- Deprecations/Removals:
+  - ...
+- Tutorials:
+  - ...
+- Known issues:
+  - ...
+
+Deprecations/Removals:
+
+- ...
+
+Requirements:
+
+- Updated the minimum version of `numpy` (>=1.24.0).
+- Removed `tqdm`.
+
 ## New in Release 2.14.1
 
 Post-training Quantization: