[Release_v2150] Update ReleaseNotes.md (#3214)
### Changes

- Added v2.15.0 template;

### Reason for changes

- Upcoming release;

### Related tickets

- 161230;

#### For contributors:
Please add your changes (as commits to the branch) to the list
according to the template and the previous notes.
Do not add test-related notes.
Provide the list of PRs for all your notes in a comment for
discussion.

---------

Co-authored-by: Liubov Talamanova <[email protected]>
Co-authored-by: Nikita Savelyev <[email protected]>
Co-authored-by: Alexander Dokuchaev <[email protected]>
Co-authored-by: Daniil Lyakhov <[email protected]>
Co-authored-by: Andrey Churkin <[email protected]>
6 people authored Feb 3, 2025
1 parent 7fa107b commit 80bd756
Showing 1 changed file with 38 additions and 0 deletions.
# Release Notes

## New in Release 2.15.0

Post-training Quantization:

- Features:
- (TensorFlow) The `nncf.quantize()` method is now the recommended API for Quantization-Aware Training. Please refer to the [example](examples/quantization_aware_training/tensorflow/mobilenet_v2) for details on how to use the new approach.
- (TensorFlow) The placement of compression layers in the model can now be serialized and restored with the new API functions: `nncf.tensorflow.get_config()` and `nncf.tensorflow.load_from_config()`. Please see the [documentation](docs/usage/training_time_compression/quantization_aware_training/Usage.md#saving-and-loading-compressed-models) on saving and loading quantized models for more details.
- (OpenVINO) Added [example](examples/llm_compression/openvino/smollm2_360m_fp8) with LLM quantization to FP8 precision.
- (TorchFX, Experimental) Preview support for the new `quantize_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer` and `X86InductorQuantizer` quantizers. The `quantize_pt2e` API utilizes statistic collectors from the MinMax algorithm, as well as the SmoothQuant, BiasCorrection and FastBiasCorrection post-training quantization algorithms.
- Added unification of scales for the `ScaledDotProductAttention` operation.
- Fixes:
- (ONNX) Fixed sporadic accuracy issues with the BiasCorrection algorithm.
- (ONNX) Fixed GroupConvolution operation weight quantization, which also improves performance for a number of models.
- Fixed the AccuracyAwareQuantization algorithm to resolve issue [#3118](https://github.com/openvinotoolkit/nncf/issues/3118).
- Fixed an issue with using NNCF when a backend framework installation is potentially corrupted.
- Improvements:
- (TorchFX, Experimental) Added YoloV11 support.
- (OpenVINO) The performance of the FastBiasCorrection algorithm was improved.
- Significantly faster data-free weight compression for OpenVINO models: INT4 compression is now up to 10x faster, while INT8 compression is up to 3x faster. The larger the model, the greater the speedup.
- AWQ weight compression is now up to 2x faster, improving overall runtime efficiency.
- Peak memory usage during INT4 data-free weight compression in the OpenVINO backend is reduced by up to 50% for certain models.
- Deprecations/Removals:
- (TensorFlow) The `nncf.tensorflow.create_compressed_model()` method is now deprecated. Please use the `nncf.quantize()` method to initialize quantization instead.
- Tutorials:
- [Post-Training Optimization of GLM-Edge-V Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/glm-edge-v/glm-edge-v.ipynb)
- [Post-Training Optimization of OmniGen Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/omnigen/omnigen.ipynb)
- [Post-Training Optimization of Sana Models](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sana-image-generation/sana-image-generation.ipynb)
- [Post-Training Optimization of BGE Models](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-rag-langchain/llm-rag-langchain-genai.ipynb)
- [Post-Training Optimization of Stable Diffusion Inpainting Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/inpainting-genai/inpainting-genai.ipynb)
- [Post-Training Optimization of LTX Video Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/ltx-video/ltx-video.ipynb)
- [Post-Training Optimization of DeepSeek-R1-Distill Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/llm-chatbot-generate-api.ipynb)
- [Post-Training Optimization of Janus DeepSeek-LLM-1.3b Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/janus-multimodal-generation/janus-multimodal-generation.ipynb)
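Several of the algorithms named above (the MinMax statistic collectors, bias correction, and friends) build on one core idea: deriving an integer grid from observed value ranges. As a rough illustration only, in plain Python (this is NOT the NNCF API; all names here are hypothetical), symmetric per-tensor INT8 min-max quantization can be sketched as:

```python
# Illustrative sketch only (NOT the NNCF API): symmetric per-tensor
# INT8 quantization driven by observed min/max statistics.

def minmax_scale(values, num_bits=8):
    """Derive a symmetric scale from the observed value range."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    abs_max = max(abs(min(values)), abs(max(values)))
    return abs_max / qmax if abs_max else 1.0

def quantize(values, scale, num_bits=8):
    """Round to the integer grid defined by `scale` and clamp to range."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return [min(max(round(v / scale), qmin), qmax) for v in values]

def dequantize(qvalues, scale):
    """Map integer codes back to approximate floating-point values."""
    return [q * scale for q in qvalues]

activations = [-1.5, -0.25, 0.0, 0.75, 1.5]
scale = minmax_scale(activations)      # one scale for the whole tensor
codes = quantize(activations, scale)   # integer codes in [-128, 127]
restored = dequantize(codes, scale)    # close to the original activations
```

Production algorithms refine this basic scheme with per-channel scales, outlier-aware statistics, and bias correction, which is where the accuracy of the methods listed above comes from.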

Requirements:

- Updated the minimum required version of `numpy` to `>=1.24.0`.
- Removed `tqdm` dependency.
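The data-free INT4 weight compression whose speedup is noted in the improvements above amounts, at its core, to group-wise symmetric quantization: each group of weights gets its own scale, with no calibration data required. A minimal sketch of that idea in plain Python (NOT the NNCF implementation; all names here are hypothetical):

```python
# Illustrative sketch only (NOT the NNCF implementation): data-free
# group-wise symmetric INT4 weight compression. Each group of weights
# gets its own scale; codes are clamped to the INT4 range [-8, 7].

def compress_int4_groupwise(weights, group_size=4):
    """Quantize a flat weight list group by group, data-free."""
    qmax = 7
    compressed = []  # one (scale, [int4 codes]) pair per group
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        abs_max = max(abs(w) for w in group)
        scale = abs_max / qmax if abs_max else 1.0
        codes = [min(max(round(w / scale), -8), 7) for w in group]
        compressed.append((scale, codes))
    return compressed

def decompress(compressed):
    """Reconstruct approximate weights from (scale, codes) pairs."""
    out = []
    for scale, codes in compressed:
        out.extend(c * scale for c in codes)
    return out

weights = [0.7, -0.7, 0.1, 0.0, 2.0, 0.9, -2.0, 0.5]
packed = compress_int4_groupwise(weights)  # two groups, two scales
approx = decompress(packed)                # roughly the original weights
```

Smaller groups mean more scales (more overhead) but lower reconstruction error; the per-group scale is what lets INT4 track locally large weights without calibration data.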

## New in Release 2.14.1

Post-training Quantization:
