
[DOCS] content updates
sgolebiewski-intel authored and kblaszczak-intel committed Jan 23, 2025
1 parent 96cba85 commit 1a38177
Showing 19 changed files with 174 additions and 154 deletions.
2 changes: 1 addition & 1 deletion docs/articles_en/about-openvino/key-features.rst
@@ -14,7 +14,7 @@ Easy Integration
OpenVINO optimizations to your PyTorch models directly with a single line of code.
| :doc:`GenAI Out Of The Box <../openvino-workflow-generative/inference-with-genai>`
| With the genAI flavor of OpenVINO, you can run generative AI with just a couple lines of code.
| With OpenVINO GenAI, you can run generative models with just a few lines of code.
Check out the GenAI guide for instructions on how to do it.
| `Python / C++ / C / NodeJS APIs <https://docs.openvino.ai/2024/api/api_reference.html>`__
99 changes: 29 additions & 70 deletions docs/articles_en/about-openvino/release-notes-openvino.rst
@@ -26,10 +26,9 @@ OpenVINO Release Notes
What's new
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

* OpenVINO 2024.6 release includes updates for enhanced stability and improved LLM performance.
* Introduced support for Intel® Arc™ B-Series Graphics (formerly known as Battlemage).
* Implemented optimizations to improve the inference time and LLM performance on NPUs.
* Improved LLM performance with GenAI API optimizations and bug fixes.
* .





@@ -47,11 +46,8 @@ CPU Device Plugin
GPU Device Plugin
-----------------------------------------------------------------------------------------------

* Device memory copy optimizations have been introduced for inference with **Intel® Arc™ B-Series
  Graphics** (formerly known as Battlemage). Since this hardware does not utilize the L2 cache for
  copying memory between the device and host, a dedicated ``copy`` operation is used if inputs or
  results are not expected in the device memory.
* ChatGLM4 inference on GPU has been optimized.
* .


NPU Device Plugin
-----------------------------------------------------------------------------------------------
@@ -76,10 +72,6 @@ Other Changes and Known Issues
Jupyter Notebooks
-----------------------------

* `Visual-language assistant with GLM-Edge-V and OpenVINO <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/glm-edge-v/glm-edge-v.ipynb>`__
* `Local AI and OpenVINO <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/localai/localai.ipynb>`__
* `Multimodal understanding and generation with Janus and OpenVINO <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/janus-multimodal-generation/janus-multimodal-generation.ipynb>`__




@@ -131,39 +123,19 @@ Discontinued in 2024

* Runtime components:

* Intel® Gaussian & Neural Accelerator (Intel® GNA). Consider using the Neural Processing
Unit (NPU) for low-powered systems like Intel® Core™ Ultra or 14th generation and beyond.
* OpenVINO C++/C/Python 1.0 APIs (see
`2023.3 API transition guide <https://docs.openvino.ai/2023.3/openvino_2_0_transition_guide.html>`__
for reference).
* All ONNX Frontend legacy API (known as ONNX_IMPORTER_API).
* ``PerfomanceMode.UNDEFINED`` property as part of the OpenVINO Python API.
* The OpenVINO Affinity API property is no longer available. It has been replaced with CPU
  binding configurations (``ov::hint::enable_cpu_pinning``); a Python sketch is shown after this list.

* Tools:

* Deployment Manager. See :doc:`installation <../get-started/install-openvino>` and
:doc:`deployment <../get-started/install-openvino>` guides for current distribution
options.
* `Accuracy Checker <https://github.com/openvinotoolkit/open_model_zoo/blob/master/tools/accuracy_checker/README.md>`__.
* `Post-Training Optimization Tool <https://docs.openvino.ai/2023.3/pot_introduction.html>`__
(POT). Neural Network Compression Framework (NNCF) should be used instead.
* A `Git patch <https://github.com/openvinotoolkit/nncf/tree/release_v281/third_party_integration/huggingface_transformers>`__
for NNCF integration with `huggingface/transformers <https://github.com/huggingface/transformers>`__.
The recommended approach is to use `huggingface/optimum-intel <https://github.com/huggingface/optimum-intel>`__
for applying NNCF optimization on top of models from Hugging Face.
* Support for Apache MXNet, Caffe, and Kaldi model formats. Conversion to ONNX may be used
as a solution.
* The macOS x86_64 debug bins are no longer provided with the OpenVINO toolkit, starting
with OpenVINO 2024.5.
* Python 3.8 is no longer supported, starting with OpenVINO 2024.5.

* As MXNet doesn't support Python versions higher than 3.8, according to the
  `MXNet PyPI project <https://pypi.org/project/mxnet/>`__,
  it is no longer supported by OpenVINO, either.

* Discrete Keem Bay is no longer supported, starting with OpenVINO 2024.5.
* Support for discrete devices (formerly codenamed Raptor Lake) is no longer available for
NPU.
* The OpenVINO™ Development Tools package (pip install openvino-dev) is no longer available
for OpenVINO releases in 2025.
* Model Optimizer is no longer available. Consider using the
:doc:`new conversion methods <../openvino-workflow/model-preparation/convert-model-to-ir>`
instead. For more details, see the
`model conversion transition guide <https://docs.openvino.ai/2024/documentation/legacy-features/transition-legacy-conversion-api.html>`__.
* Intel® Streaming SIMD Extensions (Intel® SSE) are currently not enabled in the binary
package by default. They are still supported in the source code form.
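
The CPU pinning hint that replaces the removed Affinity API is set as a compile-time property.
A minimal Python sketch, assuming the ``openvino.properties.hint`` module and a hypothetical
``model.xml`` path (example code, not part of this commit):

.. code-block:: python

   import openvino as ov
   import openvino.properties.hint as hints

   core = ov.Core()
   # Pin inference threads to CPU cores via the hint property,
   # the replacement for the removed Affinity API.
   compiled_model = core.compile_model("model.xml", "CPU", {hints.enable_cpu_pinning: True})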


Deprecated and to be removed in the future
@@ -175,26 +147,18 @@ Deprecated and to be removed in the future
standard support.
* The openvino-nightly PyPI module will soon be discontinued. End-users should proceed with the
Simple PyPI nightly repo instead. More information in
`Release Policy <https://docs.openvino.ai/2024/about-openvino/release-notes-openvino/release-policy.html#nightly-releases>`__.
* The OpenVINO™ Development Tools package (pip install openvino-dev) will be removed from
installation options and distribution channels beginning with OpenVINO 2025.0.
* Model Optimizer will be discontinued with OpenVINO 2025.0. Consider using the
:doc:`new conversion methods <../openvino-workflow/model-preparation/convert-model-to-ir>`
instead. For more details, see the
`model conversion transition guide <https://docs.openvino.ai/2024/documentation/legacy-features/transition-legacy-conversion-api.html>`__.
* OpenVINO property Affinity API will be discontinued with OpenVINO 2025.0.
It will be replaced with CPU binding configurations (``ov::hint::enable_cpu_pinning``).

`Release Policy <https://docs.openvino.ai/2025/about-openvino/release-notes-openvino/release-policy.html#nightly-releases>`__.
* “auto shape” and “auto batch size” (reshaping a model in runtime) will be removed in the
future. OpenVINO's dynamic shape models are recommended instead.
* macOS x86 is no longer recommended for use due to the discontinuation of validation.
  Full support will be removed later in 2025.
* The ``openvino`` namespace of the OpenVINO Python API has been redesigned, removing the nested
  ``openvino.runtime`` module. The old namespace is now considered deprecated and will be
  discontinued in 2026.0 (see the import example after this list).




* “auto shape” and “auto batch size” (reshaping a model in runtime) will be removed in the
future. OpenVINO's dynamic shape models are recommended instead.

* Starting with 2025.0, macOS x86 is no longer recommended for use due to the discontinuation
  of validation. Full support will be removed later in 2025.
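
As a short illustration of the Python namespace change mentioned above, a sketch of the
deprecated and the preferred import style (example code, not part of this commit):

.. code-block:: python

   # Deprecated import path, to be discontinued in 2026.0:
   # from openvino.runtime import Core

   # Preferred top-level namespace:
   import openvino as ov

   core = ov.Core()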




@@ -203,17 +167,13 @@ Legal Information
+++++++++++++++++++++++++++++++++++++++++++++

You may not use or facilitate the use of this document in connection with any infringement
or other legal analysis concerning Intel products described herein.

You agree to grant Intel a non-exclusive, royalty-free license to any patent claim
thereafter drafted which includes subject matter disclosed herein.
or other legal analysis concerning Intel products described herein. All information provided
here is subject to change without notice. Contact your Intel representative to obtain the
latest Intel product specifications and roadmaps.

No license (express or implied, by estoppel or otherwise) to any intellectual property
rights is granted by this document.

All information provided here is subject to change without notice. Contact your Intel
representative to obtain the latest Intel product specifications and roadmaps.

The products described may contain design defects or errors known as errata which may
cause the product to deviate from published specifications. Current characterized errata
are available on request.
@@ -225,10 +185,9 @@ or from the OEM or retailer.

No computer system can be absolutely secure.

Intel, Atom, Core, Xeon, OpenVINO, and the Intel logo are trademarks
of Intel Corporation in the U.S. and/or other countries.

Other names and brands may be claimed as the property of others.
Intel, Atom, Core, Xeon, OpenVINO, and the Intel logo are trademarks of Intel Corporation in
the U.S. and/or other countries. Other names and brands may be claimed as the property of
others.

Copyright © 2025, Intel Corporation. All rights reserved.

4 changes: 2 additions & 2 deletions docs/articles_en/documentation/openvino-ecosystem.rst
@@ -24,7 +24,7 @@ you an overview of a whole ecosystem of tools and solutions under the OpenVINO u

| **GenAI**
| :bdg-link-dark:`Github <https://github.com/openvinotoolkit/openvino.genai>`
:bdg-link-success:`User Guide <https://docs.openvino.ai/2024/openvino-workflow-generative/inference-with-genai.html>`
:bdg-link-success:`User Guide <https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai.html>`
OpenVINO™ GenAI Library aims to simplify running inference of generative AI
models. Check the LLM-powered Chatbot Jupyter notebook to see how GenAI works.
@@ -113,7 +113,7 @@ generative AI and vision models directly on your computer or edge device using O

| **Tokenizers**
| :bdg-link-dark:`Github <https://github.com/openvinotoolkit/openvino_tokenizers>`
:bdg-link-success:`User Guide <https://docs.openvino.ai/2024/openvino-workflow-generative/ov-tokenizers.html>`
:bdg-link-success:`User Guide <https://docs.openvino.ai/2025/openvino-workflow-generative/ov-tokenizers.html>`
OpenVINO Tokenizers add text processing operations to OpenVINO.

5 changes: 3 additions & 2 deletions docs/articles_en/get-started/configurations.rst
@@ -32,8 +32,9 @@ potential of OpenVINO™. Check the following list for components used in your w
for details.
| **OpenVINO GenAI Dependencies**
| OpenVINO GenAI is a flavor of OpenVINO, aiming to simplify running generative
AI models. For information on the dependencies required to use OpenVINO GenAI, see the
| OpenVINO GenAI is a tool based on the OpenVINO Runtime that simplifies the process of
  running generative AI models. For information on the dependencies required to use
  OpenVINO GenAI, see the
:doc:`guide on OpenVINO GenAI Dependencies <configurations/genai-dependencies>`.
| **Open Computer Vision Library**
14 changes: 7 additions & 7 deletions docs/articles_en/get-started/install-openvino.rst
@@ -11,11 +11,11 @@ Install OpenVINO™ 2024.6
:maxdepth: 3
:hidden:

OpenVINO GenAI <install-openvino/install-openvino-genai>
OpenVINO Runtime on Linux <install-openvino/install-openvino-linux>
OpenVINO Runtime on Windows <install-openvino/install-openvino-windows>
OpenVINO Runtime on macOS <install-openvino/install-openvino-macos>
Create an OpenVINO Yocto Image <install-openvino/install-openvino-yocto>
OpenVINO GenAI Flavor <install-openvino/install-openvino-genai>

.. raw:: html

@@ -30,13 +30,13 @@ All currently supported versions are:
* 2023.3 (LTS)


.. dropdown:: Effortless GenAI integration with OpenVINO GenAI Flavor
.. dropdown:: Effortless GenAI integration with OpenVINO GenAI

A new OpenVINO GenAI Flavor streamlines application development by providing
LLM-specific interfaces for easy integration of language models, handling tokenization and
text generation. For installation and usage instructions, proceed to
:doc:`Install OpenVINO GenAI Flavor <../openvino-workflow-generative>` and
:doc:`Run LLMs with OpenVINO GenAI Flavor <../openvino-workflow-generative/inference-with-genai>`.
OpenVINO GenAI streamlines application development by providing LLM-specific interfaces for
easy integration of language models, handling tokenization and text generation.
For installation and usage instructions, check
:doc:`OpenVINO GenAI installation <../openvino-workflow-generative>` and
:doc:`inference with OpenVINO GenAI <../openvino-workflow-generative/inference-with-genai>`.

.. dropdown:: Building OpenVINO from Source

@@ -1,24 +1,26 @@
Install OpenVINO™ GenAI
====================================

OpenVINO GenAI is a new flavor of OpenVINO, aiming to simplify running inference of generative AI models.
It hides the complexity of the generation process and minimizes the amount of code required.
You can now provide a model and input context directly to OpenVINO, which performs tokenization of the
input text, executes the generation loop on the selected device, and returns the generated text.
For a quickstart guide, refer to the :doc:`GenAI API Guide <../../openvino-workflow-generative/inference-with-genai>`.

To see GenAI in action, check the Jupyter notebooks:
`LLM-powered Chatbot <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/README.md>`__ and
OpenVINO GenAI is a tool that simplifies generative AI model inference. It is based on the
OpenVINO Runtime, hiding the complexity of the generation process and minimizing the amount of
code required. You provide a model and the input context directly to the tool, and it
performs tokenization of the input text, executes the generation loop on the selected device,
and returns the generated content. For a quickstart guide, refer to the
:doc:`GenAI API Guide <../../openvino-workflow-generative/inference-with-genai>`.
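
As a rough illustration of that workflow, a minimal Python sketch (the model folder name is a
placeholder for a model exported to the OpenVINO format; example code, not part of this commit):

.. code-block:: python

   import openvino_genai as ov_genai

   # "TinyLlama-1.1B-Chat-v1.0" is a placeholder path to an exported OpenVINO model folder.
   pipe = ov_genai.LLMPipeline("TinyLlama-1.1B-Chat-v1.0", "CPU")
   print(pipe.generate("What is OpenVINO?", max_new_tokens=100))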

To see OpenVINO GenAI in action, check these Jupyter notebooks:
`LLM-powered Chatbot <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/README.md>`__
and
`LLM Instruction-following pipeline <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-question-answering/README.md>`__.

The OpenVINO GenAI flavor is available for installation via PyPI and Archive distributions.
OpenVINO GenAI is available for installation via PyPI and Archive distributions.
A `detailed guide <https://github.com/openvinotoolkit/openvino.genai/blob/releases/2024/3/src/docs/BUILD.md>`__
on how to build OpenVINO GenAI is available in the OpenVINO GenAI repository.

PyPI Installation
###############################

To install the GenAI flavor of OpenVINO via PyPI, follow the standard :doc:`installation steps <install-openvino-pip>`,
To install the GenAI package via PyPI, follow the standard :doc:`installation steps <install-openvino-pip>`,
but use the *openvino-genai* package instead of *openvino*:

.. code-block:: python
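
   # Assumed content of the collapsed snippet (not visible in this diff):
   # install the GenAI package from PyPI.
   python -m pip install openvino-genai
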
@@ -28,9 +30,9 @@ but use the *openvino-genai* package instead of *openvino*:
Archive Installation
###############################

The OpenVINO GenAI archive package includes the OpenVINO™ Runtime and :doc:`Tokenizers <../../openvino-workflow-generative/ov-tokenizers>`.
To install the GenAI flavor of OpenVINO from an archive file, follow the standard installation steps for your system
but instead of using the vanilla package file, download the one with OpenVINO GenAI:
The OpenVINO GenAI archive package includes the OpenVINO™ Runtime, as well as :doc:`Tokenizers <../../openvino-workflow-generative/ov-tokenizers>`.
It installs the same way as the standard OpenVINO Runtime, so follow its installation steps,
but use the OpenVINO GenAI package instead:

Linux
++++++++++++++++++++++++++
4 changes: 2 additions & 2 deletions docs/articles_en/openvino-workflow-generative.rst
@@ -90,8 +90,8 @@ The advantages of using OpenVINO for generative model deployment:

Proceed to guides on:

* :doc:`OpenVINO GenAI Flavor <./openvino-workflow-generative/inference-with-genai>`
* :doc:`OpenVINO GenAI <./openvino-workflow-generative/inference-with-genai>`
* :doc:`Hugging Face and Optimum Intel <./openvino-workflow-generative/inference-with-optimum-intel>`
* `Generative AI with Base OpenVINO <https://docs.openvino.ai/2024/openvino-workflow-generative/llm-inference-native-ov.html>`__
* `Generative AI with Base OpenVINO <https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/llm-inference-native-ov>`__


@@ -2,13 +2,13 @@ Inference with OpenVINO GenAI
===============================================================================================

.. meta::
:description: Learn how to use the OpenVINO GenAI flavor to execute LLM models.
:description: Learn how to use OpenVINO GenAI to execute LLM models.

.. toctree::
:maxdepth: 1
:hidden:

NPU inference of LLMs <inference-with-genai-on-npu>
NPU inference of LLMs <inference-with-genai/inference-with-genai-on-npu>


OpenVINO™ GenAI is a library of pipelines and methods, extending the OpenVINO runtime to work
@@ -2,9 +2,10 @@ Inference with OpenVINO GenAI
==========================================

.. meta::
:description: Learn how to use the OpenVINO GenAI flavor to execute LLM models on NPU.
:description: Learn how to use OpenVINO GenAI to execute LLM models on NPU.

This guide will give you extra details on how to utilize NPU with the GenAI flavor.

This guide will give you extra details on how to utilize NPU with OpenVINO GenAI.
:doc:`See the installation guide <../../get-started/install-openvino/install-openvino-genai>`
for information on how to start.

@@ -24,6 +25,10 @@ Note that for systems based on Intel® Core™ Ultra Processors Series 2, more t
may be required to run prompts over 1024 tokens on models exceeding 7B parameters,
such as Llama-2-7B, Mistral-0.2-7B, and Qwen-2-7B.

Make sure your model works with NPU. Some models may not be supported, for example,
**the FLUX.1 pipeline is currently not supported by the device**.
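
For a supported model already exported to the OpenVINO format, running it on NPU with
OpenVINO GenAI only requires selecting the device. A minimal sketch, using a hypothetical
local model folder (example code, not part of this commit):

.. code-block:: python

   import openvino_genai as ov_genai

   # "llama-2-7b-chat-ov" is a hypothetical folder with a model exported to the OpenVINO format.
   pipe = ov_genai.LLMPipeline("llama-2-7b-chat-ov", "NPU")
   print(pipe.generate("What is OpenVINO?", max_new_tokens=128))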


Export an LLM model via Hugging Face Optimum-Intel
##################################################

@@ -22,7 +22,7 @@ for more streamlined resource management.
NPU Plugin is now available through all relevant OpenVINO distribution channels.

| **Supported Platforms:**
| Host: Intel® Core™ Ultra (former Meteor Lake)
| Host: Intel® Core™ Ultra series
| NPU device: NPU 3720
| OS: Ubuntu* 22.04 64-bit (with Linux kernel 6.6+), MS Windows* 11 64-bit (22H2, 23H2)
@@ -33,10 +33,10 @@ Follow the instructions below to install the latest NPU drivers:
* `Linux driver <https://github.com/intel/linux-npu-driver/releases>`__


The plugin uses the graph extension API exposed by the driver to convert the OpenVINO specific representation
of the model into a proprietary format. The compiler included in the user mode driver (UMD) performs
platform specific optimizations in order to efficiently schedule the execution of network layers and
memory transactions on various NPU hardware submodules.
The plugin uses the graph extension API exposed by the driver to convert the OpenVINO specific
representation of the model into a proprietary format. The compiler included in the user mode
driver (UMD) performs platform specific optimizations in order to efficiently schedule the
execution of network layers and memory transactions on various NPU hardware submodules.

To use NPU for inference, pass the device name to the ``ov::Core::compile_model()`` method:
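
The code snippet that follows in the full document is collapsed in this diff; a minimal Python
equivalent, assuming a hypothetical ``model.xml`` file (example code, not part of this commit):

.. code-block:: python

   import openvino as ov

   core = ov.Core()
   # Compile the model for the NPU device by passing the device name.
   compiled_model = core.compile_model("model.xml", "NPU")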

