
junwenwu/compression_demo


Optimizing TensorFlow models with the OpenVINO Neural Network Compression Framework (NNCF) via 8-bit quantization.

This tutorial demonstrates how to use NNCF 8-bit quantization to optimize a TensorFlow model for inference with the OpenVINO Toolkit. For more advanced usage, refer to these examples.

To make downloading and training fast, we use a ResNet-18 model with the Imagenette dataset. Imagenette is a subset of 10 easily classified classes from the ImageNet dataset.

The ImageNet dataset can be downloaded from here

This tutorial consists of the following steps:

  • Fine-tune the FP32 model
  • Transform the original FP32 model to INT8
  • Fine-tune the INT8 model to restore accuracy
  • Export the optimized and original models to Frozen Graph and then to OpenVINO
  • Measure and compare the performance of both models

Installation Instructions

conda create -n venv_demo python=3.7 -y
conda activate venv_demo
pip install tensorflow==2.4.2
pip install openvino-dev==2021.4.2
pip install nncf
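After installation, a quick way to confirm that the pinned versions resolved is to query package metadata. This is a stdlib sketch; on the Python 3.7 environment created above, `importlib.metadata` is not available, so it falls back to the `importlib-metadata` backport.

```python
# Report the installed versions of the three pinned packages (a sketch;
# falls back to the importlib-metadata backport on Python 3.7).
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    import importlib_metadata as metadata

versions = {}
for pkg in ("tensorflow", "openvino-dev", "nncf"):
    try:
        versions[pkg] = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        versions[pkg] = None  # not installed in this environment

print(versions)
```

If any entry prints as None, re-run the corresponding pip install inside the activated venv_demo environment.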

The original ResNet-18 model weights file is available upon request.

mkdir model
mkdir output

Please put the original weights file ResNet-18_fp32.h5 under the model directory.

Please uncompress the Imagenette dataset under the dataset directory.
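The directory setup above can be sketched in a few lines of stdlib Python. The archive name imagenette2.tgz is an assumption; adjust it to match the file you actually downloaded.

```python
# Sketch of the directory layout this tutorial expects.
# The archive filename imagenette2.tgz is an assumption.
import tarfile
from pathlib import Path

for d in ("model", "output", "dataset"):
    Path(d).mkdir(exist_ok=True)

# Place ResNet-18_fp32.h5 under model/ once you have received it, then
# uncompress the dataset archive into dataset/:
archive = Path("imagenette2.tgz")
if archive.exists():
    with tarfile.open(archive) as tar:
        tar.extractall(path="dataset")
```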

If you have not done so already, please follow the Installation Guide to install all required dependencies.
