Optimizing PyTorch models with the Neural Network Compression Framework (NNCF) of OpenVINO™ using 8-bit quantization

This tutorial demonstrates how to use NNCF 8-bit quantization to optimize a PyTorch model for inference with the OpenVINO Toolkit. For more advanced usage, refer to these examples.

This notebook is based on the 'ImageNet training in PyTorch' example. To speed up download and training, it uses a ResNet-18 model with the Tiny ImageNet dataset.

Notebook Contents

This tutorial consists of the following steps:

  • Transforming the original FP32 model to INT8.
  • Using fine-tuning to restore accuracy.
  • Exporting the optimized and original models to ONNX and then to OpenVINO.
  • Measuring and comparing the performance of the models.
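
To give a feel for what the FP32-to-INT8 step means numerically, here is a minimal sketch of generic affine 8-bit quantization in plain Python. It is an illustration only and not NNCF's implementation: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and the notebook itself relies on NNCF to choose and tune quantization ranges.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers with an affine (scale + zero-point) scheme."""
    qmin, qmax = -128, 127
    # Widen the range to include 0.0 so that zero is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Round to the nearest integer step, then clamp into the INT8 range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the INT8 representation."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical weights, chosen only for illustration.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, zero_point, max_error)
```

The round trip loses at most about one quantization step per value. That rounding error is the source of the accuracy drop that the fine-tuning step is meant to recover.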

Installation Instructions

If you have not installed all required dependencies, follow the Installation Guide.