Model quantization and deployment using ONNX Runtime
Requirements:
- For the `onnx-quant.ipynb` notebook (obtained through pip): `torch`, `onnx`, `onnxruntime-gpu` (for GPU usage, otherwise `onnxruntime` will do)
- For building the executable for `resnet_inference.cpp` (download pre-built packages for your OS or build from source): `OpenCV`, `onnxruntime`, `cmake`
- The `onnx-quant.ipynb` notebook contains code to train a ResNet18 in PyTorch, quantize it, and convert it to ONNX (a minimal sketch of the export and quantization steps follows this list).
- The `resnet_inference.cpp` file contains code to run inference with that ONNX model through the C++ API of ONNX Runtime (a rough Python analogue is sketched after this list).
- The `CMakeLists.txt` file contains the required dependencies to build the executable for `resnet_inference.cpp`.
- Build steps:
```
(onnx-quantization) $ mkdir build
(onnx-quantization) $ cd build
(build) $ cmake ..
(build) $ make
```
- An executable called `classifier` will be created in the `build` directory.
- Provide the paths to the model file and the image file to run inference on as command-line arguments when running `classifier`, e.g. `./classifier path/to/model.onnx path/to/image.jpg` (illustrative paths).
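
As a rough illustration of the notebook's final steps, here is a minimal sketch of exporting a trained ResNet18 to ONNX and quantizing it. One common recipe is to export the float model first and then apply ONNX Runtime's dynamic quantization; the actual training code, file names, and quantization scheme in `onnx-quant.ipynb` may differ.

```python
# Minimal sketch: export a trained ResNet18 to ONNX, then quantize it.
# Assumes the float model is already trained; file names are illustrative.
import torch
import torchvision
from onnxruntime.quantization import quantize_dynamic, QuantType

model = torchvision.models.resnet18(weights=None)  # load your trained weights here
model.eval()

# Export the float model to ONNX with a 224x224 RGB input and a dynamic batch dimension.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy_input,
    "resnet18_fp32.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

# Dynamically quantize the ONNX model's weights to int8.
quantize_dynamic(
    "resnet18_fp32.onnx",
    "resnet18_int8.onnx",
    weight_type=QuantType.QInt8,
)
```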
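
For comparison with the C++ program, this is a rough Python analogue of the inference flow in `resnet_inference.cpp`: load the model, preprocess an image with OpenCV, run a forward pass, and take the argmax. The resize and normalization constants are standard ImageNet conventions assumed here, not taken from the C++ source.

```python
# Minimal sketch of the inference flow (Python analogue of resnet_inference.cpp).
# Preprocessing constants are standard ImageNet values and are assumptions here.
import cv2
import numpy as np
import onnxruntime as ort

MODEL_PATH = "resnet18_int8.onnx"  # illustrative paths
IMAGE_PATH = "cat.jpg"

# Load and preprocess: BGR -> RGB, resize to 224x224, scale to [0, 1], normalize, NCHW.
img = cv2.imread(IMAGE_PATH)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (224, 224)).astype(np.float32) / 255.0
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
img = (img - mean) / std
batch = np.transpose(img, (2, 0, 1))[np.newaxis, :]  # shape (1, 3, 224, 224)

# Run the model and report the predicted class index.
session = ort.InferenceSession(MODEL_PATH, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
logits = session.run(None, {input_name: batch.astype(np.float32)})[0]
print("Predicted class index:", int(np.argmax(logits)))
```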
References:
- ONNX Runtime documentation
- This super helpful repository
- Gemini helped with debugging during the build process xD