onnx-quantization

Model quantization and deployment using ONNX Runtime

Requirements

For the onnx-quant.ipynb notebook (install with pip): torch, onnx, and onnxruntime-gpu for GPU use (plain onnxruntime is enough for CPU-only use)
For building the executable from resnet_inference.cpp (download pre-built packages for your OS or build from source): OpenCV, onnxruntime, and cmake
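
A quick way to confirm the requirements above are visible on your machine is sketched below. It only checks that the Python packages can be found and that cmake is on PATH; it does not check the OpenCV or ONNX Runtime C++ libraries, and the helper names `check_requirements` and `available_providers` are made up for this example.

```python
import importlib.util
import shutil

# Import names for the pip packages listed above. Both onnxruntime and
# onnxruntime-gpu are imported as "onnxruntime".
PYTHON_MODULES = ("torch", "onnx", "onnxruntime")


def check_requirements():
    """Map each requirement to True/False depending on whether it was found."""
    found = {name: importlib.util.find_spec(name) is not None
             for name in PYTHON_MODULES}
    # cmake is a command-line tool, so look it up on PATH instead.
    found["cmake"] = shutil.which("cmake") is not None
    return found


def available_providers():
    """List ONNX Runtime execution providers, or [] if it is not installed."""
    if importlib.util.find_spec("onnxruntime") is None:
        return []
    import onnxruntime
    return onnxruntime.get_available_providers()


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'missing'}")
    # "CUDAExecutionProvider" in this list means the installed build
    # includes CUDA support (as onnxruntime-gpu does).
    print("providers:", available_providers())
```

If only "CPUExecutionProvider" shows up, the CPU-only onnxruntime package is most likely the one installed.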

Steps

The onnx-quant.ipynb notebook contains code to train a ResNet18 in PyTorch, quantize it, and convert it to ONNX.
The resnet_inference.cpp file contains code to run inference on that ONNX model through the C++ API of ONNX Runtime.
The CMakeLists.txt file declares the dependencies required to build the executable for resnet_inference.cpp.
- Build steps:
```
    (onnx-quantization) $ mkdir build
    (onnx-quantization) $ cd build
    (build) $ cmake ..
    (build) $ make
```
- An executable called classifier will be created in the build directory.
- When running classifier, pass the path to the model file and the path to the image to run inference on as command-line arguments.
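
To run classifier over several images, a small Python wrapper along these lines may help. This is a sketch only: it assumes the model path is the first argument and the image path the second (check resnet_inference.cpp for the order it actually expects), and the function names and file names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(model_path, image_path, exe="./classifier"):
    """Return the argv list for one classifier run.

    ASSUMPTION: the model path comes first and the image path second;
    check resnet_inference.cpp for the order it actually expects.
    """
    return [str(exe), str(model_path), str(image_path)]


def run_classifier(model_path, image_path, exe="./classifier"):
    """Run the executable and return whatever it prints to stdout."""
    # Fail early with a clear message if any of the three files is missing.
    for path in (exe, model_path, image_path):
        if not Path(path).is_file():
            raise FileNotFoundError(f"missing file: {path}")
    result = subprocess.run(
        build_command(model_path, image_path, exe),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    print(build_command("resnet18_quantized.onnx", "cat.jpg"))
```

Run it from the build directory so that the default `./classifier` path resolves, or pass the location of the executable through `exe`.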

References

ONNX runtime documentation
This super helpful repository
Gemini helped with debugging during the build process xD