
onnxmlir-triton-backend

A Triton backend that allows onnx-mlir or zDLC compiled models (model.so) to be used with the Triton Inference Server.

At the moment there is no GPU support.

Model Repository

The backend expects the compiled model to be named model.so. The complete repository structure looks like this:

  <model-repository-path>/
    <model-name>/
      config.pbtxt
      <version>/
        model.so
      <version>/
        model.so
      ...
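
A version directory can be populated, for example, by compiling an ONNX model with onnx-mlir. This is a sketch only: mymodel and the version number 1 are placeholders, and onnx-mlir is assumed to be on your PATH.

# Compile the ONNX model into a shared library; the output is named model.so
onnx-mlir --EmitLib model.onnx

# Place it in the model repository under a version directory
mkdir -p <model-repository-path>/mymodel/1
cp model.so <model-repository-path>/mymodel/1/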

Model Configuration

Specify the backend name onnxmlir in the config.pbtxt:

backend: "onnxmlir"

For more options see Model Configuration.
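
A slightly fuller config.pbtxt might look like the sketch below; the tensor names, data types, and dimensions are placeholders and must match the inputs and outputs of your compiled model.

backend: "onnxmlir"
max_batch_size: 0
input [
  {
    name: "input_0"        # placeholder; use your model's input name
    data_type: TYPE_FP32
    dims: [ 1, 3, 224, 224 ]
  }
]
output [
  {
    name: "output_0"       # placeholder; use your model's output name
    data_type: TYPE_FP32
    dims: [ 1, 1000 ]
  }
]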

Build and Install

You can either build the backend and copy the shared library to your Triton installation manually, or let make install install it directly into your Triton installation.

Manual Install

mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
make install

This produces build/install/backends/onnxmlir/libtriton_onnxmlir.so.

The libtriton_onnxmlir.so then needs to be copied to the backends/onnxmlir directory in the Triton installation directory (usually /opt/tritonserver), as shown below.
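
For example, assuming the default Triton installation path:

# Create the backend directory and copy the built library into it
mkdir -p /opt/tritonserver/backends/onnxmlir
cp build/install/backends/onnxmlir/libtriton_onnxmlir.so /opt/tritonserver/backends/onnxmlir/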

Direct Install

You can specify the Triton install directory on the cmake command line so that make install installs the backend directly into your Triton installation. If Triton is installed to /opt/tritonserver you can use:

mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=/opt/tritonserver ..
make install
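
Once the backend is installed, the server can be started against your model repository. The --model-repository flag is standard tritonserver usage; the path is a placeholder:

tritonserver --model-repository=<model-repository-path>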