PyO3: Add optional `candle.onnx` module #1282

Conversation
This looks pretty good, happy to merge it if you want (it's marked as draft currently, just remove the tag when you feel ready).
Yeah, I will update the CI tomorrow to build the wheels with the onnx feature flag per default. The only problem I have is that the CI seems to fail with some sort of `protoc` error.
I've just disabled the protoc bits on the CI for the time being - agreed that we should restore them at some point.
Alright, I enabled the `onnx` feature per default in the CI. Since we only need `protoc` at build time, I simply installed it there. Other than that, this should be ready for a review.
Great, thanks. I think at this stage the pyo3 api offers lots of possibilities and we're mostly lacking tutorial/getting-started material to get actual users started with it (which would be nice, so as to get some feedback on where to push this further). Do you think you could try to advertise this a bit, e.g. writing a blog post on whichever platform, posting on reddit (maybe on r/rust, and on localllama if it's to advertise the quantized bits), and maybe on some other social platforms?
I also think that it would be nice to get some users on board to get some feedback on the api and check if/how it should be expanded. Regarding the tutorials/getting-started materials, I thought about adding a chapter to the candle book with the basics and maybe some "How to build/port a model" section.
To be frank, I suck at advertising these kinds of things. And we should probably upload the newest wheels to pypi (and add some sort of "How to build" section) before posting anywhere about this.
This PR adds a lightweight wrapper for the `candle-onnx` crate. Mainly, an `ONNXModel` class is added, which allows loading ONNX models, reading some metadata, and running inference. The `inputs` and `outputs` are exposed to simply check what the model expects as inputs / produces as outputs. The descriptions of these inputs and outputs are wrapped into `ONNXTensorDescription` instances, which expose the expected `DType` and `Shape`. To run inference, a dict containing the input tensors has to be passed into the `run` method. An example of running roberta:
Similar to the `candle-onnx` crate, the `candle.onnx` module is optional and locked behind the `onnx` feature flag, meaning that the project has to be built with `maturin develop -r --features onnx` to enable it. This should probably be the default behaviour for the CI/CD pipeline.
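Since the module is feature-gated, downstream code can probe for it at import time. A small sketch, assuming the submodule is importable as `candle.onnx` (the error message is illustrative):

```python
try:
    from candle import onnx  # only present when built with --features onnx
except ImportError as err:
    raise RuntimeError(
        "candle was built without ONNX support; "
        "rebuild with `maturin develop -r --features onnx`"
    ) from err
```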