Schema-full formats? #208
Comments
I have taken a look at protobuf, flatbuffers and cap'n proto and concluded that each of these formats has issues that make it difficult to support. I have even been in touch with Kenton Varda, the main author of cap'n proto, and he has confirmed that I would need a significant amount of reverse-engineering to make this work. However, I think that Apache Avro (https://avro.apache.org/docs/1.12.0/) looks very promising and I will open an issue for it.
The protobuf format isn't so complex. See struct_pb, which implements a simple system to serialize into protobuf.
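To illustrate why the wire format itself is fairly small, here is a minimal sketch of protobuf's varint and length-delimited encoding. It is not how struct_pb or reflect-cpp does it; the `Person` struct, the field numbers 1 and 2, and all function names are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Appends `value` as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last.
inline void write_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// A field key is (field_number << 3) | wire_type, itself written as a varint.
inline void write_tag(std::vector<std::uint8_t>& out, std::uint32_t field_number,
                      std::uint32_t wire_type) {
  assert(field_number > 0 && wire_type < 8);
  write_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | wire_type);
}

// Wire type 0: varint-encoded integer.
inline void write_uint64_field(std::vector<std::uint8_t>& out,
                               std::uint32_t field_number, std::uint64_t value) {
  write_tag(out, field_number, 0);
  write_varint(out, value);
}

// Wire type 2: length-delimited bytes (used for strings).
inline void write_string_field(std::vector<std::uint8_t>& out,
                               std::uint32_t field_number, std::string_view value) {
  write_tag(out, field_number, 2);
  write_varint(out, value.size());
  out.insert(out.end(), value.begin(), value.end());
}

// Hypothetical message: the field numbers 1 and 2 are assumed here; in real
// protobuf they come from the .proto schema, which is the "schema-full" part.
struct Person {
  std::string first_name;
  std::string last_name;
};

inline std::vector<std::uint8_t> encode_person(const Person& person) {
  std::vector<std::uint8_t> out;
  write_string_field(out, 1, person.first_name);
  write_string_field(out, 2, person.last_name);
  return out;
}
```

The encoding is the easy part. The field numbers have to come from somewhere, and struct member names alone do not provide them, which is where a schema or some annotation mechanism would be needed.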
Hi @colinator and @tsurumi-yizhou, I've got some updates on this issue. I have devised a way to use the reflection API for schemaful formats, and the first format I have chosen to support is Apache Avro (https://avro.apache.org), because its C API lends itself to doing something like this (https://avro.apache.org/docs/1.11.1/api/c/). The next one I have in mind is cap'n proto, which actually has an API specifically for this purpose (https://capnproto.org/cxx.html#dynamic-reflection). But I will certainly take a look at the library @tsurumi-yizhou suggested and how it approaches this problem. If we could also support protobuf, that would be great.
This is good news. I've long wanted a C++ library that can serialize to both flex and flat buffers, for example, or to any other format. The neat thing about serialization formats such as flatbuffers (or cap'n proto) is that they support "0-copy": if your data contains a large thing (such as an image tensor), then no expensive memory allocation or copy is needed (at least at the user level) in order to read it. I'm concerned that reflect-cpp cannot support 0-copy, which negates one of the big benefits of some of these formats. See my issue #207. I'd be happy to chat more about this.
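A rough sketch of the distinction, using only the standard library: the toy length-prefixed layout and every function name below are assumptions for illustration and have nothing to do with the actual flatbuffers or cap'n proto layouts. The zero-copy reader returns a view into the received buffer, while the copying reader allocates a new string.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

// Toy message layout (assumed): a 4-byte size in host byte order,
// followed by that many payload bytes.
inline std::vector<char> make_message(std::string_view payload) {
  const auto size = static_cast<std::uint32_t>(payload.size());
  std::vector<char> buffer(sizeof(size) + payload.size());
  std::memcpy(buffer.data(), &size, sizeof(size));
  std::memcpy(buffer.data() + sizeof(size), payload.data(), payload.size());
  return buffer;
}

// Zero-copy read: the returned view points directly into `buffer`,
// so it is only valid for as long as `buffer` is alive and unmodified.
inline std::string_view read_zero_copy(const std::vector<char>& buffer) {
  std::uint32_t size = 0;
  assert(buffer.size() >= sizeof(size));
  std::memcpy(&size, buffer.data(), sizeof(size));
  assert(buffer.size() - sizeof(size) >= size);
  return std::string_view(buffer.data() + sizeof(size), size);
}

// Copying read: allocates a std::string and copies the payload into it,
// which is what parsing into a struct of std::string fields implies.
inline std::string read_with_copy(const std::vector<char>& buffer) {
  return std::string(read_zero_copy(buffer));
}
```

For a few short strings the difference is negligible. For something like an image tensor, the copying path pays for an allocation and a copy proportional to the payload size on every read.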
@colinator, it might be possible to support 0-copy operations for some of these formats. So, instead of writing this:

    struct Person {
        std::string first_name;
        std::string last_name;
    };

You could write this:

    struct Person {
        capnp::Text first_name;
        capnp::Text last_name;
    };

But it is harder to see what we could do about vectors. At the end of the day, it is the philosophy of reflect-cpp to closely integrate with the C++ standard library.
That being said, protobuf actually has a reflection API that looks very promising: https://protobuf.dev/reference/cpp/api-docs/google.protobuf.message/. And it seems we can also support flatbuffers if we are able to reverse-engineer the algorithm they use to calculate the offsets (which shouldn't be hard): https://dbaileychess.github.io/flatbuffers/flatbuffers_internals.html
Would it be possible (and if so how difficult) to add 'schema-full' serialization formats? For instance protobuf, flatbuffers, cap'n proto, dds, etc.
Obviously it would require providing the schema, or maybe the compiled artifacts (as in protobuf for instance) at compile-time. And I realize it doesn't really map to this library. But something that could easily translate between, say, protobuf and flex buffers, or dds and json, would be really useful.