Arrow IPC serializers and parsers #968
base: main
Please let me know if there is anything needed for this PR.
@@ -491,6 +491,17 @@ parser_feather <- function(...) {
  })
}

#' @describeIn parsers Arrow IPC parser. See [arrow::read_ipc_stream()] for more details.
#' @export
parser_arrow_ipc <- function(...) {
I'm wondering if this should be named parser_arrow_ipc_stream() or something similar? From the Arrow R package documentation for read_ipc_stream(), it seems that there are two IPC formats, "stream" and "file":
Apache Arrow defines two formats for serializing data for interprocess communication (IPC): a "stream" format and a "file" format, known as Feather.
Since the "file" format is synonymous with "feather" (which already has parser_feather()), my take-away is that "stream" is an important aspect we should include in the name. I also like trying to match the naming of the underlying functions (arrow::{read,write}_ipc_stream()) while staying within plumber's naming patterns.
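For illustration, here is a minimal sketch of the two formats as exposed by the arrow package. It is not the PR's implementation; the data frame, the variable names, and the file extensions are assumptions chosen for the example.

```r
library(arrow)

# Hypothetical example data, not taken from the PR
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

# "Stream" format: the variant that read_ipc_stream()/write_ipc_stream() handle
stream_path <- tempfile(fileext = ".arrows")
write_ipc_stream(df, stream_path)
from_stream <- read_ipc_stream(stream_path)

# "File" format (Feather): the variant that read_feather()/write_feather() handle
file_path <- tempfile(fileext = ".feather")
write_feather(df, file_path)
from_file <- read_feather(file_path)
```

Both round trips should return the same columns; only the on-disk layout differs, which is why a name that says "stream" removes the ambiguity.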
Gotcha! That's a good point. I believe, though, that Feather specifically refers to the V1 format, and V2 is the IPC file format, although the terminology is a bit vague there.
So, to be crystal clear before merge:
- Change parser_arrow_ipc() to parser_arrow_ipc_stream()
Yeah, it sounds like there is some room for confusion around which variant of IPC this is, and naming it parser_arrow_ipc_stream() would clear that up, so I'm in favor of making that change before merging. Otherwise the PR looks great, thanks @JosiahParry!
This PR adds support for the Arrow IPC format with an IPC serializer and parser.
PR task list:
devtools::document()