Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise error when multiple IPC streams are received #23

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

pmm-motif
Copy link

@pmm-motif pmm-motif commented Nov 25, 2023

It's perhaps an obscure case, but I had issues with using Node.js DuckDB library with register_buffer() when multiple Arrow Table streams were passed as an argument. This use-case somewhat works with DuckDB WASM (it's possible to append additional Arrow Tables) so I initially thought it's some sort of bug.

However, as I was investigating this, I realized that I'm probably using it incorrectly in the first place (though it's not clearly documented). I.e. when multiple IPC Buffers are provided, they must be all part of the same Arrow IPC Stream.

The underlying Arrow IPC Reader seems to be silently stopping when getting into EOS. What I'm proposing in this PR is detecting when more messages are available after getting EOS and throwing an exception rather than just silently consuming these.

It's plausible that I'm missing something or that this check should go Apache Arrow instead. Curious of your thoughts on this

Thank you!

@pmm-motif pmm-motif force-pushed the arrow-scan-multi-ipcs branch from 503d783 to 9981d03 Compare November 25, 2023 19:49
@pmm-motif pmm-motif changed the title Raise error when multiple IPC tables are received Raise error when multiple IPC streams are received Nov 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant