-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First release #31
Open
drin
wants to merge
21
commits into
duckdb:main
Choose a base branch
from
drin:feat-skytether-engine
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
First release #31
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
this way I can get my bearings and make necessary changes. I will figure out what to upstream in the future, if anything
Commit ID for Arrow was updated to Arrow 17.0.0 from Arrow 9.0.0 (wow, old)
To some degree, this refactoring is mostly for personal style and readability. Functionality was not changed anywhere, nor were there algorithm changes. Some code was re-organized to make it clearer what variables actually needed to be kept around and some was re-organized to reuse existing functions. Also, some comments were added to make some decisions more understandable.
most simplifications are removing unnecessary code (the arrow stream wrapper calls release in destructor). also changed `SchemaFromIPCBuffer` to take a buffer as an argument instead of a uintptr_t because it can be called at bind time and does not need to go through the C data interface.
The issue was that the `ipc_buffer` attribute wasn't keeping the correct reference alive. So instead of passing a pointer to the super-class constructor, a nullptr is passed and the `stream_factory_ptr` is explicitly set to be sure it points to the `ArrowIPCStreamBuffer` that is kept alive
the one non-style change is to pass an `ArrowIPCBuffer` to `SchemaFromIPCBuffer` instead of a `uintptr_t`. Otherwise, I moved `ArrowIPCTableFunction::GetFunction()` to the end (since it depends on other functions and nothing depends on it) and I made some spacing changes NOTE: The commit removing the duckdb submodule was accidentally skipped during a rebase, so I am adding it to this commit.
This is so that it's easier for duckdb to support using arrow files for ReadRel ops
Also added `arrow_readers.cpp`
Segfault issues were occuring because the arrow buffers were being released too early. Also added a map to keep track of files to avoid opening and re-reading a file many times.
This is due to ubuntu 22.04 having cmake 3.22.1, but changed to match the minimum version specified by duckdb core
This is due to the changes from upstream regarding shared pointers and a desire to make it clearer whether a duckdb::shared_ptr is being used or a std::shared_ptr
"batch" related function and type names were renamed to "partition" related function and type names. This was missed because of misalignment on rebase (due to my style changes).
tired of seeing warnings about cmake <= 3.10
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This would be a first release of my fork of the arrow extension. I'm not sure what to version it, so I'll think about the exact logistics over the next week.
This does not change much of the upstream arrow extension for duckdb, but I would like to add quite a bit of convenience code as well as close the gap with some substrait functionality. Specifically, I would like to eventually:
This release would include initial implementations to support (1) and refactors code (and adds convenience functions) towards satisfying (2). For (3), I have code in another repository that I will try to "upstream" here in the future, depending on if I can think of a way for it to feel cohesive (maybe I'll look at the http extension or something of the sort for guidance).
Also, recent updates to the substrait extension replace TableFunction definitions that return
QueryResult
s to TableFunciton definitions that returnTableRef
s. So, I would like to look into that as well.Anyways, this release is a satisfactory first step in these directions and should make for a good base on which to make future PRs.