-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enum type support #193
base: main
Are you sure you want to change the base?
Enum type support #193
Conversation
…PGDuckDBEnumTypeInfo to inject the postgres enum OID
…f the enum member (not the enum type oid)
…undtrip these back from duckdb -> postgres in the final result
'enum_oid' changed
using duckdb::LogicalType; | ||
using duckdb::Vector; | ||
|
||
struct PGDuckDBEnum { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be changed to a namespace? Now that this struct doesn't have any fields anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea possibly, in our codebase we're not too strict on static-membered-structs vs namespaces, I think when we also need a constexpr value here we can't use a namespace but can use a struct, so I almost always go for using a struct
I'm waiting on #97 so I can add the tests with |
To further override `EnumTypeInfoTemplated` it needs to be either vendored (bad) or we move it to a header. Relevant `pg_duckdb` PR: <duckdb/pg_duckdb#193>
Regarding the |
// To store additional metadata for a type, DuckDB's LogicalType class contains a 'type_info_' field. | ||
// The type of this is ExtraTypeInfo, ENUMs make use of this in the form of EnumTypeInfo (see | ||
// duckdb/include/common/extra_type_info.hpp). We derive from this further to store the ENUMs connection with the oids | ||
// of Postgres' enum members. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a super helpful comment, thanks a lot.
@@ -68,6 +73,56 @@ IsIndexScan(const Path *nodePath) { | |||
return false; | |||
} | |||
|
|||
optional_ptr<CatalogEntry> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a little comment about the assumptions of this function (feel free to correct this comment if my assumptions are incorrect)
optional_ptr<CatalogEntry> | |
/* Looks up the relevant postgres type and caches it in the SchemaItems.types for the next lookup. This is currently only used to lookup enum types. */ | |
optional_ptr<CatalogEntry> |
dummy_attr.atttypmod = 0; | ||
dummy_attr.attndims = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems incorrect if e.g. it's an array an of enum. I think it would be good to add a test for that, it doesn't have to actually work but at least it should throw a somewhat clear error and not crash.
Also I think it would be good to check here for now that the type is an enum, because that's really the only kind of type that we're comfortable with handling here.
if (en1->enumsortorder < en2->enumsortorder) { | ||
return true; // v1 < v2 | ||
} else if (en1->enumsortorder > en2->enumsortorder) { | ||
return false; // v1 > v2 | ||
} else { | ||
return false; // v1 == v2 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be simplified like this:
if (en1->enumsortorder < en2->enumsortorder) { | |
return true; // v1 < v2 | |
} else if (en1->enumsortorder > en2->enumsortorder) { | |
return false; // v1 > v2 | |
} else { | |
return false; // v1 == v2 | |
} | |
return en1->enumsortorder < en2->enumsortorder; |
// This is medium hacky, we create extra space in the Vector that would normally only hold the strings corresponding to | ||
// the enum's members. In that extra space we store the enum member oids. | ||
|
||
// We do this so we can find the OID from the offset (which is what DuckDB stores in a Vector of type ENUM) and return | ||
// that when converting the result from DuckDB -> Postgres |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not true anymore right? i.e. you switched back to the first with the vendored in template (until the next duckdb release)
#include "duckdb/common/serializer/deserializer.hpp" | ||
|
||
// This is copied directly without changes from 'duckdb/src/common/extra_type_info.cpp' | ||
// The reason this is copied is to be able to inherit from it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// The reason this is copied is to be able to inherit from it. | |
// The reason this is copied is to be able to inherit from it. | |
// This can be removed once DuckDB 1.2.0 is released, which contains this PR: | |
// https://github.com/duckdb/duckdb/pull/14038 |
// that when converting the result from DuckDB -> Postgres | ||
|
||
LogicalType | ||
PGDuckDBEnum::CreateEnumType(std::vector<HeapTuple> &enum_members) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to merge this function ConvertPostgresEnumToDuckEnum
. They are so strongly coupled that I don't think it has any benefit to separate them like this. i.e. ConvertPostgresEnumToDuckEnum
is the only caller of this function and that this takes a vector of HeapTuples, which should already be sorted. If we'd keep them separate that requirement on already sorted should be documented in this function its comment, but if we merge the functions such a comment is not necessary.
This PR adds support for Postgres's ENUM types.
In Postgres ENUM types can not be ephemeral (exist only for the duration of a query), they have to be registered, and thus we can't just recreate the ENUM type when converting back from DuckDB -> Postgres at the end of execution.
This detail caused the biggest development overhead, we need to vendor and inherit from a templated derived class that contains the guts of the DuckDB ENUM type so we can add Postgres's ENUM member oids to it.
We save the ENUM member oids into this, as this is what has to be returned in the query result.