Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(duckdb): ensure that duckdb columns argument to read_csv accepts duckdb syntax not ibis syntax #10696

Merged
merged 5 commits into from
Jan 22, 2025

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented Jan 21, 2025

Fixes an issue where the input strings in the columns (and types) argument to read_csv assumed the input type strings are ibis types, whereas they are actually DuckDB types. Closes #10695.

BREAKING CHANGE: The duckdb backend's read_csv method accepts only DuckDB types for the values components of the columns and types dictionary arguments. You may need need to adjust existing code. For example, the string "float64" should be replaced with the string "double".

@cpcloud cpcloud added this to the 10.0 milestone Jan 21, 2025
@cpcloud cpcloud added bug Incorrect behavior inside of ibis duckdb The DuckDB backend labels Jan 21, 2025
@cpcloud cpcloud requested a review from gforsyth January 21, 2025 18:26
@github-actions github-actions bot added the tests Issues or PRs related to tests label Jan 21, 2025
Copy link
Contributor

ACTION NEEDED

Ibis follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message.

Please update your PR title and description to match the specification.

@cpcloud cpcloud force-pushed the fix-use-of-ibis-type-parsing branch 3 times, most recently from ba8b94a to 1fd9a4a Compare January 21, 2025 18:33
@cpcloud cpcloud changed the title fix use of ibis type parsing fix(duckdb): ensure that duckdb columns argument to read_csv accepts duckdb syntax not ibis syntax Jan 21, 2025
@cpcloud
Copy link
Member Author

cpcloud commented Jan 22, 2025

Apparently, this was intended to support ibis types in the mapping, so the only way I can see to address the issue without breaking existing code is to allow integer in the ibis type parser (the meaning would correspond to duckdb's bigint though).

cc @amoeba

@cpcloud
Copy link
Member Author

cpcloud commented Jan 22, 2025

I think this is probably better to break for 10.0 and use the behavior that's in this PR, which is that read_csv's columns and types argument for duckdb are in the lingo of DuckDB.

That is, the types are spelled the way duckdb spells them, not the way Ibis spells them.

@cpcloud cpcloud force-pushed the fix-use-of-ibis-type-parsing branch from 1fd9a4a to a020c14 Compare January 22, 2025 10:31
@cpcloud cpcloud added the breaking change Changes that introduce an API break at any level label Jan 22, 2025
@cpcloud cpcloud merged commit 83bed74 into ibis-project:main Jan 22, 2025
90 checks passed
@cpcloud cpcloud deleted the fix-use-of-ibis-type-parsing branch January 22, 2025 12:18
@amoeba
Copy link
Contributor

amoeba commented Jan 22, 2025

That is, the types are spelled the way duckdb spells them, not the way Ibis spells them.

I think this changes matches the behavior I'd expect so 👍 here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Changes that introduce an API break at any level bug Incorrect behavior inside of ibis duckdb The DuckDB backend tests Issues or PRs related to tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: ibis.read_csv errors with "parsy.ParseError: expected 'EOF' at 0:3" when specifying columns argument
2 participants