Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL double-colon casting destroys AST topology #137

Open
adrian-datacamp opened this issue Jan 22, 2019 · 10 comments
Open

SQL double-colon casting destroys AST topology #137

adrian-datacamp opened this issue Jan 22, 2019 · 10 comments
Assignees
Labels

Comments

@adrian-datacamp
Copy link

This postgresql query

SELECT CAST(unanswered_count/question_count AS numeric) AS computed_pct, unanswered_pct
  FROM stackoverflow
 WHERE question_count != 0;

generates a clean AST in the AST viewer.

screen shot 2019-01-22 at 3 00 23 pm

However, the equivalent query (in postgresql, at least)

SELECT unanswered_count/question_count::numeric AS computed_pct, unanswered_pct
  FROM stackoverflow
 WHERE question_count != 0;

where the type casting is done with a double colon (::) wipes out the AST topology:

screen shot 2019-01-22 at 3 00 42 pm

A consequence of this is that the topologies of ASTs of queries containing double colons are also broken, subsequently breaking SCTs (for an example, go to Exercise 2.3 in this course).

For additional information, this is severely affecting SCTs for SQL for Exploratory Data Analysis, which purposefully uses this casting syntax throughout the course.

cc: @yashasroy

@adrian-datacamp
Copy link
Author

cc: @mona-kay

@adrian-datacamp
Copy link
Author

@adrian-datacamp
Copy link
Author

Apparently queries without a FROM clause also break the AST topology. For example

SELECT 999;

@hermansje
Copy link
Member

Red nodes in the AST viewer (+ everything being on a single level) indicates that the parser doesn't understand the query.
(Unshaped orange nodes indicate we can parse it, but not know how to reshape it at that point in the tree)

I will add this to our documentation.

Both constructs (:: casting and SELECT without FROM) should be part of the next release.

@adrian-datacamp
Copy link
Author

As it turns out, WITH also seems to break the AST. Try for example this query:

WITH bins AS (
      SELECT generate_series(0, 9, 1) AS lower,
             generate_series(1, 10, 1) AS upper),
     -- subset stackoverflow to just tag dropbox
     dropbox AS (
      SELECT question_count 
        FROM stackoverflow
       WHERE tag='dropbox') 
-- select lower, upper, and count(*)
SELECT lower, upper
  -- from bins created above
  FROM bins;

@hermansje
Copy link
Member

It's not the WITH, it's the SELECT without FROM causing the issue:
ast-viewer

@adrian-datacamp
Copy link
Author

Ah, that makes sense. Thanks for pointing out!

@adrian-datacamp
Copy link
Author

cc: @sumedh10

@adrian-datacamp
Copy link
Author

adrian-datacamp commented Jan 28, 2019

@hermansje,

I just discovered that ILIKE (postgres extension of LIKE), also breaks the AST. Example:

SELECT count(*)
  FROM evanston311
 WHERE description ILIKE '%trash%';

Replacing ILIKE with LIKE in the query works fine on the AST viewer.

@adrian-datacamp
Copy link
Author

@hermansje, all three problematic cases reported on this issue are resolved by

https://github.com/datacamp/antlr-plsql/releases/download/v0.7.2-alpha.3/antlr_plsql-0.7.2-py3-none-any.whl

Once you deploy, this issue can be closed. Thanks a lot for the hard work! 💪

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants