Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Left join with a WHERE condition falsely marks columns as non-nullable #3539

Open
musjj opened this issue Oct 2, 2024 · 2 comments · May be fixed by #3541
Open

Left join with a WHERE condition falsely marks columns as non-nullable #3539

musjj opened this issue Oct 2, 2024 · 2 comments · May be fixed by #3541
Labels

Comments

@musjj
Copy link

musjj commented Oct 2, 2024

Bug Description

When there's a WHERE condition in a left join query, non-null columns from the right table is always marked as non-nullable (in left joins, they should always be nullable).

Minimal Reproduction

The schema:

CREATE TABLE foo (
    id BIGSERIAL PRIMARY KEY
);

CREATE TABLE bar (
    foo_id BIGSERIAL PRIMARY KEY REFERENCES foo (id)
);

With a simple left join, everything works fine:

let rec = sqlx::query!(
    "
    SELECT
        foo.id,
        bar.foo_id
    FROM
        foo
        LEFT JOIN bar ON bar.foo_id = foo.id
    "
)
.fetch_one(&pool)
.await?;

let _: Option<i64> = rec.foo_id; // checks out

But things goes awry when you introduce a WHERE condition:

let rec = sqlx::query!(
    "
    SELECT
        foo.id,
        bar.foo_id
    FROM
        foo
        LEFT JOIN bar ON bar.foo_id = foo.id
    WHERE
        foo.id = $1
    ",
    5
)
.fetch_one(&pool)
.await?;

let _: i64 = rec.foo_id; // hmmm

The issue doesn't occur if you use AND instead:

let rec = sqlx::query!(
    "
    SELECT
        foo.id,
        bar.foo_id
    FROM
        foo
        LEFT JOIN bar ON bar.foo_id = foo.id
        AND foo.id = $1
    ",
    5
)
.fetch_one(&pool)
.await?;

let _: Option<i64> = rec.foo_id;

The full reproduction repo can be found here: https://github.com/musjj/sqlx-join-bug

Execution plans of all relevant queries:

EXPLAIN (VERBOSE, FORMAT JSON) SELECT foo.id, bar.foo_id FROM foo LEFT JOIN bar ON bar.foo_id = foo.id;
                  QUERY PLAN                   
-----------------------------------------------
 [                                            +
   {                                          +
     "Plan": {                                +
       "Node Type": "Hash Join",              +
       "Parallel Aware": false,               +
       "Async Capable": false,                +
       "Join Type": "Left",                   +
       "Startup Cost": 60.85,                 +
       "Total Cost": 99.39,                   +
       "Plan Rows": 2260,                     +
       "Plan Width": 16,                      +
       "Output": ["foo.id", "bar.foo_id"],    +
       "Inner Unique": true,                  +
       "Hash Cond": "(foo.id = bar.foo_id)",  +
       "Plans": [                             +
         {                                    +
           "Node Type": "Seq Scan",           +
           "Parent Relationship": "Outer",    +
           "Parallel Aware": false,           +
           "Async Capable": false,            +
           "Relation Name": "foo",            +
           "Schema": "public",                +
           "Alias": "foo",                    +
           "Startup Cost": 0.00,              +
           "Total Cost": 32.60,               +
           "Plan Rows": 2260,                 +
           "Plan Width": 8,                   +
           "Output": ["foo.id"]               +
         },                                   +
         {                                    +
           "Node Type": "Hash",               +
           "Parent Relationship": "Inner",    +
           "Parallel Aware": false,           +
           "Async Capable": false,            +
           "Startup Cost": 32.60,             +
           "Total Cost": 32.60,               +
           "Plan Rows": 2260,                 +
           "Plan Width": 8,                   +
           "Output": ["bar.foo_id"],          +
           "Plans": [                         +
             {                                +
               "Node Type": "Seq Scan",       +
               "Parent Relationship": "Outer",+
               "Parallel Aware": false,       +
               "Async Capable": false,        +
               "Relation Name": "bar",        +
               "Schema": "public",            +
               "Alias": "bar",                +
               "Startup Cost": 0.00,          +
               "Total Cost": 32.60,           +
               "Plan Rows": 2260,             +
               "Plan Width": 8,               +
               "Output": ["bar.foo_id"]       +
             }                                +
           ]                                  +
         }                                    +
       ]                                      +
     }                                        +
   }                                          +
 ]
(1 row)
EXPLAIN (VERBOSE, FORMAT JSON) SELECT foo.id, bar.foo_id FROM foo LEFT JOIN bar ON bar.foo_id = foo.id WHERE foo.id = 123;
                  QUERY PLAN                  
----------------------------------------------
 [                                           +
   {                                         +
     "Plan": {                               +
       "Node Type": "Nested Loop",           +
       "Parallel Aware": false,              +
       "Async Capable": false,               +
       "Join Type": "Left",                  +
       "Startup Cost": 0.31,                 +
       "Total Cost": 16.36,                  +
       "Plan Rows": 1,                       +
       "Plan Width": 16,                     +
       "Output": ["foo.id", "bar.foo_id"],   +
       "Inner Unique": true,                 +
       "Plans": [                            +
         {                                   +
           "Node Type": "Index Only Scan",   +
           "Parent Relationship": "Outer",   +
           "Parallel Aware": false,          +
           "Async Capable": false,           +
           "Scan Direction": "Forward",      +
           "Index Name": "foo_pkey",         +
           "Relation Name": "foo",           +
           "Schema": "public",               +
           "Alias": "foo",                   +
           "Startup Cost": 0.15,             +
           "Total Cost": 8.17,               +
           "Plan Rows": 1,                   +
           "Plan Width": 8,                  +
           "Output": ["foo.id"],             +
           "Index Cond": "(foo.id = 123)"    +
         },                                  +
         {                                   +
           "Node Type": "Index Only Scan",   +
           "Parent Relationship": "Inner",   +
           "Parallel Aware": false,          +
           "Async Capable": false,           +
           "Scan Direction": "Forward",      +
           "Index Name": "bar_pkey",         +
           "Relation Name": "bar",           +
           "Schema": "public",               +
           "Alias": "bar",                   +
           "Startup Cost": 0.15,             +
           "Total Cost": 8.17,               +
           "Plan Rows": 1,                   +
           "Plan Width": 8,                  +
           "Output": ["bar.foo_id"],         +
           "Index Cond": "(bar.foo_id = 123)"+
         }                                   +
       ]                                     +
     }                                       +
   }                                         +
 ]
(1 row)
EXPLAIN (VERBOSE, FORMAT JSON) SELECT foo.id, bar.foo_id FROM foo LEFT JOIN bar ON bar.foo_id = foo.id AND foo.id = 123;
                  QUERY PLAN                   
-----------------------------------------------
 [                                            +
   {                                          +
     "Plan": {                                +
       "Node Type": "Hash Join",              +
       "Parallel Aware": false,               +
       "Async Capable": false,                +
       "Join Type": "Left",                   +
       "Startup Cost": 60.85,                 +
       "Total Cost": 99.38,                   +
       "Plan Rows": 2260,                     +
       "Plan Width": 16,                      +
       "Output": ["foo.id", "bar.foo_id"],    +
       "Inner Unique": true,                  +
       "Hash Cond": "(foo.id = bar.foo_id)",  +
       "Join Filter": "(foo.id = 123)",       +
       "Plans": [                             +
         {                                    +
           "Node Type": "Seq Scan",           +
           "Parent Relationship": "Outer",    +
           "Parallel Aware": false,           +
           "Async Capable": false,            +
           "Relation Name": "foo",            +
           "Schema": "public",                +
           "Alias": "foo",                    +
           "Startup Cost": 0.00,              +
           "Total Cost": 32.60,               +
           "Plan Rows": 2260,                 +
           "Plan Width": 8,                   +
           "Output": ["foo.id"]               +
         },                                   +
         {                                    +
           "Node Type": "Hash",               +
           "Parent Relationship": "Inner",    +
           "Parallel Aware": false,           +
           "Async Capable": false,            +
           "Startup Cost": 32.60,             +
           "Total Cost": 32.60,               +
           "Plan Rows": 2260,                 +
           "Plan Width": 8,                   +
           "Output": ["bar.foo_id"],          +
           "Plans": [                         +
             {                                +
               "Node Type": "Seq Scan",       +
               "Parent Relationship": "Outer",+
               "Parallel Aware": false,       +
               "Async Capable": false,        +
               "Relation Name": "bar",        +
               "Schema": "public",            +
               "Alias": "bar",                +
               "Startup Cost": 0.00,          +
               "Total Cost": 32.60,           +
               "Plan Rows": 2260,             +
               "Plan Width": 8,               +
               "Output": ["bar.foo_id"]       +
             }                                +
           ]                                  +
         }                                    +
       ]                                      +
     }                                        +
   }                                          +
 ]
(1 row)

Info

  • SQLx version: 0.8.2
  • SQLx features enabled: [ "postgres", "runtime-tokio", "tls-native-tls" ]
  • Database server and version: Postgres 16.3-1.pgdg120+1
  • Operating system: NixOS
  • rustc --version: rustc 1.83.0-nightly (363ae4188 2024-09-24)
@fastfailures
Copy link

It seems the same as #3408

@mxchist
Copy link

mxchist commented Jan 17, 2025

Actually, those columns are non-nullable. You cannot store nulls in primary key in Postgersql.

=>  insert into foo (id) values (null);
ERROR:  null value in column "id" of relation "foo" violates not-null constraint
DETAIL:  Failing row contains (null).

And after you delete the Primary key:

=> alter table foo drop constraint foo_pkey;
ALTER TABLE

The column still will be non-nullable:

                                                     Table "pg_temp_3.foo"
 Column |  Type  | Collation | Nullable |             Default             | Storage | Compression | Stats target | Description 
--------+--------+-----------+----------+---------------------------------+---------+-------------+--------------+-------------
 id     | bigint |           | not null | nextval('foo_id_seq'::regclass) | plain   |             |              | 
Access method: heap

Until you manually drop the NOT NULL property.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants