Enhance unnest support in decoupled mode #17550

Open
wants to merge 107 commits into base: master


kgyrtkirk
Member

  • adds DruidRelFieldTrimmer
  • Unnest now uses unnestFieldType instead of blindly taking the rowType
  • removes the possibly problematic column reuse in UnnestCleanupRule (the trimmer should take care of that)
  • lots of iq file changes, because previously no proper trimming was done below LogicalCorrelate and Unnest nodes
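
As a rough illustration of what "trimming" means in the list above: a field trimmer drops the columns that nothing upstream reads and hands back a mapping from old column positions to new ones. The sketch below is library-free and hypothetical; `FieldTrimSketch`, `Trimmed`, and `trim` are made-up names and are not the API of Calcite's RelFieldTrimmer or of DruidRelFieldTrimmer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical, library-free illustration; none of these names come from Druid or Calcite.
final class FieldTrimSketch
{
  /** Surviving field names plus an old-index -> new-index mapping (-1 means dropped). */
  static final class Trimmed
  {
    final List<String> fields;
    final int[] mapping;

    Trimmed(List<String> fields, int[] mapping)
    {
      this.fields = fields;
      this.mapping = mapping;
    }
  }

  /** Keeps only the fields whose indexes appear in fieldsUsed, preserving their order. */
  static Trimmed trim(List<String> rowType, SortedSet<Integer> fieldsUsed)
  {
    List<String> kept = new ArrayList<>();
    int[] mapping = new int[rowType.size()];
    Arrays.fill(mapping, -1);
    for (int oldIndex : fieldsUsed) {
      mapping[oldIndex] = kept.size();
      kept.add(rowType.get(oldIndex));
    }
    return new Trimmed(kept, mapping);
  }

  public static void main(String[] args)
  {
    List<String> rowType = Arrays.asList("__time", "dim1", "dim2", "m1");
    Trimmed t = trim(rowType, new TreeSet<>(Arrays.asList(1, 3)));
    System.out.println(t.fields + " " + Arrays.toString(t.mapping));
  }
}
```

If trimming is skipped below a node, every input column survives into the plan, which is one plausible reason the expected plans in the iq files change once trimming is applied there.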

return makeIdentityMapping(input);
}

protected TrimResult dummyProject(int fieldCount, RelNode input,

Check notice

Code scanning / CodeQL

Missing Override annotation

This method overrides RelFieldTrimmer.dummyProject; it is advisable to add an Override annotation.
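
A minimal sketch of what the notice asks for, using hypothetical stand-in classes (`BaseTrimmerSketch` and `DruidTrimmerSketch` are not the real RelFieldTrimmer or DruidRelFieldTrimmer, and the simplified `dummyProject` signature here is an assumption made only for illustration):

```java
// Hypothetical stand-ins for the superclass and subclass named in the notice.
class BaseTrimmerSketch
{
  protected String dummyProject(int fieldCount)
  {
    return "base";
  }
}

class DruidTrimmerSketch extends BaseTrimmerSketch
{
  // With @Override the compiler rejects this method if the superclass signature
  // ever changes, instead of silently treating it as a new, unrelated method.
  @Override
  protected String dummyProject(int fieldCount)
  {
    return "druid";
  }
}

final class OverrideSketch
{
  public static void main(String[] args)
  {
    BaseTrimmerSketch trimmer = new DruidTrimmerSketch();
    System.out.println(trimmer.dummyProject(0));
  }
}
```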
return result(input, mapping);
}

public TrimResult trimFields(

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking CalciteTestBase.makeColumnExpression should be avoided because it has been deprecated.
@github-actions bot added the Area - Batch Ingestion and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Dec 11, 2024
@kgyrtkirk marked this pull request as ready for review December 16, 2024 07:50
import java.util.List;
import java.util.Set;

public class DruidRelFieldTrimmer extends RelFieldTrimmer
Contributor

A javadoc for this class would be really useful

Member Author

Added a few lines. The main goal of the class is the same as RelFieldTrimmer's, so I've added a link to it in the apidoc, since everything described there also applies here.

.columnTypes(ColumnType.LONG)
.context(defaultScanQueryContext(
queryContext,
- RowSignature.builder().add("v0", ColumnType.LONG).build()
+ RowSignature.builder().add("__time", ColumnType.LONG).build()
Contributor

I did not understand why the column name changes here. I don't see any other change to the test

Member Author

No columns are needed from the left-hand side, but Calcite has some tweaks here and there to avoid RelNodes with 0 columns.

This PR stops the field trimmer from introducing a column with the value 0, but one of the rules also unconditionally projects the first column, which causes this change.

We should be able to handle the case of empty columns. I wanted to dig into that more deeply, as it seems there are also some rule combinations which may lead to an empty column set (but I forgot the testcase).

I think that in general Calcite should be able to handle these things, and we should only make corrections in the execution engine if it causes issues.
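
To make the two fallbacks described in this reply concrete, here is a small, library-free sketch. Everything in it is hypothetical: `EmptyProjectionSketch`, the `Fallback` enum, and the `"0 AS v0"` label are illustration-only names (the label borrows `v0` from the test diff above as an assumed reading of it) and do not reflect how Calcite or Druid actually represent projections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of "what to project when no input column is needed".
final class EmptyProjectionSketch
{
  enum Fallback
  {
    DUMMY_LITERAL, // add a constant column so the row type is never empty
    FIRST_COLUMN,  // unconditionally keep the first input column
    ALLOW_EMPTY    // let a zero-column projection through
  }

  static List<String> project(List<String> inputColumns, List<Integer> needed, Fallback fallback)
  {
    List<String> out = new ArrayList<>();
    for (int index : needed) {
      out.add(inputColumns.get(index));
    }
    if (!out.isEmpty() || inputColumns.isEmpty()) {
      return out;
    }
    switch (fallback) {
      case DUMMY_LITERAL:
        return Collections.singletonList("0 AS v0");
      case FIRST_COLUMN:
        return Collections.singletonList(inputColumns.get(0));
      default:
        return out;
    }
  }

  public static void main(String[] args)
  {
    List<String> input = Arrays.asList("__time", "dim1");
    List<Integer> none = Collections.emptyList();
    System.out.println(project(input, none, Fallback.DUMMY_LITERAL));
    System.out.println(project(input, none, Fallback.FIRST_COLUMN));
  }
}
```

Under this reading, switching from the dummy-literal fallback to the first-column fallback would be enough to turn an expected `v0` column into `__time` without any other change to the test.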

@@ -92,7 +92,8 @@ enum Modes
SORT_REMOVE_CONSTANT_KEYS_CONFLICT(DruidException.class, "not enough rules"),
REQUIRE_TIME_CONDITION(CannotBuildQueryException.class, "requireTimeCondition is enabled"),
UNNEST_INLINED(Exception.class, "Missing conversion is Uncollect"),
- UNNEST_RESULT_MISMATCH(AssertionError.class, "(Result count mismatch|column content mismatch)");
+ UNNEST_RESULT_MISMATCH(AssertionError.class, "(Result count mismatch|column content mismatch)"),
+ RESULT_MISMATCH_NATIVE_UNNEST_INCORRECT_RESULTS(Throwable.class, "(Result count mismatch|column content mismatch|ARRAY)");
Contributor

Is this required? Won't it be covered by the previous one?
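
One way to compare the two modes is to check which failures each would accept. The sketch below assumes, without confirmation from the PR, that a mode matches when the thrown exception is an instance of the mode's class and the regex is found somewhere in its message; `ModeMatchSketch` and `matches` are hypothetical names, not part of the Druid test framework.

```java
import java.util.regex.Pattern;

// Hypothetical matcher; the real test framework's matching rules may differ.
final class ModeMatchSketch
{
  static final String OLD_REGEX = "(Result count mismatch|column content mismatch)";
  static final String NEW_REGEX = "(Result count mismatch|column content mismatch|ARRAY)";

  /** True if the failure has the expected type and its message contains a regex match. */
  static boolean matches(Class<? extends Throwable> expected, String regex, Throwable actual)
  {
    return expected.isInstance(actual)
        && Pattern.compile(regex).matcher(String.valueOf(actual.getMessage())).find();
  }

  public static void main(String[] args)
  {
    Throwable mismatch = new AssertionError("Result count mismatch");
    Throwable arrayFailure = new RuntimeException("ARRAY type not handled");

    System.out.println(matches(AssertionError.class, OLD_REGEX, mismatch));
    System.out.println(matches(AssertionError.class, OLD_REGEX, arrayFailure));
    System.out.println(matches(Throwable.class, NEW_REGEX, arrayFailure));
  }
}
```

Under these assumptions the new mode accepts everything the previous one does, plus non-AssertionError failures and messages containing ARRAY, so the previous mode would cover the new one only if the affected tests never fail in those additional ways.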

Labels
Area - Batch Ingestion, Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262), Area - Querying

2 participants