Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6448] FilterReduceExpressionsRule returns empty RelNode for R… #3849

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions core/src/main/java/org/apache/calcite/rel/core/Filter.java
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
import org.apache.calcite.rex.RexProgram;
import org.apache.calcite.rex.RexShuttle;
import org.apache.calcite.rex.RexUtil;
import org.apache.calcite.sql.type.SqlTypeName;
import org.apache.calcite.util.Litmus;

import com.google.common.collect.ImmutableList;
Expand Down Expand Up @@ -85,6 +86,8 @@ protected Filter(
RexNode condition) {
super(cluster, traits, child);
this.condition = requireNonNull(condition, "condition");
assert SqlTypeName.BOOLEAN == condition.getType().getSqlTypeName()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is always true; for example, before validation where casts are introduced this may not hold.
Also, Calcite does not always represent SQL implicit casts explicitly, so the cast to Boolean which is implied here may not be in the program.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SQL level (SqlNode) and the algebraic level (Rel/RexNode) are handled differently. The validator as well as the implicit casts logic apply on the SQL level and not at the algebraic level. The algebra is explicit so the assertion for Boolean type makes sense to me.

Other than that it may be better to put the check inside the isValid method which is the canonical way of checking validity and allows also overriding in case some specialized implementation wants to perform the validity check differently.

: "condition should be of BOOLEAN type, but was " + condition.getType();
assert RexUtil.isFlat(condition) : "RexUtil.isFlat should be true for condition " + condition;
assert isValid(Litmus.THROW, null);
this.hints = ImmutableList.copyOf(hints);
Expand Down
2 changes: 2 additions & 0 deletions core/src/main/java/org/apache/calcite/rel/core/Join.java
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,8 @@ protected Join(
JoinRelType joinType) {
super(cluster, traitSet, left, right);
this.condition = Objects.requireNonNull(condition, "condition");
assert SqlTypeName.BOOLEAN == condition.getType().getSqlTypeName()
: "condition should be of BOOLEAN type, but was " + condition.getType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check already exists inside the isValid method. What we may want to do here is include
assert isValid in the constructor as it is done in other RelNodes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Join, I tried adding assert isValid in the constructor, but CheckerFramework throws an error:

> Task :core:compileJava
/home/runner/work/calcite/calcite/core/src/main/java/org/apache/calcite/rel/core/Join.java:107: error: [method.invocation.invalid] call to isValid(org.apache.calcite.util.Litmus,@org.checkerframework.checker.nullness.qual.Nullable org.apache.calcite.rel.RelNode.Context) not allowed on the given receiver.
    assert isValid(Litmus.THROW, null);
                  ^
  found   : @UnderInitialization(org.apache.calcite.rel.core.Join.class) @NonNull Join
  required: @Initialized @NonNull Join

It looks like we can't call it from the constructor without changing some config? Or it may be simpler to assert that the condition type is boolean in the constructor - just for Join.

I encountered another issue with calling isValid from the constructor. During LogicalJoin instantiation, it calls super and tries to init Join and then it goes through isValid -> getRowType -> deriveRowType -> getSystemFieldList. But LogicalJoin#getSystemFieldList simply returns systemFieldList and its value is null because the object is still initializing and unfortunately super has to be the first call in a constructor (for now). Finally we get an error because of the requireNonNull check on systemFieldList in SqlValidatorUtil#deriveJoinRowType.

this.variablesSet = ImmutableSet.copyOf(variablesSet);
this.joinType = Objects.requireNonNull(joinType, "joinType");
this.joinInfo = JoinInfo.of(left, right, condition);
Expand Down
19 changes: 8 additions & 11 deletions core/src/test/java/org/apache/calcite/test/RelBuilderTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -2650,19 +2650,17 @@ private static RelBuilder assertSize(RelBuilder b,
b.scan("EMP")
.scan("DEPT")
.join(JoinRelType.INNER,
b.or(b.literal(null),
b.and(b.equals(b.field(2, 0, "DEPTNO"), b.literal(1)),
b.equals(b.field(2, 0, "DEPTNO"), b.literal(2)),
b.equals(b.field(2, 1, "DEPTNO"),
b.field(2, 0, "DEPTNO")))))
b.and(b.equals(b.field(2, 0, "DEPTNO"), b.literal(1)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you modifying this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping a NULL literal makes the type of the condition to be of type NULL. This PR is not ready for review, and I am just testing things as tests are failing on my local machine with unrelated errors. I will mark this PR as draft for now.

b.equals(b.field(2, 0, "DEPTNO"), b.literal(2)),
b.equals(b.field(2, 1, "DEPTNO"),
b.field(2, 0, "DEPTNO"))))
.build();
final String expected = ""
+ "LogicalJoin(condition=[false], joinType=[inner])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n"
+ " LogicalTableScan(table=[[scott, DEPT]])\n";
final String expectedWithoutSimplify = ""
+ "LogicalJoin(condition=[OR(null:NULL, "
+ "AND(=($7, 1), =($7, 2), =($8, $7)))], joinType=[inner])\n"
+ "LogicalJoin(condition=[AND(=($7, 1), =($7, 2), =($8, $7))], joinType=[inner])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n"
+ " LogicalTableScan(table=[[scott, DEPT]])\n";
assertThat(f.apply(createBuilder()), hasTree(expected));
Expand Down Expand Up @@ -4512,13 +4510,12 @@ private static RelBuilder assertSize(RelBuilder b,
Function<RelBuilder, RelNode> f = b ->
b.scan("EMP")
.filter(
b.or(b.literal(null),
b.and(b.equals(b.field(2), b.literal(1)),
b.equals(b.field(2), b.literal(2)))))
b.and(b.equals(b.field(2), b.literal(1)),
b.equals(b.field(2), b.literal(2))))
.build();
final String expected = "LogicalValues(tuples=[[]])\n";
final String expectedWithoutSimplify = ""
+ "LogicalFilter(condition=[OR(null:NULL, AND(=($2, 1), =($2, 2)))])\n"
+ "LogicalFilter(condition=[AND(=($2, 1), =($2, 2))])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
assertThat(f.apply(createBuilder()), hasTree(expected));
assertThat(f.apply(createBuilder(c -> c.withSimplify(false))),
Expand Down
68 changes: 34 additions & 34 deletions piglet/src/test/java/org/apache/calcite/test/PigRelExTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -145,76 +145,76 @@ public void testMatch() {
}

@Test void testAdd() {
checkTranslation("b + 3", inTree("+($1, 3)"));
checkTranslation("(boolean)(b + 3)", inTree("+($1, 3)"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the meaning of this expression?

}

@Test void testSubtract() {
checkTranslation("b - 3", inTree("-($1, 3)"));
checkTranslation("(boolean)(b - 3)", inTree("-($1, 3)"));
}

@Test void testMultiply() {
checkTranslation("b * 3", inTree("*($1, 3)"));
checkTranslation("(boolean)(b * 3)", inTree("*($1, 3)"));
}

@Test void testMod() {
checkTranslation("b % 3", inTree("MOD($1, 3)"));
checkTranslation("(boolean)(b % 3)", inTree("MOD($1, 3)"));
}

@Test void testDivide() {
checkTranslation("b / 3", inTree("/($1, 3)"));
checkTranslation("c / 3.1", inTree("/($2, 3.1E0:DOUBLE)"));
checkTranslation("(boolean)(b / 3)", inTree("/($1, 3)"));
checkTranslation("(boolean)(c / 3.1)", inTree("/($2, 3.1E0:DOUBLE)"));
}

@Test void testBinCond() {
checkTranslation("(b == 1 ? 2 : 3)", inTree("CASE(=($1, 1), 2, 3)"));
checkTranslation("(boolean)(b == 1 ? 2 : 3)", inTree("CASE(=($1, 1), 2, 3)"));
}

@Test void testTupleDereference() {
checkTranslation("k2.k21", inTree("[$11.k21]"));
checkTranslation("k2.(k21, k22)", inTree("[ROW($11.k21, $11.k22)]"));
checkTranslation("k2.k22.(k221,k222)",
inTree("[ROW($11.k22.k221, $11.k22.k222)]"));
checkTranslation("(boolean)k2.k21", inTree("$11.k21"));
checkTranslation("(boolean)k2.(k21, k22)", inTree("ROW($11.k21, $11.k22)"));
checkTranslation("(boolean)k2.k22.(k221,k222)",
inTree("ROW($11.k22.k221, $11.k22.k222)"));
}

@Test void testBagDereference() {
checkTranslation("l2.l22", inTree("[MULTISET_PROJECTION($13, 1)]"));
checkTranslation("l2.(l21, l22)", inTree("[MULTISET_PROJECTION($13, 0, 1)]"));
checkTranslation("(boolean)l2.l22", inTree("MULTISET_PROJECTION($13, 1)"));
checkTranslation("(boolean)l2.(l21, l22)", inTree("MULTISET_PROJECTION($13, 0, 1)"));
}

@Test void testMapLookup() {
checkTranslation("m2#'testKey'", inTree("ITEM($15, 'testKey')"));
checkTranslation("(boolean)(m2#'testKey')", inTree("ITEM($15, 'testKey')"));
}

@Test void testCast() {
checkTranslation("(int) b", inTree("CAST($1):INTEGER"));
checkTranslation("(long) a", inTree("CAST($0):BIGINT"));
checkTranslation("(float) b", inTree("CAST($1):REAL"));
checkTranslation("(double) b", inTree("CAST($1):DOUBLE"));
checkTranslation("(chararray) b", inTree("CAST($1):VARCHAR"));
checkTranslation("(bytearray) b", inTree("CAST($1):BINARY"));
checkTranslation("(boolean) c", inTree("CAST($2):BOOLEAN"));
checkTranslation("(biginteger) b", inTree("CAST($1):DECIMAL(19, 0)"));
checkTranslation("(bigdecimal) b", inTree("CAST($1):DECIMAL(19, 0)"));
checkTranslation("(tuple()) b", inTree("CAST($1):(DynamicRecordRow[])"));
checkTranslation("(tuple(int, float)) b",
checkTranslation("(boolean)(int) b", inTree("CAST($1):INTEGER"));
checkTranslation("(boolean)(long) a", inTree("CAST($0):BIGINT"));
checkTranslation("(boolean)(float) b", inTree("CAST($1):REAL"));
checkTranslation("(boolean)(double) b", inTree("CAST($1):DOUBLE"));
checkTranslation("(boolean)(chararray) b", inTree("CAST($1):VARCHAR"));
checkTranslation("(boolean)(bytearray) b", inTree("CAST($1):BINARY"));
checkTranslation("(boolean)(boolean) c", inTree("CAST($2):BOOLEAN"));
checkTranslation("(boolean)(biginteger) b", inTree("CAST($1):DECIMAL(19, 0)"));
checkTranslation("(boolean)(bigdecimal) b", inTree("CAST($1):DECIMAL(19, 0)"));
checkTranslation("(boolean)(tuple()) b", inTree("CAST($1):(DynamicRecordRow[])"));
checkTranslation("(boolean)(tuple(int, float)) b",
inTree("CAST($1):RecordType(INTEGER $0, REAL $1)"));
checkTranslation("(bag{}) b",
checkTranslation("(boolean)(bag{}) b",
inTree("CAST($1):(DynamicRecordRow[]) NOT NULL MULTISET"));
checkTranslation("(bag{tuple(int)}) b",
checkTranslation("(boolean)(bag{tuple(int)}) b",
inTree("CAST($1):RecordType(INTEGER $0) MULTISET"));
checkTranslation("(bag{tuple(int, float)}) b",
checkTranslation("(boolean)(bag{tuple(int, float)}) b",
inTree("CAST($1):RecordType(INTEGER $0, REAL $1) MULTISET"));
checkTranslation("(map[]) b",
checkTranslation("(boolean)(map[]) b",
inTree("CAST($1):(VARCHAR NOT NULL, BINARY(1) NOT NULL) MAP"));
checkTranslation("(map[int]) b", inTree("CAST($1):(VARCHAR NOT NULL, INTEGER"));
checkTranslation("(map[tuple(int, float)]) b",
checkTranslation("(boolean)(map[int]) b", inTree("CAST($1):(VARCHAR NOT NULL, INTEGER"));
checkTranslation("(boolean)(map[tuple(int, float)]) b",
inTree("CAST($1):(VARCHAR NOT NULL, RecordType(INTEGER val_0, REAL val_1)) MAP"));
}

@Test void testPigBuiltinFunctions() {
checkTranslation("ABS(-5)", inTree("ABS(-5)"));
checkTranslation("AddDuration(h, 'P1D')",
checkTranslation("(boolean)ABS(-5)", inTree("ABS(-5)"));
checkTranslation("(boolean)AddDuration(h, 'P1D')",
inTree("AddDuration(PIG_TUPLE($7, 'P1D'))"));
checkTranslation("CEIL(1.2)", inTree("CEIL(1.2E0:DOUBLE)"));
checkTranslation("(boolean)CEIL(1.2)", inTree("CEIL(1.2E0:DOUBLE)"));
}
}
Loading