CESQL v1 review changes #1286

Merged
merged 11 commits into from
May 30, 2024
37 changes: 21 additions & 16 deletions cesql/spec.md
@@ -41,11 +41,11 @@ CloudEvent instances.
CloudEvents SQL expressions (also known as CESQL) allow computing values and matching of CloudEvent attributes against complex expressions
that lean on the syntax of Structured Query Language (SQL) `WHERE` clauses. Using SQL-derived expressions for message
filtering has widespread implementation usage because the Java Message Service (JMS) message selector syntax also leans
on SQL. Note that neither the SQL standard (ISO 9075) nor the JMS standard nor any other SQL dialect is used as a
on SQL. Note that neither the [SQL standard (ISO 9075)][iso-9075] nor the JMS standard nor any other SQL dialect is used as a
normative foundation or to constrain the expression syntax defined in this specification, but the syntax is informed by
them.

CESQL is a _[Total pure functional programming language][total-programming-language-wiki]_ in order to guarantee the
CESQL is a _[total pure functional programming language][total-programming-language-wiki]_ in order to guarantee the
termination of the evaluation of the expression. It features a type system correlated to the [CloudEvents type
system][ce-type-system], and it features boolean and arithmetic operations, as well as built-in functions for string
manipulation.
@@ -56,7 +56,10 @@ producer, or in an intermediary, and it can be implemented using any technology
The CloudEvents Expression Language assumes the input always includes, but is not limited to, a single valid and
type-checked CloudEvent instance. An expression MUST NOT mutate the value of the input CloudEvent instance, nor any of
the other input values. The evaluation of an expression observes the concept of [referential
transparency][referential-transparency-wiki]. The output of a CESQL expression evaluation is always a _boolean_, an _integer_ or a _string_, and it might include an error.
transparency][referential-transparency-wiki]. The primary output of a CESQL expression evaluation is always a _boolean_, an _integer_ or a _string_.
The secondary output of a CESQL expression evaluation is a set of errors which occurred during evaluation. This set MAY be empty, indicating that no
error occurred during execution of the expression. The value used by CESQL engines to represent an empty set of errors is out of the scope of this
Collaborator

Isn't the entire representation of, or even access to, errors out of scope of the spec, not just the empty set of errors?

Contributor Author

Yeah that's a good point - I was trying to not specify that as I wrote the error stuff (apart from specifying the types of errors you could get). Let me clarify that!

specification.
Collaborator

I like this better. Do you think we need to add something like:

Generation of an error does not halt processing. This specification does not mandate how these errors are exposed to a user.

? Is the first sentence true?

Contributor Author

I'm not 100% sure here, if you look at section 4.1 it states that:

When evaluating an expression, the evaluator can operate in two modes, in relation to error handling:

  • Fail fast mode: When an error is triggered, the evaluation is interrupted and returns the error, without any result.
  • Complete evaluation mode: When an error is triggered, the evaluation is continued, and the evaluation of the expression returns both the result and the error(s).

But, as far as I can tell the SDKs implement the fail fast mode instead of the complete evaluation mode.

Collaborator

w/o any result... is that true even in the SDKs and Knative? I was assuming the result was 'false'.

I'm having a memory of us discussing what happens when a nested expression errors when that expression is part of an AND - doesn't it result in a false for the nested expression and then false again due to that side of the AND being false? Or am I thinking about some other situation?

Contributor Author

In terms of w/o any result, I think the problem (from a spec standpoint at least) is what type should the return value have? For example, if my query is LEFT(missingattribute, 10), then should I get (false, missing attribute error) as my return value? Does it make more sense to return ("", missing attribute error)?

But, similarly if my expression is LEFT(missingattribute, 10) LIKE prefix%, then having a return value of (false, missing attribute error) would be what I would expect while a return value of ("", missing attribute error) would not be what I expect.

Perhaps it makes sense to say something along the lines of "Fail fast mode: When an error is triggered, the evaluation is interrupted and returns the zero value for the return type of the root operation, along with the error". WDYT @duglin ?


The CloudEvents Expression Language doesn't support the handling of the data field of the CloudEvent instances, due to
its polymorphic nature and complexity. Users that need this functionality ought to use other more appropriate tools.
@@ -70,7 +73,8 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "S

The CESQL can be used as a [filter dialect][subscriptions-filter-dialect] to filter on the input values.

When used as a filter predicate, the expression output value is always cast to a boolean value.
When used as a filter predicate, the expression output value MUST be a _Boolean_. If the output value is not a _Boolean_ or any errors are returned,
the event MUST NOT pass the filter.
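
As a rough illustration of the filter rule above, the following Python sketch models an evaluation result as a pair of a value and a list of errors. The `passes_filter` name and the pair representation are assumptions made for this example only; the specification does not mandate how results or errors are represented.

```python
from typing import Any, List, Tuple

# Hypothetical representation: (primary output, errors raised during evaluation).
EvalResult = Tuple[Any, List[str]]


def passes_filter(result: EvalResult) -> bool:
    """Return True only when the output is the Boolean true and no errors occurred."""
    value, errors = result
    if errors:
        # Any error means the event does not pass the filter.
        return False
    # Reject non-Boolean outputs instead of casting them.
    # In Python, bool is a subclass of int, so the check is on bool explicitly.
    if not isinstance(value, bool):
        return False
    return value
```

Under these assumptions, an integer or string output such as `1` or `"true"` does not pass, even though a cast to Boolean might have succeeded.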

<!-- TODO -->

@@ -95,37 +99,37 @@ INT(hop) < INT(ttl) AND INT(hop) < 1000
The root of the expression is the `expression` rule:

```ebnf
expression ::= value-identifier | boolean-literal | unary-operation | binary-operation | function-invocation | like-operation | exists-operation | in-operation | ( "(" expression ")" )
expression ::= value-identifier | literal | unary-operation | binary-operation | function-invocation | like-operation | exists-operation | in-operation | ( "(" expression ")" )
```

Nested expressions MUST be correctly parenthesized.

### 2.2. Value identifiers and literals

Value identifiers in CESQL MUST follow the same restrictions of the [Attribute Naming
Convention][ce-attribute-naming-convention] from the CloudEvents spec. A value identifier MUST NOT be greater than 20
Convention][ce-attribute-naming-convention] from the CloudEvents spec. A value identifier SHOULD NOT be greater than 20
characters in length.

```ebnf
lowercase-char ::= [a-z]
value-identifier ::= ( lowercase-char | digit ) ( lowercase-char | digit )*
value-identifier ::= ( lowercase-char | digit )+
```

CESQL defines 3 different literal kinds: integer numbers, `true` or `false` booleans, and `''` or `""` delimited strings.
CESQL defines 3 different literal kinds: integer numbers, `true` or `false` booleans, and `''` or `""` delimited strings. Integer literals MUST be valid 32 bit signed integer values.

```ebnf
digit ::= [0-9]
number-literal ::= digit+
integer-literal ::= ( '+' | '-' ) digit+

boolean-literal ::= "true" | "false" (* Case insensitive *)

string-literal ::= ( "'" ( [^'] | "\'" )* "'" ) | ( '"' ( [^"] | '\"' )* '"')

literal ::= number-literal | boolean-literal | string-literal
literal ::= integer-literal | boolean-literal | string-literal
```

Because string literals can be either `''` or `""` delimited, in the former case, the `'` has to be escaped, while in
the latter the `"` has to be escaped.
Because string literals can be either `''` or `""` delimited, in the former case, the `'` character has to be escaped if it is to be used in the string literal, while in
the latter the `"` has to be escaped if it is to be used in the string literal.
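
To make the literal rules concrete, here is a minimal Python sketch that parses a single literal into a Python value. The `parse_literal` function and the regular expressions are illustrative assumptions, not part of the specification. In particular, the sketch treats the sign of an integer literal as optional, which is an assumption about the intent of the grammar.

```python
import re

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

# Assumption: the sign is treated as optional in this sketch.
_INTEGER = re.compile(r"[+-]?[0-9]+")
# A quoted string may contain the escaped form of its own delimiter.
_SINGLE = re.compile(r"'((?:\\'|[^'])*)'")
_DOUBLE = re.compile(r'"((?:\\"|[^"])*)"')


def parse_literal(text: str):
    """Parse one CESQL literal into a Python bool, int, or str (illustrative only)."""
    if text.lower() in ("true", "false"):  # boolean literals are case insensitive
        return text.lower() == "true"
    if _INTEGER.fullmatch(text):
        value = int(text)
        if not INT32_MIN <= value <= INT32_MAX:
            raise ValueError(f"integer literal out of 32-bit range: {text}")
        return value
    match = _SINGLE.fullmatch(text)
    if match:
        # Only the single quote needs unescaping inside '' delimiters.
        return match.group(1).replace("\\'", "'")
    match = _DOUBLE.fullmatch(text)
    if match:
        # Only the double quote needs unescaping inside "" delimiters.
        return match.group(1).replace('\\"', '"')
    raise ValueError(f"not a literal: {text}")
```

Note that a `"` inside a `''` delimited string, and a `'` inside a `""` delimited string, need no escaping in this sketch.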

### 2.3. Operators

@@ -212,7 +216,8 @@ runtime errors.

### 3.4. Operators

The following tables show the operators that MUST be supported by a CESQL evaluator.
The following tables show the operators that MUST be supported by a CESQL evaluator. When evaluating an operator,
a CESQL engine MUST attempt to cast the operands to the correct types. If this type casting fails, a _Cast_ error will be returned, along with the zero value for the return type of the expression.
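
The casting rule above can be sketched as follows for the `<` operator, assuming a result is modeled as a pair of a value and a list of errors. The `cast_to_int` and `less_than` helpers, the error strings, and the restriction to integer and string operands are assumptions made for this example; they are not defined by the specification.

```python
import re
from typing import List, Tuple, Union

_INT_PATTERN = re.compile(r"[+-]?[0-9]+")


def cast_to_int(value: Union[int, str]) -> Tuple[int, List[str]]:
    """Cast an operand to Integer, or return the zero value 0 plus a cast error."""
    if isinstance(value, int):
        return value, []
    if _INT_PATTERN.fullmatch(value):
        return int(value), []
    return 0, [f"cast error: cannot cast {value!r} to Integer"]


def less_than(left: Union[int, str], right: Union[int, str]) -> Tuple[bool, List[str]]:
    """Evaluate `left < right`, casting both operands to Integer first."""
    left_int, left_errors = cast_to_int(left)
    right_int, right_errors = cast_to_int(right)
    errors = left_errors + right_errors
    if errors:
        # The comparison returns a Boolean, so its zero value is False.
        return False, errors
    return left_int < right_int, []
```
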

All the operators in the following tables are listed in precedence order.

@@ -341,12 +346,11 @@ The following tables show the built-in functions that MUST be supported by a CES

| Definition | Semantics |
| ---------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `ABS(x): Integer -> Integer` | Returns the absolute value of `x` |
| `ABS(x): Integer -> Integer` | Returns the absolute value of `x`. If the value of `x` is `-2147483648` (the most negative 32 bit integer value possible), then this returns `2147483647` as well as a math error. |
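
The `ABS` edge case described in the new row can be sketched in Python as below. Python integers are unbounded, so the 32-bit limits are modeled explicitly; the `cesql_abs` name, the pair return value, and the error string are assumptions made for illustration.

```python
from typing import List, Tuple

INT32_MIN = -(2**31)   # -2147483648
INT32_MAX = 2**31 - 1  # 2147483647


def cesql_abs(x: int) -> Tuple[int, List[str]]:
    """Return (|x|, errors), clamping the one input whose absolute value overflows."""
    if x == INT32_MIN:
        # 2147483648 is not representable as a 32-bit signed integer.
        return INT32_MAX, ["math error: ABS(-2147483648) overflows a 32-bit integer"]
    return abs(x), []
```
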

### 3.6. Evaluation of the expression

Operators MUST be evaluated in order, where the parenthesized expressions have the highest priority over all the other
operators.
Operators MUST be evaluated in left to right order, where all operators within a parenthesized expression MUST be evaluated before continuing to the right of the parenthesized expression.

AND and OR operations MUST be short-circuit evaluated. When the left operand of the AND operation evaluates to `false`, the right operand MUST NOT be evaluated. Similarly, when the
left operand of the OR operation evaluates to `true`, the right operand MUST NOT be evaluated.
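
A minimal Python sketch of the short-circuit rule, assuming each operand is passed as a zero-argument callable so that evaluation can be deferred. The `cesql_and` and `cesql_or` names are hypothetical and error handling is left out.

```python
from typing import Callable


def cesql_and(left: Callable[[], bool], right: Callable[[], bool]) -> bool:
    """Evaluate AND; the right operand is not evaluated when the left is false."""
    if not left():
        return False
    return right()


def cesql_or(left: Callable[[], bool], right: Callable[[], bool]) -> bool:
    """Evaluate OR; the right operand is not evaluated when the left is true."""
    if left():
        return True
    return right()
```

Because the right operand is never called in the short-circuit case, any error it would have raised is never triggered.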
@@ -470,3 +474,4 @@ hop < ttl
[subscriptions-filter-dialect]: ../subscriptions/spec.md#3241-filter-dialects
[ebnf-xml-spec]: https://www.w3.org/TR/REC-xml/#sec-notation
[modulo-operation-wiki]: https://en.wikipedia.org/wiki/Modulo_operation
[iso-9075]: https://en.wikipedia.org/wiki/ISO/IEC_9075