This specification adds aggregation functionality to the Open Data Protocol (OData) without changing any of the base principles of OData. It defines semantics and a representation for aggregation of data, especially:
- Semantics and operations for querying aggregated data,
- Results format for queries containing aggregated data,
- Vocabulary terms to annotate what can be aggregated, and how.
This specification defines the following terms:
- [Aggregatable Expression]{id=AggregatableExpression} – an expression not involving term casts and resulting in a value of a complex or entity or an aggregatable primitive type
- [Aggregate Expression]{id=AggregateExpression} – argument of the
aggregate
transformation or function defined in [section ##AggregationAlgorithm] - [Aggregatable Primitive Type]{id=AggregatablePrimitiveType} – a primitive type other than
Edm.Stream
or subtypes ofEdm.Geography
orEdm.Geometry
- [Data Aggregation Path]{id=DataAggregationPath} – a path that consists of one or more segments joined together by forward slashes (
/
). Segments are names of declared or dynamic structural or navigation properties, or type-cast segments consisting of the (optionally qualified) name of a structured type that is derived from the type identified by the preceding path segment to reach properties declared by the derived type. - [Expression]{id=Expression} – derived from the
commonExpr
rule (see OData-ABNF) - [Single-Valued Property Path]{id=SingleValuedPropertyPath} – property path ending in a single-valued primitive, complex, or navigation property
The following non-exhaustive list contains variable names that are used throughout this document:
-
$A,B,C$ – collections of instances -
$H$ – hierarchical collection -
$H'$ – subset of nodes from a hierarchical collection -
$u,v,w$ – instances in a collection -
$x$ – an instance in a hierarchical collection, called a node -
$p,q,r$ – paths -
$S,T$ – transformation sequences -
$α$ – aggregate expression, defined in [section ##AggregationAlgorithm] -
$\Gamma(A,p)$ – the collection that results from evaluating a data aggregation path$p$ relative to a collection$A$ , defined in [section ##EvaluationofDataAggregationPaths] -
$γ(u,p)$ – the collection that results from evaluating a data aggregation path$p$ relative to an instance$u$ , defined in [section ##EvaluationofDataAggregationPaths] -
$\Pi_G(s)$ – a transformation of a collection that injects grouping properties into every instance of the collection, defined in [section ##SimpleGrouping] -
$σ(x)$ – instance containing a grouping property that represents a node$x$ , defined in [section ##Transformationtraverse]
Keywords defined by this specification use this monospaced font
.
Some sections of this specification are illustrated with non-normative examples.
::: example Example ##ex: text describing an example uses this paragraph style
Non-normative examples use this paragraph style.
:::
All examples in this document are non-normative and informative only. Examples labeled with ⚠ contain advanced concepts or make use of keywords that are defined only later in the text, they can be skipped at first reading.
All other text is normative unless otherwise labeled.
Open Data Protocol (OData) services expose a data model that describes the schema of the service in terms of the Entity Data Model (EDM, see OData-CSDL) and then allows for querying data in terms of this model. The responses returned by an OData service are based on that data model and retain the relationships between the entities in the model.
Extending the OData query features with simple aggregation capabilities avoids cluttering OData services with an exponential number of explicitly modeled "aggregation level entities" or else restricting the consumer to a small subset of predefined aggregations.
Adding the notion of aggregation to OData without changing any of the base principles in OData has two aspects:
- Means for the consumer to query aggregated data on top of any given data model (for sufficiently capable data providers)
- Means for the provider to annotate what data can be aggregated, and in which way, allowing consumers to avoid asking questions that the provider cannot answer
Implementing any of these two aspects is valuable in itself independent of the other, and implementing both provides additional value for consumers. The provided aggregation annotations help a consumer understand more of the data structure looking at the service's exposed data model. The query extensions allow the consumers to explicitly express the desired aggregation behavior for a particular query. They also allow consumers to formulate queries that utilize the aggregation annotations.
::: example Example ##ex: The following diagram depicts a simple model that is used throughout this document.
<style type="text/css"> .st1 {fill:#f2f2f2;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75} .st2 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.7} .st3 {fill:#000000;font-size:0.666664em} .st4 {font-size:1em} .st5 {stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75} .st6 {fill:none;visibility:hidden} .st7 {stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:1.4} .st8 {fill:#000000;font-size:0.833336em} .st9 {fill:#000000;font-size:0.833336em;font-style:italic} .st10 {fill:none} .st11 {marker-start:url(#mrkr1-125);stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75;visibility:hidden} .st12 {fill:#000000;fill-opacity:1;stroke:#000000;stroke-opacity:1;stroke-width:0.22935779816514} .st13 {fill:#000000;font-size:0.75em} .st14 {fill:#000000;fill-opacity:1;stroke:#000000;stroke-opacity:1;stroke-width:0.22222222222222} .st15 {stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.72} .st16 {marker-end:url(#mrkr1-200);stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.72} .st17 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75} .st18 {fill:#000000;font-size:0.75em} .st19 {marker-end:url(#mrkr14-311);stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75} .st20 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3} </style> ID: Edm.String {id} Amount: Edm.Decimal Sale Date: Edm.Date {id} Month: Edm.String Quarter: Edm.String Year: Edm.Int16 Time ID: Edm.String {id} Name: Edm.String Country: Edm.String Customer ID: Edm.String {id} Name: Edm.String Category ID: Edm.String {id} Name: Edm.String Color: Edm.String TaxRate: Edm.Decimal Product ID: Edm.String {id} Name: Edm.String SalesOrganization 1 * 1 * * 1 0..1 * 1 * 1 * Sales Sales Customer Product Products Category Time SalesOrganization Superordinate Rating: Edm.Byte FoodProduct RatingClass: Edm.String NonFoodProductThe Amount
property in the Sale
entity type is an aggregatable property, and the properties of the related entity types are groupable. These can be arranged in hierarchies, for example:
- Product hierarchy based on groupable properties of the
Category
andProduct
entity types - Customer hierarchy based on
Country
andCustomer
- Time hierarchy based on
Year
,Month
, andDate
- SalesOrganization hierarchy based on the recursive association to itself
In the context of Online Analytical Processing (OLAP), this model might be described in terms of a Sales "cube" with an Amount "measure" and three "dimensions". This document will avoid such terms, as they are heavily overloaded. :::
Query extensions and descriptive annotations can be applied to normalized schemas as well as partly or fully denormalized schemas.
::: example Example ##ex: The following diagram depicts a denormalized schema for the simple model.
Sale | |
---|---|
Sales | ID: Edm.String {id} |
Amount: Edm.Decimal | |
Category | CategoryID: Edm.String |
CategoryName: Edm.String | |
Product | ProductID: Edm.String |
ProductName: Edm.String | |
ProductColor: Edm.String | |
ProductTaxRate: Edm.Decimal | |
Food | FoodProductRating: Edm.Byte |
Non-Food | NonFoodProductRatingClass: Edm.String |
Sales Organization | SalesOrganizationID: Edm.String |
SalesOrganizationName: Edm.String | |
SalesOrganizationSuperordinateID: Edm.String | |
Time | TimeDate: Edm.Date |
TimeMonth: Edm.String | |
TimeQuarter: Edm.String | |
TimeYear: Edm.Int16 | |
Customer | CustomerID: Edm.String |
CustomerName: Edm.String | |
CustomerCountry: Edm.String |
::: example Example ##ex: The following entity sets and sample data will be used to further illustrate the capabilities introduced by this extension.
:::: {.example-data style=width:600px;height:700px}
::::: {.nav-2 style=left:250px} Products
ID | Category | Name | Color | TaxRate |
---|---|---|---|---|
P1 | PG1 | Sugar | White | 0.06 |
P2 | PG1 | Coffee | Brown | 0.06 |
P3 | PG2 | Paper | White | 0.14 |
P4 | PG2 | Pencil | Black | 0.14 |
::::: |
::::: {style=left:510px} Food
Rating |
---|
5 |
n/a |
n/a |
::::: |
::::: {style=left:570px} Non-Food
RatingClass |
---|
n/a |
n/a |
average |
::::: |
::::: {style=top:150px} Time
Date | Month | Quarter | Year |
---|---|---|---|
2022-01-01 | 2022-01 | 2022-1 | 2022 |
2022-04-01 | 2022-04 | 2022-2 | 2022 |
2022-04-10 | 2022-04 | 2022-2 | 2022 |
… | |||
::::: |
::::: {style=top:150px;left:360px} Categories
ID | Name |
---|---|
PG1 | Food |
PG2 | Non-Food |
::::: |
::::: {.nav-2 style=top:260px;left:360px} Sales Organizations
ID | Superordinate | Name |
---|---|---|
Sales | Corporate Sales | |
US | Sales | US |
US West | US | US West |
US East | US | US East |
EMEA | Sales | EMEA |
EMEA Central | EMEA | EMEA Central |
::::: |
::::: {style=top:300px} Customers
ID | Name | Country |
---|---|---|
C1 | Joe | USA |
C2 | Sue | USA |
C3 | Sue | Netherlands |
C4 | Luc | France |
::::: |
::::: {.nav-2 .nav-3 .nav-4 .nav-5 style=top:450px} Sales
ID | Customer | Time | Product | Sales Organization | Amount |
---|---|---|---|---|---|
1 | C1 | 2022-01-03 | P3 | US West | 1 |
2 | C1 | 2022-04-10 | P1 | US West | 2 |
3 | C1 | 2022-08-07 | P2 | US West | 4 |
4 | C2 | 2022-01-03 | P2 | US East | 8 |
5 | C2 | 2022-11-09 | P3 | US East | 4 |
6 | C3 | 2022-04-01 | P1 | EMEA Central | 2 |
7 | C3 | 2022-08-06 | P3 | EMEA Central | 1 |
8 | C3 | 2022-11-22 | P3 | EMEA Central | 2 |
::::: |
::::: {.legend style=top:470px;left:500px} Legend
Property |
---|
Key |
Navigation Property |
::::: |
:::: |
::: |
::: example Example ##ex: In the example model, one prominent use case is the relation of customers to products. The first question that is likely to be asked is: "Which customers bought which products?"
This leads to the second more quantitative question: "Who bought how much of what?"
The answer to the second question typically is visualized as a cross-table:
:::: cross
Food | Non-Food | |||||
Sugar | Coffee | Paper | ||||
USA | 14 | 2 | 12 | 5 | 5 | |
Joe | 6 | 2 | 4 | 1 | 1 | |
Sue | 8 | 8 | 4 | 4 | ||
Netherlands | 2 | 2 | 3 | 3 | ||
Sue | 2 | 2 | 3 | 3 | ||
:::: |
The data in this cross-table can be written down in a shape that more closely resembles the structure of the data model, leaving cells empty that have been aggregated away:
Customer/Country | Customer/Name | Product/Category/Name | Product/Name | Amount |
---|---|---|---|---|
USA | Joe | Non-Food | Paper | 1 |
USA | Joe | Food | Sugar | 2 |
USA | Joe | Food | Coffee | 4 |
USA | Sue | Food | Coffee | 8 |
USA | Sue | Non-Food | Paper | 4 |
Netherlands | Sue | Food | Sugar | 2 |
Netherlands | Sue | Non-Food | Paper | 3 |
USA | Food | Sugar | 2 | |
USA | Food | Coffee | 12 | |
USA | Non-Food | Paper | 5 | |
Netherlands | Food | Sugar | 2 | |
Netherlands | Non-Food | Paper | 3 | |
USA | Joe | Food | 6 | |
USA | Joe | Non-Food | 1 | |
USA | Sue | Food | 8 | |
USA | Sue | Non-Food | 4 | |
Netherlands | Sue | Food | 2 | |
Netherlands | Sue | Non-Food | 3 | |
USA | Food | 14 | ||
USA | Non-Food | 5 | ||
Netherlands | Food | 2 | ||
Netherlands | Non-Food | 3 |
Note that this result contains seven fully qualified aggregate values, followed by fifteen rollup rows with subtotal values. :::