Merge pull request #5 from OpenLineage/add-missing-release-notes
Add missing release notes.
JDarDagran authored Jan 17, 2025
2 parents 8b3cdaf + 89e7333 commit 41eedcc
Showing 13 changed files with 402 additions and 0 deletions.
24 changes: 24 additions & 0 deletions versioned_docs/version-1.21.1/releases/1_21_1.md
@@ -0,0 +1,24 @@
---
title: 1.21.1
sidebar_position: 9936
---

# 1.21.1 - 2024-08-29

### Added
* **Spec: add GCP Dataproc facet** [`#2987`](https://github.com/OpenLineage/OpenLineage/pull/2987) [@tnazarew](https://github.com/tnazarew)
*Registers the Google Cloud Platform Dataproc run facet.*

### Fixed
* **Airflow: update SQL integration code to work with latest sqlparser-rs main** [`#2983`](https://github.com/OpenLineage/OpenLineage/pull/2983) [@kacpermuda](https://github.com/kacpermuda)
*Adjusts the SQL integration after our sqlparser-rs fork has been updated to the latest main.*
* **Spark: fix AWS Glue jobs naming for SQL events** [`#3001`](https://github.com/OpenLineage/OpenLineage/pull/3001) [@arturowczarek](https://github.com/arturowczarek)
*SQL events now properly use the names of the jobs retrieved from AWS Glue.*
* **Spark: fix issue with column lineage when using delta merge into command** [`#2986`](https://github.com/OpenLineage/OpenLineage/pull/2986) [@Imbruced](https://github.com/Imbruced)
*A view instance of a node is now included when gathering data sources for input columns.*
* **Spark: minor Spark filters refactor** [`#2990`](https://github.com/OpenLineage/OpenLineage/pull/2990) [@arturowczarek](https://github.com/arturowczarek)
*Fixes a number of minor issues.*
* **Spark: Iceberg tables in AWS Glue have slashes instead of dots in symlinks** [`#2984`](https://github.com/OpenLineage/OpenLineage/pull/2984) [@arturowczarek](https://github.com/arturowczarek)
*Symlinks for these tables now use slashes rather than dots, along with the prefix `table/`.*
* **Spark: lineage for Iceberg datasets that are present outside of Spark's catalog is now present** [`#2937`](https://github.com/OpenLineage/OpenLineage/pull/2937) [@d-m-h](https://github.com/d-m-h)
*Previously, reading Iceberg datasets outside the configured Spark catalog prevented the datasets from being present in the `inputs` property of the `RunEvent`.*
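
The Iceberg symlink naming fix above can be pictured with a minimal sketch. This is not the Spark integration's actual code: the function name `glue_iceberg_symlink_name` and the sample identifier are hypothetical, and only the slash separator and the `table/` prefix come from the note itself.

```python
def glue_iceberg_symlink_name(identifier: str) -> str:
    """Build a symlink name from a dot-separated table identifier.

    Hypothetical helper for illustration only: dots become slashes and
    the result is prefixed with "table/", as the release note describes.
    """
    return "table/" + identifier.replace(".", "/")


if __name__ == "__main__":
    # "sales.orders" is a made-up database.table identifier.
    print(glue_iceberg_symlink_name("sales.orders"))
```

The real integration may handle catalogs, namespaces, or escaping differently; see PR #2984 for the actual behavior.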
24 changes: 24 additions & 0 deletions versioned_docs/version-1.22.0/releases/1_21_1.md
@@ -0,0 +1,24 @@
---
title: 1.21.1
sidebar_position: 9936
---

# 1.21.1 - 2024-08-29

### Added
* **Spec: add GCP Dataproc facet** [`#2987`](https://github.com/OpenLineage/OpenLineage/pull/2987) [@tnazarew](https://github.com/tnazarew)
*Registers the Google Cloud Platform Dataproc run facet.*

### Fixed
* **Airflow: update SQL integration code to work with latest sqlparser-rs main** [`#2983`](https://github.com/OpenLineage/OpenLineage/pull/2983) [@kacpermuda](https://github.com/kacpermuda)
*Adjusts the SQL integration after our sqlparser-rs fork has been updated to the latest main.*
* **Spark: fix AWS Glue jobs naming for SQL events** [`#3001`](https://github.com/OpenLineage/OpenLineage/pull/3001) [@arturowczarek](https://github.com/arturowczarek)
*SQL events now properly use the names of the jobs retrieved from AWS Glue.*
* **Spark: fix issue with column lineage when using delta merge into command** [`#2986`](https://github.com/OpenLineage/OpenLineage/pull/2986) [@Imbruced](https://github.com/Imbruced)
*A view instance of a node is now included when gathering data sources for input columns.*
* **Spark: minor Spark filters refactor** [`#2990`](https://github.com/OpenLineage/OpenLineage/pull/2990) [@arturowczarek](https://github.com/arturowczarek)
*Fixes a number of minor issues.*
* **Spark: Iceberg tables in AWS Glue have slashes instead of dots in symlinks** [`#2984`](https://github.com/OpenLineage/OpenLineage/pull/2984) [@arturowczarek](https://github.com/arturowczarek)
*Symlinks for these tables now use slashes rather than dots, along with the prefix `table/`.*
* **Spark: lineage for Iceberg datasets that are present outside of Spark's catalog is now present** [`#2937`](https://github.com/OpenLineage/OpenLineage/pull/2937) [@d-m-h](https://github.com/d-m-h)
*Previously, reading Iceberg datasets outside the configured Spark catalog prevented the datasets from being present in the `inputs` property of the `RunEvent`.*
24 changes: 24 additions & 0 deletions versioned_docs/version-1.22.0/releases/1_22_0.md
@@ -0,0 +1,24 @@
---
title: 1.22.0
sidebar_position: 9935
---

# 1.22.0 - 2024-09-05

### Added
* **SQL: add support for `USE` statement with different syntaxes** [`#2944`](https://github.com/OpenLineage/OpenLineage/pull/2944) [@kacpermuda](https://github.com/kacpermuda)
*Adjusts our Context so that it can use the new support for this statement in the parser and pass it to a number of queries.*
* **Spark: add script to build Spark dependencies** [`#3044`](https://github.com/OpenLineage/OpenLineage/pull/3044) [@arturowczarek](https://github.com/arturowczarek)
*Adds a script to rebuild dependencies automatically following releases.*
* **Website: versionable docs** [`#3007`](https://github.com/OpenLineage/OpenLineage/pull/3007) [`#3023`](https://github.com/OpenLineage/OpenLineage/pull/3023) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
*Adds a GitHub action that creates a new Docusaurus version on a tag push, verifiable using the openlineage-site repo. Implements a monorepo approach in a new `website` directory.*

### Fixed
* **SQL: add support for `SingleQuotedString` in `Identifier()`** [`#3035`](https://github.com/OpenLineage/OpenLineage/pull/3035) [@kacpermuda](https://github.com/kacpermuda)
*Single quoted strings were being treated differently than strings with no quotes, double quotes, or backticks.*
* **SQL: support `IDENTIFIER` function instead of treating it like table name** [`#2999`](https://github.com/OpenLineage/OpenLineage/pull/2999) [@kacpermuda](https://github.com/kacpermuda)
*Adds support for this identifier in SELECT, MERGE, UPDATE, and DELETE statements. For now, only static identifiers are supported. When a variable is used, this table is removed from lineage to avoid emitting incorrect lineage.*
* **Spark: fix issue with only one table in inputs from SQL query while reading from JDBC** [`#2918`](https://github.com/OpenLineage/OpenLineage/pull/2918) [@Imbruced](https://github.com/Imbruced)
*Events created did not contain the correct input table when the query contained multiple tables.*
* **Spark: fix AWS Glue jobs naming for RDD events** [`#3020`](https://github.com/OpenLineage/OpenLineage/pull/3020) [@arturowczarek](https://github.com/arturowczarek)
*The naming for RDD jobs now uses the same code as SQL and Application events.*
40 changes: 40 additions & 0 deletions versioned_docs/version-1.23.0/releases/1_23_0.md
@@ -0,0 +1,40 @@
---
title: 1.23.0
sidebar_position: 9934
---

# 1.23.0 - 2024-10-04

### Added
* **Java: added CompositeTransport** [`#3039`](https://github.com/OpenLineage/OpenLineage/pull/3039) [@JDarDagran](https://github.com/JDarDagran)
*This allows users to specify multiple targets to which OpenLineage events will be emitted.*
* **Spark extension interfaces: support table extended sources** [`#3062`](https://github.com/OpenLineage/OpenLineage/pull/3062) [@Imbruced](https://github.com/Imbruced)
*The interfaces can now extract lineage from the `Table` interface, not only from `RelationProvider`.*
* **Java: added GCP Dataplex transport** [`#3043`](https://github.com/OpenLineage/OpenLineage/pull/3043) [@ddebowczyk92](https://github.com/ddebowczyk92)
*Dataplex transport is now available as a separate Maven package for users that want to send OL events to GCP Dataplex.*
* **Java: added Google Cloud Storage transport** [`#3077`](https://github.com/OpenLineage/OpenLineage/pull/3077) [@ddebowczyk92](https://github.com/ddebowczyk92)
*GCS transport is now available as a separate Maven package for users that want to send OL events to Google Cloud Storage.*
* **Java: added S3 transport** [`#3129`](https://github.com/OpenLineage/OpenLineage/pull/3129) [@arturowczarek](https://github.com/arturowczarek)
*S3 transport is now available as a separate Maven package for users that want to send OL events to S3.*
* **Java: add option to configure client via environment variables** [`#3094`](https://github.com/OpenLineage/OpenLineage/pull/3094) [@JDarDagran](https://github.com/JDarDagran)
*Specified variables are now autotranslated to configuration values.*
* **Python: add option to configure client via environment variables** [`#3114`](https://github.com/OpenLineage/OpenLineage/pull/3114) [@JDarDagran](https://github.com/JDarDagran)
*Specified variables are now autotranslated to configuration values.*
* **Python: add option to add custom headers in HTTP transport** [`#3116`](https://github.com/OpenLineage/OpenLineage/pull/3116) [@JDarDagran](https://github.com/JDarDagran)
*Allows users to add custom headers, for example for authentication.*
* **Spec: add full dataset dependencies** [`#3097`](https://github.com/OpenLineage/OpenLineage/pull/3097) [`#3098`](https://github.com/OpenLineage/OpenLineage/pull/3098) [@arturowczarek](https://github.com/arturowczarek)
*Now, if `datasetLineageEnabled` is set and column-level lineage depends on the whole dataset, a dataset dependency is added instead of listing every column field in that dataset.*
* **Java: OpenLineageClient and Transports are now AutoCloseable** [`#3122`](https://github.com/OpenLineage/OpenLineage/pull/3122) [@ddebowczyk92](https://github.com/ddebowczyk92)
*This prevents a number of issues that might be caused by not closing underlying transports.*
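
The environment-variable options above say only that specified variables are translated into configuration values. As a rough sketch of how such a translation could work, the snippet below assumes a hypothetical `OPENLINEAGE__` prefix and double underscores as nesting separators. Neither detail is stated in these notes, and `env_to_config` is an illustrative name, not a client function.

```python
import os
from typing import Any, Dict, Mapping

# Assumed prefix, for illustration only; check the client docs for the real rules.
PREFIX = "OPENLINEAGE__"


def env_to_config(environ: Mapping[str, str], prefix: str = PREFIX) -> Dict[str, Any]:
    """Translate prefixed variables into a nested dict, splitting names on '__'."""
    config: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated variables
        keys = [part.lower() for part in name[len(prefix):].split("__") if part]
        if not keys:
            continue
        node = config
        for key in keys[:-1]:
            child = node.get(key)
            if not isinstance(child, dict):
                # Create the intermediate level (or replace a conflicting leaf).
                child = {}
                node[key] = child
            node = child
        node[keys[-1]] = value
    return config


if __name__ == "__main__":
    # Prints whatever prefixed variables are set in the current environment.
    print(env_to_config(os.environ))
```

With `OPENLINEAGE__TRANSPORT__TYPE=http` set, this sketch would yield `{"transport": {"type": "http"}}`. The actual Java and Python clients may apply different naming or type-conversion rules.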

### Fixed
* **Python Facet generator does not validate optional arguments** [`#3054`](https://github.com/OpenLineage/OpenLineage/pull/3054) [@JDarDagran](https://github.com/JDarDagran)
*Fixes an issue where `NominalTimeRunFacet` broke when `nominalEndTime` was `None`.*
* **SQL: report only actually used tables from CTEs, rather than all** [`#2962`](https://github.com/OpenLineage/OpenLineage/pull/2962) [@Imbruced](https://github.com/Imbruced)
*With this change, if a SQL statement defines a CTE but does not use it in the final query, lineage for that CTE is no longer falsely reported.*
* **Fluentd: Enhancing plugin's capabilities** [`#3068`](https://github.com/OpenLineage/OpenLineage/pull/3068) [@jonathanlbt1](https://github.com/jonathanlbt1)
*Improves the performance and documentation of the Fluentd proxy plugin.*
* **SQL: fix parser to point to origin table instead of CTEs** [`#3107`](https://github.com/OpenLineage/OpenLineage/pull/3107) [@Imbruced](https://github.com/Imbruced)
*For some complex CTEs, the parser emitted the CTE as a target table instead of the original table. This is now fixed.*
* **Spark: column lineage is correctly produced for merge into command** [`#3095`](https://github.com/OpenLineage/OpenLineage/pull/3095) [@Imbruced](https://github.com/Imbruced)
*OpenLineage now produces column-level lineage correctly when there is a view in the middle.*
40 changes: 40 additions & 0 deletions versioned_docs/version-1.24.2/releases/1_23_0.md
@@ -0,0 +1,40 @@
---
title: 1.23.0
sidebar_position: 9934
---

# 1.23.0 - 2024-10-04

### Added
* **Java: added CompositeTransport** [`#3039`](https://github.com/OpenLineage/OpenLineage/pull/3039) [@JDarDagran](https://github.com/JDarDagran)
*This allows users to specify multiple targets to which OpenLineage events will be emitted.*
* **Spark extension interfaces: support table extended sources** [`#3062`](https://github.com/OpenLineage/OpenLineage/pull/3062) [@Imbruced](https://github.com/Imbruced)
*The interfaces can now extract lineage from the `Table` interface, not only from `RelationProvider`.*
* **Java: added GCP Dataplex transport** [`#3043`](https://github.com/OpenLineage/OpenLineage/pull/3043) [@ddebowczyk92](https://github.com/ddebowczyk92)
*Dataplex transport is now available as a separate Maven package for users that want to send OL events to GCP Dataplex.*
* **Java: added Google Cloud Storage transport** [`#3077`](https://github.com/OpenLineage/OpenLineage/pull/3077) [@ddebowczyk92](https://github.com/ddebowczyk92)
*GCS transport is now available as a separate Maven package for users that want to send OL events to Google Cloud Storage.*
* **Java: added S3 transport** [`#3129`](https://github.com/OpenLineage/OpenLineage/pull/3129) [@arturowczarek](https://github.com/arturowczarek)
*S3 transport is now available as a separate Maven package for users that want to send OL events to S3.*
* **Java: add option to configure client via environment variables** [`#3094`](https://github.com/OpenLineage/OpenLineage/pull/3094) [@JDarDagran](https://github.com/JDarDagran)
*Specified variables are now autotranslated to configuration values.*
* **Python: add option to configure client via environment variables** [`#3114`](https://github.com/OpenLineage/OpenLineage/pull/3114) [@JDarDagran](https://github.com/JDarDagran)
*Specified variables are now autotranslated to configuration values.*
* **Python: add option to add custom headers in HTTP transport** [`#3116`](https://github.com/OpenLineage/OpenLineage/pull/3116) [@JDarDagran](https://github.com/JDarDagran)
*Allows users to add custom headers, for example for authentication.*
* **Spec: add full dataset dependencies** [`#3097`](https://github.com/OpenLineage/OpenLineage/pull/3097) [`#3098`](https://github.com/OpenLineage/OpenLineage/pull/3098) [@arturowczarek](https://github.com/arturowczarek)
*Now, if `datasetLineageEnabled` is set and column-level lineage depends on the whole dataset, a dataset dependency is added instead of listing every column field in that dataset.*
* **Java: OpenLineageClient and Transports are now AutoCloseable** [`#3122`](https://github.com/OpenLineage/OpenLineage/pull/3122) [@ddebowczyk92](https://github.com/ddebowczyk92)
*This prevents a number of issues that might be caused by not closing underlying transports.*

### Fixed
* **Python Facet generator does not validate optional arguments** [`#3054`](https://github.com/OpenLineage/OpenLineage/pull/3054) [@JDarDagran](https://github.com/JDarDagran)
*Fixes an issue where `NominalTimeRunFacet` broke when `nominalEndTime` was `None`.*
* **SQL: report only actually used tables from CTEs, rather than all** [`#2962`](https://github.com/OpenLineage/OpenLineage/pull/2962) [@Imbruced](https://github.com/Imbruced)
*With this change, if a SQL statement defines a CTE but does not use it in the final query, lineage for that CTE is no longer falsely reported.*
* **Fluentd: Enhancing plugin's capabilities** [`#3068`](https://github.com/OpenLineage/OpenLineage/pull/3068) [@jonathanlbt1](https://github.com/jonathanlbt1)
*Improves the performance and documentation of the Fluentd proxy plugin.*
* **SQL: fix parser to point to origin table instead of CTEs** [`#3107`](https://github.com/OpenLineage/OpenLineage/pull/3107) [@Imbruced](https://github.com/Imbruced)
*For some complex CTEs, the parser emitted the CTE as a target table instead of the original table. This is now fixed.*
* **Spark: column lineage is correctly produced for merge into command** [`#3095`](https://github.com/OpenLineage/OpenLineage/pull/3095) [@Imbruced](https://github.com/Imbruced)
*OpenLineage now produces column-level lineage correctly when there is a view in the middle.*
29 changes: 29 additions & 0 deletions versioned_docs/version-1.24.2/releases/1_24_2.md
@@ -0,0 +1,29 @@
---
title: 1.24.2
sidebar_position: 9933
---

# 1.24.2 - 2024-11-05

### Added
* **Spark: Add Dataproc run facet to include jobType property** [`#3167`](https://github.com/OpenLineage/OpenLineage/pull/3167) [@codelixir](https://github.com/codelixir)
*Updates the GCP Dataproc run facet to include the `jobType` property.*
* **Add EnvironmentVariablesRunFacet to core spec** [`#3186`](https://github.com/OpenLineage/OpenLineage/pull/3186) [@JDarDagran](https://github.com/JDarDagran)
*Additionally, the Python client now uses `EnvironmentVariablesRunFacet` directly.*
* **Add assertions for format in test events** [`#3221`](https://github.com/OpenLineage/OpenLineage/pull/3221) [@JDarDagran](https://github.com/JDarDagran)
* **Spark: Add integration tests for EMR** [`#3142`](https://github.com/OpenLineage/OpenLineage/pull/3142) [@arturowczarek](https://github.com/arturowczarek)
*The Spark integration now has integration tests for EMR.*

### Changed
* **Move Kinesis to separate module, migrate HTTP transport to httpclient5** [`#3205`](https://github.com/OpenLineage/OpenLineage/pull/3205) [@mobuchowski](https://github.com/mobuchowski)
*Moves Kinesis integration to a separate module and updates HTTP transport to use HttpClient 5.x.*
* **Docs: Upgrade docusaurus to 3.6** [`#3219`](https://github.com/OpenLineage/OpenLineage/pull/3219) [@arturowczarek](https://github.com/arturowczarek)
* **Spark: Limit the Seq size in RddPathUtils::extract()** [`#3148`](https://github.com/OpenLineage/OpenLineage/pull/3148) [@codelixir](https://github.com/codelixir)
*Adds a flag to limit the logs in `RddPathUtils::extract()` to avoid `OutOfMemoryError` for large jobs.*

### Fixed
* **Docs: Fix outdated Spark-related docs** [`#3215`](https://github.com/OpenLineage/OpenLineage/pull/3215) [@mobuchowski](https://github.com/mobuchowski)
* **Fix docusaurus-mdx-checker errors** [`#3217`](https://github.com/OpenLineage/OpenLineage/pull/3217) [@arturowczarek](https://github.com/arturowczarek)
* **[Integration/dbt] Parse dbt source tests** [`#3208`](https://github.com/OpenLineage/OpenLineage/pull/3208) [@MassyB](https://github.com/MassyB)
*Consider dbt sources when looking for test results.*
* **Avoid tests in configurable test** [`#3141`](https://github.com/OpenLineage/OpenLineage/pull/3141) [@pawel-leszczynski](https://github.com/pawel-leszczynski)