Signed-off-by: Jakub Dardzinski <[email protected]>
1 parent 8b3cdaf, commit b2d9e11
Showing 10 changed files with 330 additions and 0 deletions.

@@ -0,0 +1,40 @@
---
title: 1.23.0
sidebar_position: 9934
---

# 1.23.0 - 2024-10-04

### Added
* **Java: added CompositeTransport** [`#3039`](https://github.com/OpenLineage/OpenLineage/pull/3039) [@JDarDagran](https://github.com/JDarDagran)
  *This allows users to specify multiple targets to which OpenLineage events will be emitted.*
* **Spark extension interfaces: support table extended sources** [`#3062`](https://github.com/OpenLineage/OpenLineage/pull/3062) [@Imbruced](https://github.com/Imbruced)
  *Interfaces can now extract lineage from the Table interface, not only from RelationProvider.*
* **Java: added GCP Dataplex transport** [`#3043`](https://github.com/OpenLineage/OpenLineage/pull/3043) [@ddebowczyk92](https://github.com/ddebowczyk92)
  *Dataplex transport is now available as a separate Maven package for users who want to send OL events to GCP Dataplex.*
* **Java: added Google Cloud Storage transport** [`#3077`](https://github.com/OpenLineage/OpenLineage/pull/3077) [@ddebowczyk92](https://github.com/ddebowczyk92)
  *GCS transport is now available as a separate Maven package for users who want to send OL events to Google Cloud Storage.*
* **Java: added S3 transport** [`#3129`](https://github.com/OpenLineage/OpenLineage/pull/3129) [@arturowczarek](https://github.com/arturowczarek)
  *S3 transport is now available as a separate Maven package for users who want to send OL events to S3.*
* **Java: add option to configure client via environment variables** [`#3094`](https://github.com/OpenLineage/OpenLineage/pull/3094) [@JDarDagran](https://github.com/JDarDagran)
  *Specified variables are now automatically translated to configuration values.*
* **Python: add option to configure client via environment variables** [`#3114`](https://github.com/OpenLineage/OpenLineage/pull/3114) [@JDarDagran](https://github.com/JDarDagran)
  *Specified variables are now automatically translated to configuration values.*
* **Python: add option to add custom headers in HTTP transport** [`#3116`](https://github.com/OpenLineage/OpenLineage/pull/3116) [@JDarDagran](https://github.com/JDarDagran)
  *Allows users to add custom headers, for example for authentication.*
* **Spec: add full dataset dependencies** [`#3097`](https://github.com/OpenLineage/OpenLineage/pull/3097) [`#3098`](https://github.com/OpenLineage/OpenLineage/pull/3098) [@arturowczarek](https://github.com/arturowczarek)
  *When datasetLineageEnabled is set and column-level lineage depends on the whole dataset, a dataset dependency is now added instead of listing every column field in that dataset.*
* **Java: OpenLineageClient and Transports are now AutoCloseable** [`#3122`](https://github.com/OpenLineage/OpenLineage/pull/3122) [@ddebowczyk92](https://github.com/ddebowczyk92)
  *This prevents a number of issues that might be caused by not closing underlying transports.*
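
The two environment-variable entries above say only that specified variables are translated to configuration values. As a rough illustration of what such a translation can look like, the sketch below assumes a naming convention (an `OPENLINEAGE__` prefix, `__` as the nesting separator, lowercased keys) and a hypothetical helper `env_to_config`. Neither is taken from these notes, so check the client documentation for the actual rules.

```python
from typing import Any, Dict, Mapping

# Assumed prefix for illustration only; not stated in the release notes.
PREFIX = "OPENLINEAGE__"


def env_to_config(environ: Mapping[str, str], prefix: str = PREFIX) -> Dict[str, Any]:
    """Build a nested dict from prefixed variables, splitting names on '__'."""
    config: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable, ignore it
        path = [part.lower() for part in name[len(prefix):].split("__") if part]
        if not path:
            continue  # nothing after the prefix
        node = config
        for key in path[:-1]:
            child = node.get(key)
            if not isinstance(child, dict):
                # Create (or overwrite a scalar with) an intermediate level.
                child = {}
                node[key] = child
            node = child
        node[path[-1]] = value
    return config


example_env = {
    "OPENLINEAGE__TRANSPORT__TYPE": "http",
    "OPENLINEAGE__TRANSPORT__URL": "http://localhost:5000",
    "PATH": "/usr/bin",
}
print(env_to_config(example_env))
```

Values stay as strings in this sketch; a real client may also parse numbers, booleans, or JSON.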

### Fixed
* **Python Facet generator does not validate optional arguments** [`#3054`](https://github.com/OpenLineage/OpenLineage/pull/3054) [@JDarDagran](https://github.com/JDarDagran)
  *This fixes an issue where NominalTimeRunFacet breaks when nominalEndTime is None.*
* **SQL: report only actually used tables from CTEs, rather than all** [`#2962`](https://github.com/OpenLineage/OpenLineage/pull/2962) [@Imbruced](https://github.com/Imbruced)
  *With this change, if a SQL statement defines a CTE but does not use it in the final query, lineage for it is no longer falsely reported.*
* **Fluentd: Enhancing plugin's capabilities** [`#3068`](https://github.com/OpenLineage/OpenLineage/pull/3068) [@jonathanlbt1](https://github.com/jonathanlbt1)
  *This change improves the performance and documentation of the Fluentd proxy plugin.*
* **SQL: fix parser to point to origin table instead of CTEs** [`#3107`](https://github.com/OpenLineage/OpenLineage/pull/3107) [@Imbruced](https://github.com/Imbruced)
  *For some complex CTEs, the parser emitted the CTE as a target table instead of the original table. This is now fixed.*
* **Spark: column lineage is correctly produced for merge into command** [`#3095`](https://github.com/OpenLineage/OpenLineage/pull/3095) [@Imbruced](https://github.com/Imbruced)
  *OL now produces column-level lineage (CLL) correctly for the potential view in the middle.*

@@ -0,0 +1,29 @@
---
title: 1.24.2
sidebar_position: 9933
---

# 1.24.2 - 2024-11-05

### Added
* **Spark: Add Dataproc run facet to include jobType property** [`#3167`](https://github.com/OpenLineage/OpenLineage/pull/3167) [@codelixir](https://github.com/codelixir)
  *Updates the GCP Dataproc run facet to include the jobType property.*
* **Add EnvironmentVariablesRunFacet to core spec** [`#3186`](https://github.com/OpenLineage/OpenLineage/pull/3186) [@JDarDagran](https://github.com/JDarDagran)
  *Additionally, the Python client now uses EnvironmentVariablesRunFacet directly.*
* **Add assertions for format in test events** [`#3221`](https://github.com/OpenLineage/OpenLineage/pull/3221) [@JDarDagran](https://github.com/JDarDagran)
* **Spark: Add integration tests for EMR** [`#3142`](https://github.com/OpenLineage/OpenLineage/pull/3142) [@arturowczarek](https://github.com/arturowczarek)
  *The Spark integration now has integration tests for EMR.*

### Changed
* **Move Kinesis to separate module, migrate HTTP transport to httpclient5** [`#3205`](https://github.com/OpenLineage/OpenLineage/pull/3205) [@mobuchowski](https://github.com/mobuchowski)
  *Moves the Kinesis integration to a separate module and updates the HTTP transport to use HttpClient 5.x.*
* **Docs: Upgrade docusaurus to 3.6** [`#3219`](https://github.com/OpenLineage/OpenLineage/pull/3219) [@arturowczarek](https://github.com/arturowczarek)
* **Spark: Limit the Seq size in RddPathUtils::extract()** [`#3148`](https://github.com/OpenLineage/OpenLineage/pull/3148) [@codelixir](https://github.com/codelixir)
  *Adds a flag to limit the logs in RddPathUtils::extract() to avoid OutOfMemoryError for large jobs.*
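
The last entry above describes capping how much of a large sequence ends up in a log message. The general idea can be sketched as follows. This is a language-neutral illustration written in Python, not the Spark integration's code; the helper name `summarize_for_log` and the `limit` parameter are hypothetical.

```python
from typing import Sequence


def summarize_for_log(items: Sequence[str], limit: int = 5) -> str:
    """Render at most `limit` items and note how many were left out."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    shown = list(items[:limit])
    omitted = len(items) - len(shown)
    text = ", ".join(shown)
    if omitted > 0:
        suffix = f"... ({omitted} more)"
        text = f"{text}, {suffix}" if text else suffix
    return f"[{text}]"


# A job reading many paths: only the first few are formatted for the log line.
paths = [f"s3://bucket/part-{i:05d}" for i in range(100_000)]
print(summarize_for_log(paths, limit=3))
```

Because only the first `limit` elements are formatted, the size of the log string no longer grows with the number of paths.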

### Fixed
* **Docs: Fix outdated Spark-related docs** [`#3215`](https://github.com/OpenLineage/OpenLineage/pull/3215) [@mobuchowski](https://github.com/mobuchowski)
* **Fix docusaurus-mdx-checker errors** [`#3217`](https://github.com/OpenLineage/OpenLineage/pull/3217) [@arturowczarek](https://github.com/arturowczarek)
* **[Integration/dbt] Parse dbt source tests** [`#3208`](https://github.com/OpenLineage/OpenLineage/pull/3208) [@MassyB](https://github.com/MassyB)
  *Consider dbt sources when looking for test results.*
* **Avoid tests in configurable test** [`#3141`](https://github.com/OpenLineage/OpenLineage/pull/3141) [@pawel-leszczynski](https://github.com/pawel-leszczynski)

@@ -0,0 +1,27 @@
---
title: 1.25.0
sidebar_position: 9932
---

# 1.25.0 - 2024-12-03

### Added
* **dbt: Add support for Column-Level Lineage in dbt integration.** [`#3264`](https://github.com/OpenLineage/OpenLineage/pull/3264) [@mayurmadnani](https://github.com/mayurmadnani)
  *The dbt integration now uses the SQL parser to add information about collected column-level lineage.*
* **Spark: Add input and output statistics about datasets read and written.** [`#3240`](https://github.com/OpenLineage/OpenLineage/pull/3240) [`#3263`](https://github.com/OpenLineage/OpenLineage/pull/3263) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
  *Fixes issues related to the existing output statistics collection mechanism and fetches input statistics. Output statistics now contain the number of files written, the size in bytes, and the number of records written. Input statistics contain the size in bytes and the number of files read, while the record count is collected only for DataSourceV2 sources.*
* **Introduced InputStatisticsInputDatasetFacet** [`#3238`](https://github.com/OpenLineage/OpenLineage/pull/3238) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
  *Extends the spec with a new facet, InputStatisticsInputDatasetFacet, modelled after the similar OutputStatisticsOutputDatasetFacet, to contain statistics about the input dataset read by a job.*
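
To make the shape of the input statistics described above concrete, here is a minimal sketch of a record carrying a byte size, a file count, and an optional record count. The class `InputStatistics`, its field names, and `to_payload` are hypothetical and chosen for illustration; they are not the field names of InputStatisticsInputDatasetFacet, which the spec defines.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class InputStatistics:
    """Hypothetical container for statistics about one input dataset."""

    size_bytes: int                   # bytes read
    file_count: int                   # number of files read
    row_count: Optional[int] = None   # may be unavailable for some sources

    def to_payload(self) -> Dict[str, Any]:
        """Return a dict that omits statistics that were not collected."""
        return {key: value for key, value in asdict(self).items() if value is not None}


# Source without a record count: the field is simply left out.
print(InputStatistics(size_bytes=1024, file_count=2).to_payload())
# Source that reports a record count as well.
print(InputStatistics(size_bytes=4096, file_count=1, row_count=10).to_payload())
```

Leaving an uncollected value out, rather than reporting zero, keeps "unknown" distinguishable from "empty".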

### Changed
* **Spark: Exclude META-INF/\*TransportBuilder from Spark Extension Interfaces** [`#3244`](https://github.com/OpenLineage/OpenLineage/pull/3244) [@tnazarew](https://github.com/tnazarew)
  *Excludes META-INF/\*TransportBuilder to avoid version conflicts.*
* **Spark: enables building input/output facets through `DatasetFactory`** [`#3207`](https://github.com/OpenLineage/OpenLineage/pull/3207) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
  *Adds extra capabilities to the `DatasetFactory` class and marks some public developer API methods as deprecated.*

### Fixed

* **dbt: fix compatibility with dbt v1.8** [`#3228`](https://github.com/OpenLineage/OpenLineage/pull/3228) [@NJA010](https://github.com/NJA010)
  *The dbt integration now takes into account the modified `test_metadata` field.*
* **Spark: enabled Delta 3.x version compatibility** [`#3253`](https://github.com/OpenLineage/OpenLineage/pull/3253) [@Jorricks](https://github.com/Jorricks)
  *Takes into account the modified initialSnapshot name.*