Releases: dlt-hub/dlt

0.5.2

02 Aug 19:18
e00baa0

Core Library

  • Add upsert merge strategy for Postgres and Snowflake, by @jorritsandbrink in #1466
  • Add basic upsert support for delta table format in filesystem destination by @jorritsandbrink in #1600
  • query tagging for snowflake by @rudolfix in #1582
  • Support Open Source ClickHouse Deployments (MergeTree engine and more) by @Pipboyguy in #1496
  • allows nested types in BigQuery via native autodetect_schema by @rudolfix in #1591
  • Enable upsert merge strategy for more SQL destinations (Athena, BigQuery, Databricks, mssql) by @jorritsandbrink in #1628
  • Fix/1512 fixes current.pipeline() access by @rudolfix in #1581
  • feat: add config dataset_name_prefix to set custom staging dataset name by @donotpush in #1563
  • fix: add airflow db reset for all tests by @donotpush in #1559
  • Enable S3 compatible storage for delta table format by @jorritsandbrink in #1586
  • feat/1495 rest_client: renames JSONResponsePaginator to JSONLinkPaginator by @willi-mueller in #1558
  • Feat/1596 adds custom config providers + example of yaml config provider supporting profiles and jinja placeholders by @rudolfix in #1642
  • Feat/1583 rest client session timeout configuration by @willi-mueller in #1590
  • Add clarification for add_limit by @VioletM in #1594
  • Fix/1606 fixes validator incremental step order to keep it always last in the pipe by @rudolfix in #1641
  • Feat/1593 rest_client: allow setting of request kwargs by @willi-mueller in #1609
  • prevent accidental wrapping of sources in resources when using adapters by @sh-rp in #1645
  • Add empty source handling for delta table format on filesystem destination by @jorritsandbrink in #1617
  • Surface original err msg from pydantic as extended_info on DataValidationError by @codingcyclist in #1569
  • fix(dockerfile): remove extra spaces around equals sign in LABEL inst… by @thisisdope in #1573
  • Qdrant uncommitted state restore and test by @steinitzu in #1545
  • fix: suppress alembic logs for tests by @donotpush in #1578
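
Several entries above extend the upsert merge strategy to more destinations. As a rough illustration of upsert semantics only (not of how dlt implements it on any destination), the sketch below updates rows whose key already exists and inserts the rest. The `upsert` helper and the sample rows are hypothetical and do not use the dlt API.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def upsert(table: List[Row], incoming: Iterable[Row], primary_key: str) -> List[Row]:
    """Hypothetical helper: return a new table in which incoming rows replace
    rows with the same key, and rows with unseen keys are appended."""
    merged: Dict[Any, Row] = {row[primary_key]: row for row in table}
    for row in incoming:
        # Update if the key already exists, insert otherwise.
        merged[row[primary_key]] = row
    return list(merged.values())


existing = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
batch = [{"id": 2, "name": "robert"}, {"id": 3, "name": "carol"}]
result = upsert(existing, batch, primary_key="id")
print(result)
```

Here the row with `id` 2 is overwritten and the row with `id` 3 is added, while `id` 1 is left untouched. In a real pipeline the destination performs the equivalent work; consult the dlt merge documentation for the actual configuration syntax.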

Docs

New Contributors

Full Changelog: 0.5.1...0.5.2

0.5.1

08 Jul 15:28
d1e5666

This is a major release (0.4 -> 0.5) in our versioning scheme, so please review the breaking changes below. Most of them are relevant only to platform builders who use dlt internals. Some long-deprecated components were removed as well.

Breaking Changes

Breaking Changes (internals)

  • If a dlt.source or dlt.resource decorated function is passed None for an argument that has a default, the None is handled exactly as in a regular Python function call. Previously such a None would request argument injection from configuration. Please read more here: (#1430)
  • dlt.config.value and dlt.secrets.value previously evaluated to None at runtime. They now evaluate to a sentinel value. All existing code should remain backward compatible. (#1430)
  • The full_refresh flag of dlt.pipeline will be deprecated and replaced with dev_mode. See (#1063) and (https://dlthub.com/devel/general-usage/pipeline#do-experiments-with-dev-mode)
  • The default resource extraction sequence has changed from fifo to round_robin. You can switch back to the previous behavior and learn more about what this means here: (https://dlthub.com/docs/reference/performance#resources-extraction-fifo-vs-round-robin)
  • If you create an instance of a SPEC (e.g. SnowflakeCredentials), it will not be marked as resolved even if all required fields are provided. Previously some were resolving and some were not. #1489
  • parse_native_representation never marks a config as resolved. Previously some were resolving and some were not. #1489
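
Two of the changes above can be sketched in plain Python without importing dlt. The first part shows why a sentinel default lets an explicit None pass through unchanged while an untouched default still triggers injection. The second part contrasts fifo and round_robin ordering. All names here (`INJECT`, `CONFIG`, `resolve`, `fetch`, `fifo`, `round_robin`) are hypothetical stand-ins, and the ordering is an assumption based on the description in the linked performance docs, not dlt's own code.

```python
from itertools import chain, zip_longest
from typing import Any, Iterable, Iterator

# Hypothetical stand-in for the sentinel that dlt.config.value now evaluates to.
INJECT = object()
# Private filler used to pad shorter resources in round_robin.
_MISSING = object()

CONFIG = {"api_key": "from-config"}  # hypothetical configuration store


def resolve(name: str, passed: Any) -> Any:
    """Inject from CONFIG only when the sentinel is seen; an explicit None stays None."""
    return CONFIG[name] if passed is INJECT else passed


def fetch(api_key: Any = INJECT) -> Any:
    """Hypothetical decorated function reduced to its argument resolution."""
    return resolve("api_key", api_key)


def fifo(*resources: Iterable[Any]) -> Iterator[Any]:
    """Exhaust each resource fully before moving on to the next one."""
    return chain(*resources)


def round_robin(*resources: Iterable[Any]) -> Iterator[Any]:
    """Take one item from each resource in turn until all are exhausted."""
    for group in zip_longest(*resources, fillvalue=_MISSING):
        for item in group:
            if item is not _MISSING:
                yield item


print(fetch())                             # default untouched: value comes from CONFIG
print(fetch(None))                         # explicit None is passed through as None
print(list(fifo("ab", [1, 2, 3])))         # ['a', 'b', 1, 2, 3]
print(list(round_robin("ab", [1, 2, 3])))  # ['a', 1, 'b', 2, 3]
```

With a None default the function could not tell "caller passed None" apart from "caller passed nothing", which is the ambiguity the sentinel removes.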

Core Library

Docs

Verified Sources

We worked intensively on rest_api and sql_database:


0.4.12

29 May 13:14
b4e0491

Core Library

  • feat(pipeline): add an ability to auto truncate staging dataset by @IlyaFaer in #1292
  • Feat/1406 bumps duckdb 0.10 + dbt to <=1.8.x by @rudolfix in #1407
  • Azure service principal credentials support by @steinitzu in #1377
  • Support partitioning hints for athena iceberg by @steinitzu in #1403
  • Add recommended_file_size cap to limit data writer file size and cap BigQuery to 4gb by @steinitzu in #1368
  • limits mssql query size to fit network buffer to prevent errors on large inserts by @rudolfix in #1372
  • allows to bubble up exceptions when standalone resource returns by @rudolfix in #1374
  • Fix: use .get on column in mssql destination for cases where the yaml… by @Daniel-Vetter-Coverwhale in #1380
  • Make path tests Windows compatible by @jorritsandbrink in #1384
  • RESTClient: Added "values" to the data pattern of the rest_api helper by @francescomucio in #1399
  • corrects single entity path detection by @rudolfix in #1394
  • RESTClient: implement AuthConfigBase.bool + update docs by @burnash in #1413
  • Fix: ensure custom session can be provided to rest client by @z3z1ma in #1396

Docs

  • RESTClient: add an example for creating a custom POST paginator by @burnash in #1358
  • Add rest_api verified source documentation by @burnash in #1308
  • Fix typo in Slack Docs by @cybermaxs in #1369
  • RESTClient: docs: add the troubleshooting section by @burnash in #1367
  • Replace weather api example with github in create a pipeline walkthrough by @sultaniman in #1351
  • RESTClient: docs: Fixed snippet definition by @burnash in #1373
  • docs: destination tables: elaborate on example code by @burnash in #1386
  • add naming rules to contributing by @sh-rp in #1291
  • Added info about how to reorder the columns to adjust a schema by @dat-a-man in #1364
  • rest_api: add response_actions documentation by @burnash in #1362
  • Update the tutorial to use rest_client.paginate for pagination by @burnash in #1287
  • fix command to install dlt by @Benjamin0313 in #1404
  • improves sql database docs by @rudolfix in #1383
  • add typing classifier and update maintainers in pyproject by @sh-rp in #1391
  • Updated installation command in destination docs and a few others by @dat-a-man in #1410
  • Update filesystem docs with auto mkdir config by @VioletM in #1416
  • add page to docs for openapi generator by @sh-rp in #1417

New Contributors

Full Changelog: 0.4.11...0.4.12

0.4.11

14 May 16:04
aab21ba

Core Library

  • RESTClient: building blocks (auths, paginators, response extractors etc.) to write REST API pipelines by @burnash
  • Enable merge write disposition for athena Iceberg by @jorritsandbrink in #1315
  • adds std pipe iterator for stdout and stderr by @rudolfix in #1321
  • adds _impl_cls to dlt.resource and dynamic config section to standalone resources with dynamic names by @rudolfix in #1324
  • Accept :memory: mode for credentials parameter in duckdb factory by @sultaniman in #1297
  • allows windows native, UNC and extended paths in filesystem source and destination by @rudolfix in #1335
  • improves union validation: user friendly exceptions by @rudolfix in #1327
  • improves instantiation and shutdown of thread pools for telemetry trackers by @rudolfix in #1340
  • feat(airflow): pass data sources as callables and additional initializers for delayed source evaluation by @IlyaFaer in #1318
  • Fix: ignores table options on ALTER TABLE in BigQuery by @rudolfix in #1306
  • Fix: use correct check for column prop in column schema by @z3z1ma in #1347
  • Streamlit caching and session state store fixes by @sultaniman in #1326
  • implements method to merge columns in two table schemas by @rudolfix in #1348
  • Extend motherduck client configuration to pass custom user agent by @sultaniman in #1284
  • allows fsspec until 2023.1.0 by @rudolfix in #1305

Docs

Verified Sources

Full Changelog: 0.4.10...0.4.11

0.4.10

30 Apr 19:34
048839d

Core Library

  • Clickhouse destination by @Pipboyguy in #1097
  • fix(filesystem): UNC paths are supported on filesystem source and destination by @IlyaFaer in #1209
  • scd2 extension: pick your active record literal, defaults to NULL by @jorritsandbrink in #1275
  • make missing keys warning conditional on merge strategy by @jorritsandbrink in #1290
  • Fix filesystem layout timestamps with milliseconds by @sultaniman in #1286
  • fallbacks to copy on any OSError when doing hardlink by @rudolfix in #1302
  • configurable anonymous telemetry tracker by @rudolfix in #1301
  • fix athena edge case and adds layout tests for athena by @sh-rp in #1289
  • Streamlit app: do not show a notice if there is no resource state for schema by @sultaniman in #1300

Docs

Full Changelog: 0.4.9...0.4.10

0.4.9

25 Apr 05:50
efaedc2

Core Library

Docs

Verified Sources

New Contributors

Full Changelog: 0.4.8...0.4.9

0.4.9a2

19 Apr 08:34
Pre-release

A pre-release that lets you try out the following features and includes the following bugfixes:

The final release is scheduled for next week.

0.4.8

09 Apr 13:55
c99d612

Core Library

  • Add Dremio as a destination by @maxfirman in #1026
  • adds a fast loading of arrow tables/pandas to postgres via COPY csv by @rudolfix in #1185
  • adds a csv writer for filesystem and postgres by @rudolfix in #1185
  • saves parquet with all logical types, spark flavor is not a default any longer by @rudolfix in #1185
  • feat(bigquery): add streaming inserts support by @IlyaFaer in #1123
  • Feat: parameterize pipeline class in the primary factory method by @z3z1ma in #1176
  • Fix: check for typeddict before class or subclass checks which fail by @z3z1ma in #1160
  • fixes column order and add hints table variants by @rudolfix in #1127
  • fixes schema versioning by @rudolfix in #1140
  • regular initializers for credentials / config specs are type checked like dataclasses by @rudolfix in #1142
  • fix streamlit app state display: Add yaml representer for pendulum datetime by @sultaniman in #1192
  • synapse and mssql bugfixes and improvements (INSERT VALUES UNION) by @jorritsandbrink in #1174
  • various improvements to arrow table normalization by @rudolfix in #1185
  • arrow tables without rows create tables in destination by @rudolfix in #1185
  • fixes Motherduck configuration to use my_db default database and makes password / token mandatory by @rudolfix in

Docs

Verified Sources

New Contributors

Full Changelog: 0.4.7...0.4.8

0.4.7

22 Mar 07:31
be12a1c

Core Library

  • Custom destinations with @dlt.destination decorator by @sh-rp in #1065
  • A BigQuery custom destination supporting STRUCT data types by @sh-rp in #1107
  • Built-in Streamlit rewrite, UI improvements, dark theme a by @sultaniman in #1060
  • fixes various edge cases with Incremental data deduplication, for ordered and unordered results #971 by @rudolfix in #1062
  • Adds new dlt.mark marker to materialize table schemas without data by @rudolfix in #1122
  • validates class instances in typed dict by @rudolfix in #1082
  • feat(airflow): allow re-using sources in airflow wrapper by @IlyaFaer in #1080
  • feat(core): drop default value for write disposition by @IlyaFaer in #1057
  • splits pandas and arrow imports to fix pyarrow.compute missing by @rudolfix in #1112
  • improve no schema upgrade path exception by @sh-rp in #1125

Docs

Full Changelog: 0.4.6...0.4.7

0.4.6

06 Mar 08:03
1957384

Core Library

  • feat(airflow): expose the Airflow runner method to create custom DAGs by @IlyaFaer in #1014
  • removes sql alchemy dependency and port parts of URL class by @rudolfix in #1028
  • Parallelize decorator - run many regular generators in parallel by @steinitzu in #965
  • Add main entry point to support calling dlt as python module by @sultaniman in #1023

Library Bugfixes

  • fixes naive datetime bug in incremental by @rudolfix in #1020
  • Import missing pyarrow compute for transforms on arrowitems by @sh-rp in #1010
  • delete normalized package in case it already existed by @sh-rp in #1012
  • fix(core): validation error with TTableHintTemplate by @IlyaFaer in #1039
  • adds test case where payload data contains PUA unicode characters by @willi-mueller in #1053
  • fix add_limit behavior in edge cases by @sh-rp in #1052
  • adds row_order to Incremental - automatically stop taking data when out of range by @rudolfix in #1041
  • Fix to serialize load metrics as list instead of a dictionary by @sultaniman in #1051
  • fix import schema workflow by @sh-rp in #1013
  • rollback all changes to live schemas when extraction fails by @sh-rp in #1013
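
The row_order entry above says Incremental can automatically stop taking data once rows fall out of range. The idea can be sketched as follows, assuming rows arrive in ascending cursor order and that the end value is exclusive. Both are assumptions of this sketch, and `take_in_range` is a hypothetical helper rather than part of dlt.

```python
from typing import Any, Dict, Iterable, Iterator, List

produced: List[int] = []  # records how many rows the source was asked for


def source() -> Iterator[Dict[str, Any]]:
    """Hypothetical ordered source that would yield many rows if fully consumed."""
    for i in range(1, 1000):
        produced.append(i)
        yield {"id": i}


def take_in_range(
    rows: Iterable[Dict[str, Any]], cursor: str, end_value: Any
) -> Iterator[Dict[str, Any]]:
    """Yield rows while the cursor is below end_value, then stop reading.

    Because the rows are assumed to be sorted ascending, the first
    out-of-range row means every later row is out of range too.
    """
    for row in rows:
        if row[cursor] >= end_value:
            break
        yield row


rows = list(take_in_range(source(), "id", end_value=4))
print(rows)
print(len(produced))
```

Without ordering information the consumer would have to read all rows and filter them. Knowing the order lets it stop at the first row past the end of the range.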

Docs

Verified Sources

New Contributors

Full Changelog: 0.4.5...0.4.6