Releases: man-group/ArcticDB

v5.2.2

03 Feb 14:10

Fixes

Read and write snapshot metadata from the correct place (PR #2161).

Snapshot metadata is an optional structure of extra information saved with a snapshot. It is created when you call snapshot with the metadata argument:

from arcticdb import Library
lib: Library

lib.snapshot("snap", metadata=["example", "metadata"])

lib.list_snapshots()
# {"snap": ["example", "metadata"]}

If you never call snapshot with metadata= then you are not affected by this issue. Metadata associated with a symbol is not affected by this issue.

Versions v4.5.0 to v5.2.1 mistakenly saved snapshot metadata in a new data format. This affects the snapshot metadata in the dict values returned by list_snapshots: those versions incorrectly return None for snapshot metadata written by earlier versions, even when that metadata exists.
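
A minimal sketch for spotting snapshots that may be affected. The helper name snapshots_without_metadata is hypothetical and not part of ArcticDB; it only inspects a dict shaped like the return value of list_snapshots. A None value can also mean that no metadata was ever written, so treat the result as a list of candidates to check, not as proof of the bug.

```python
from typing import Any, Dict, List


def snapshots_without_metadata(snapshots: Dict[str, Any]) -> List[str]:
    """Return the sorted names of snapshots whose metadata value is None.

    `snapshots` is assumed to map snapshot name -> metadata, which is the
    shape shown for lib.list_snapshots() in these release notes.
    """
    return sorted(name for name, metadata in snapshots.items() if metadata is None)


# Stand-in for the result of lib.list_snapshots(); the names are made up.
example = {"snap": ["example", "metadata"], "older_snap": None}
candidates = snapshots_without_metadata(example)
print(candidates)
```

With a real library you would pass lib.list_snapshots() instead of the example dict.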

With this fix, ArcticDB checks both data formats for snapshot metadata, so v5.2.2 onwards can read snapshot metadata written by any version. From v5.2.2 onwards we also revert to writing the metadata in its old location. As a result, ArcticDB clients between v4.5.0 and v5.2.1 need to upgrade to read snapshot metadata written by clients outside that range (otherwise they will see the metadata as None). ArcticDB clients older than v4.5.0 need to upgrade to v5.2.2+ to read snapshot metadata written by versions between v4.5.0 and v5.2.1.
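
The compatibility rules above can be summarised as a small reader/writer check. This is only a sketch of the rules as stated in these notes, not ArcticDB code: the function names are hypothetical, and it assumes the affected range v4.5.0 to v5.2.1 is inclusive at both ends.

```python
from typing import Tuple

Version = Tuple[int, int, int]

# Assumption: the affected range is inclusive at both ends.
AFFECTED_FIRST: Version = (4, 5, 0)
AFFECTED_LAST: Version = (5, 2, 1)


def parse_version(text: str) -> Version:
    """Parse a tag such as 'v5.2.2' or '5.2.2' into a comparable tuple."""
    major, minor, patch = text.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def in_affected_range(version: Version) -> bool:
    """True for versions that read and write only the new metadata format."""
    return AFFECTED_FIRST <= version <= AFFECTED_LAST


def can_read_snapshot_metadata(reader: str, writer: str) -> bool:
    """Can `reader` see snapshot metadata written by `writer`?"""
    r, w = parse_version(reader), parse_version(writer)
    if r > AFFECTED_LAST:
        # v5.2.2 onwards checks both formats.
        return True
    # Older readers understand one format only, so the reader and the
    # writer must sit on the same side of the affected range.
    return in_affected_range(r) == in_affected_range(w)


print(can_read_snapshot_metadata("v5.2.1", "v4.4.7"))  # reader needs an upgrade
print(can_read_snapshot_metadata("v5.2.2", "v5.0.0"))  # fixed reader
```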


Full Changelog: v5.2.1...v5.2.2

v5.2.1

29 Jan 16:46

🐛 Fixes

  • Use Pandas unpickling to handle Pandas 1 vs Pandas 2 API differences better. This only affects reading Pandas structures saved as metadata with an ArcticDB symbol. (#2151)
  • Don't warn about missing keys when reading symbol ref, as it is common to try to read a non-existent symbol (#2153).
    • This fixes a minor regression in v5.2.0 that caused noisy logging output like "Failed to find segment for key 'r:aaa' : No response body."

Full Changelog: v5.2.0...v5.2.1


The wheels are on PyPI.

v5.2.0

27 Jan 14:54

🚀 Features

  • Introduce the block version ref key (#1969)
  • Add AWS STS authentication support (#1884)
  • Python 3.12 and Python 3.13 support (#1945) (#2016)
  • Numpy 2 support (#2050)
  • Reliable storage lock (#2014)
  • Storage mover (#2039)
  • Add S3 STS proxy support (#2072)
  • Implement origin for pandas resampling (#1962)
  • Refactor storages to support async reads (#2012)
  • Chunk up incomplete segments by rows when they are staged (#2117)
  • read_batch performance improvements - now up to 10 times faster

🐛 Fixes

  • Fix decoding of fields with >2^16 blocks (#2089)
  • Handle very old normalization metadata RangeIndexes (#2118)
  • Finalize staged data memory use improvements (#2013)
  • Fix handling of empty DF in pandas 1.0 (#2010)
  • Fix string reference count leak (#1998)
  • Fix version release not attaching symbols for debug (#2018)
  • Delete staged segments after writing vref key (#2037)
  • Improve the performance of update by parallelising reads. Implement internal async update method. (#2087)
  • Fix performance regression when requesting a timestamp before the earliest version (#2076)
  • Fix unreadable index that could result from using compact incomplete on a library with dynamic schema and a named index (#2116)
  • Notimplemented handling (#2108)
  • Fix race between list_versions and delete_snapshot on NFS (#2092)

Full Changelog: v5.1.3...v5.2.0


The wheels are on PyPI.

v5.1.3

21 Jan 11:36

What's Changed

Full Changelog: v5.1.2...v5.1.3

The wheels are on PyPI.

v4.4.7

08 Jan 10:58

What's Changed

  • Backport list versions and delete_snapshot race 8104588520 by @poodlewars in #2102

Full Changelog: v4.4.6...v4.4.7

The wheels are on PyPI.

v5.1.2

10 Dec 11:52

What's Changed

  • Finalize staged data memory use improvements (#2013) 755bbda

Full Changelog: v5.1.1...v5.1.2


The wheels are on PyPI.

v5.1.1

05 Dec 15:08

🐛 Fixes

Full Changelog: v5.1.0...v5.1.1

The wheels are on PyPI.

v5.1.0

15 Nov 16:08

🚀 Features

  • Enhancement 1895: Fully parallelise processing in read_batch (#1950)

🐛 Bugfixes:

  • Fix reference counting on python strings (#1999)
  • Make snapshot names in run_scenario more unique (#1961)
  • Fix Mac ARM Build (#1960)
  • Fix delete snapshot so that it doesn't orphan data keys or delete the wrong key (#1973)
  • Bugfix 1970: Give helpful error message if and/or/not operators are provided in QueryBuilder operations (#1976)
  • Bugfix 127: Improve error message when recursively normalized metastruct is too big (#1981)
  • Fix publish result curl fail (#1974)
  • Bugfix/1841/maintain empty series names (#1983)
  • Bugfix/1937/do not write append ref keys when staging incompletes (#1985)
  • Bugfix 1306: Add nonreg test for update failing with 1-nanosecond different timestamps (#1978)
  • Test meaningful error message when reading incompletes from non-existent symbol (#1991)
  • Bugfix 1655: Disallow / character in mongo lib names (#1992)

Full Changelog: v5.0.0...v5.1.0

The wheels are on PyPI.

v5.0.0

31 Oct 13:47

⚠️ API Changes

  • date_range returned by get_info[_batch] previously used datetime.datetime and now uses pandas.Timestamp with nanosecond precision (#1461)
  • staged argument removed from write_pickle_batch (#1642)
  • Set row_count/rows to None in get_description/get_info and batch versions thereof if symbol is pickled (#1664)
  • Do not fall back to iterating snapshots if version as_of int or timestamp is not found in version chain in V2 API (#1672)
  • Use float64 as the result type for all division operations in the processing pipeline (#1794)
  • Make SymbolDescription return type match type hints (#1877)
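
To illustrate what the date_range change in the list above means for calling code, here is a pandas-only sketch. It does not call ArcticDB; the timestamp value is made up and simply stands in for one element of a date_range. It shows that pandas.Timestamp carries nanoseconds, that converting to datetime.datetime truncates to microseconds, and that isinstance checks against datetime.datetime still pass because pandas.Timestamp is a subclass.

```python
import datetime
import warnings

import pandas as pd

# Hypothetical value standing in for one end of a date_range.
ts = pd.Timestamp("2024-10-31 13:47:00.123456789", tz="UTC")

# pandas.Timestamp keeps the nanosecond component.
nanos = ts.nanosecond

# Converting to datetime.datetime drops the nanoseconds; pandas warns
# about the discarded precision, so the warning is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    legacy = ts.to_pydatetime()

print(nanos, legacy.microsecond, isinstance(ts, datetime.datetime))
```

Code that compares or formats the values at microsecond precision should keep working, while code that relies on type(x) is datetime.datetime would need adjusting.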

🚀 Features

  • Lazy dataframe implementation (#1703)
  • Arbitrary clause ordering (#1860)
  • Sort staged data before writing (#1869)
  • Make library manager a LRU cache (#1930)
  • Preserve nanoseconds and timezones in date range from get info (#1461)
  • Do not fall back to iterating snapshots if version as_of int or timestamp is not found in version chain in V2 API (#1672)
  • Use float64 as the result type for all division operations in the processing pipeline (#1794)

🐛 Fixes

  • Improve full build time by 30% by adding PCH (#1764)
  • Fix typo in docs flow (#1952)
  • Fix entt dep management for conda builds (#1955)
  • Run the pull request on the PR sha with the pull_request_target event (#1954)
  • Fix the timers macro, configure it with a cmake option (#1914)
  • Small fixes for mac arm compilation and fix some tests (#1905)
  • Fixes for sort and finalize (#1763)
  • Bugfix 1552: Set row_count/rows to None in get_description/get_info and batch versions thereof if symbol is pickled (#1664)
  • Bugfix 1641: Remove unused staged arg from write_pickle_batch (#1642)
  • Refactor 1749: make processing tests great again (#1758)
  • Always use strict weak ordering in std::sort comparators (#1747)
  • Use EncodedFieldImpl.blocks() rather than casting member variable _blocks (#1745)
  • Fix batch_restore_version failing to use correct next_version_id (#1823)
  • Bugfix arcticdb-man 96: Improve error message with pd.Timedelta columns (#1874)
  • Bugfix 1865: Make SymbolDescription return type match type hints (#1877)
  • Bugfix 1652: Improve error message when object dtype columns contains Timestamps with mixed timezones (#1880)
  • Bugfix 1841: Correctly roundtrip None and empty string pd.Series names (#1878)
  • Fix library options compatibility across ArcticDB versions (#1862)
  • Bugfix 1830: fix resampling with multiindex (#1873)

Uncategorized

  • Extend tests for merge sort (#1708)
  • Check if object is an instance of QueryBuilder when comparing for equality in QueryBuilder.eq (#1754)
  • Add resampling offset (#1743)
  • Refactor/1722/remove composite from processing pipeline (#1741)
  • Fix Segment use-after-move when replicating to NFS (#1756)
  • Throw exception if appending using sort_and_finalize will create unordered index (#1760)
  • Improve logging for stress tests (#1759)
  • Remove rocksdb (#1761)
  • Update BSL table for v4.5.0 (#1768)
  • Handle KeyNotFoundException in recurse_index_key (#1766)
  • Add Storage API to check for existence of a key matching a predicate (#1762)
  • Revert "Handle KeyNotFoundException in recurse_index_key" (#1776)
  • Add 1 Billion Row Challenge Notebook (#1774)
  • Remove KeyNotFoundException catch in CopyCompressedInterStoreTask (#1778)
  • Limit 3 item ref key bypass to only when loading undeleted versions (#1775)
  • append docstring improvement: added that append on a new symbol will create it (#1787)
  • Parallelize string handling, enable python type handlers, prepare for Arrow (#1698)
  • Add interface to pass a cached entry to tombstone all (#1793)
  • Fix sort index (#1796)
  • Fix Windows CI by adding the string header in decimal.cpp (#1801)
  • Remove unused RuntimeConfig feature (#1803)
  • Add unimplemented increment counter (#1804)
  • Pre arrow refactor (#1805)
  • Fix regression in bool pickling (#1810)
  • Add API to remove Prometheus metrics (#1806)
  • Fix symbol list bug on compaction (#1324) (#1798)
  • Allow metrics to be registered twice (#1816)
  • Docs 1809: lazy dataframe documentation and example notebook (#1815)
  • Bugfix 1818: Fix QueryBuilder equality checks (#1819)
  • Support reading from prefixes that include a dot (#1820)
  • Fix compilation errors for clang 18 (#1813)
  • Add StagedDataFinalizeMethod to docs (#1744)
  • Add demo video link to README.md (#1837)
  • More sort and finalize fixes (#1799)
  • Fix conda build (#1854)
  • Fix row slicing with sort_and_finalize throw on column slicing when the column group is larger than the segment column size (#1838)
  • Don't override the https value for backwards compatibility (#1840)
  • Support the ignore missing key read option in recurse_index_keys (#1844)
  • Add flag to clear staged data on failure to both finalize functions (#1856)
  • Change test for library emptiness to allow specifying key types to exclude (#1861)
  • Fixing clean builds (#1868)
  • Extend LMDB instance lifetime (#1879)
  • Allow customizing the most useful AWS SDK settings. (#1875)
  • Refactor 1833:use entt as ecs (#1834)
  • Fix mono CI issue (#1917)
  • Fix lib_tool.read_to_keys (#1832)
  • Skip lmdb compat tests for versions <= 4.5.0 (#1923)
  • Use relative paths for compat tests (#1924)
  • ci: Use mamba 2.0 via mamba-org/[email protected] (#1855)
  • Remove unneeded step (#1929)
  • Adds a type change utility for append data keys in lib_tool (#1932)
  • fix introduction help (#1933)
  • Rewrite the mkdocs template so the links point to the correct version (#1942)
  • Change the pull_request trigger to pull_request_target (#1947)
  • Revert pull_request_target back to pull_request (#1949)

Full Changelog: v4.5.1...v5.0.0


The wheels are on PyPI.

v4.5.1

18 Oct 12:12

🐛 Fixes

Uncategorized

Full Changelog: v4.5.0...v4.5.1


The wheels are on PyPI.