Releases: man-group/ArcticDB
v5.2.2
Fixes
Read and write snapshot metadata from the correct place (PR #2161).
Snapshot metadata is an optional structure of extra information saved with a snapshot, created when you call:
from arcticdb import Library
lib: Library
lib.snapshot("snap", metadata=["example", "metadata"])
lib.list_snapshots()
# {"snap": ["example", "metadata"]}
If you never call snapshot with metadata=, you are not affected by this issue. Metadata associated with a symbol is also unaffected.
Between v4.5.0 and v5.2.1 we mistakenly changed to saving snapshot metadata in a new data format. This affects the snapshot metadata in the dict values returned by list_snapshots. Those versions incorrectly return None for snapshot metadata written by earlier versions, even if that metadata does exist.
This fix checks both data formats for snapshot metadata, so ArcticDB v5.2.2 onwards is compatible with snapshot metadata written by any version. We also revert to writing the metadata in its old location from v5.2.2 onwards. This means that ArcticDB clients between v4.5.0 and v5.2.1 will need to upgrade to be able to read snapshot metadata written by clients outside of those versions (otherwise they will see the metadata as None). ArcticDB clients older than v4.5.0 need to upgrade to v5.2.2+ to be able to read snapshot metadata written by versions between v4.5.0 and v5.2.1.
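The compatibility rules described here can be summarised as a small lookup. The sketch below is illustrative only: the function names and version tuples are hypothetical helpers, not part of ArcticDB, and it assumes that "between v4.5.0 and v5.2.1" is inclusive at both ends.

```python
from typing import Tuple

Version = Tuple[int, int, int]

# Versions in this range wrote snapshot metadata in the new format and only
# looked for it there (assumption: both ends are inclusive).
AFFECTED_FIRST: Version = (4, 5, 0)
AFFECTED_LAST: Version = (5, 2, 1)


def in_affected_range(version: Version) -> bool:
    return AFFECTED_FIRST <= version <= AFFECTED_LAST


def can_read_snapshot_metadata(writer: Version, reader: Version) -> bool:
    """Return True if `reader` sees snapshot metadata written by `writer`."""
    if reader > AFFECTED_LAST:
        # v5.2.2 onwards checks both formats.
        return True
    # Any other reader only understands the format its own range writes.
    return in_affected_range(writer) == in_affected_range(reader)


print(can_read_snapshot_metadata((4, 4, 7), (5, 2, 0)))  # False
print(can_read_snapshot_metadata((4, 4, 7), (5, 2, 2)))  # True
print(can_read_snapshot_metadata((5, 1, 0), (4, 4, 7)))  # False
```

A False result corresponds to the reader seeing the metadata as None even though it exists.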
Full Changelog: v5.2.1...v5.2.2
v5.2.1
🐛 Fixes
- Use Pandas unpickling to handle Pandas 1 vs Pandas 2 API differences better. This only affects reading Pandas structures saved as metadata with an ArcticDB symbol. (#2151)
- Don't warn about missing keys when reading symbol ref as it is common to try to read a non-existent symbol (#2153).
  - This fixes a minor regression in v5.2.0 that caused noisy logging output like Failed to find segment for key 'r:aaa' : No response body.
Full Changelog: v5.2.0...v5.2.1
The wheels are on PyPI.
v5.2.0
🚀 Features
- Introduce the block version ref key (#1969)
- Add AWS STS authentication support (#1884)
- Python 3.12 and Python 3.13 support (#1945) (#2016)
- Numpy 2 support (#2050)
- Reliable storage lock (#2014)
- Storage mover (#2039)
- Add S3 STS proxy support (#2072)
- Implement origin for pandas resampling (#1962)
- Refactor storages to support async reads (#2012)
- Chunk up incomplete segments by rows when they are staged (#2117)
- read_batch performance improvements: now up to 10 times faster
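For readers unfamiliar with the origin option in pandas resampling, it sets the timestamp that bucket boundaries are anchored to. The following is a pure-Python sketch of that idea for left-closed buckets with an explicit origin timestamp. It is not ArcticDB's implementation, and bucket_start is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def bucket_start(ts: datetime, freq: timedelta, origin: datetime) -> datetime:
    """Left edge of the resampling bucket containing `ts`.

    Buckets are the half-open intervals [origin + k*freq, origin + (k+1)*freq).
    """
    # timedelta // timedelta floors towards negative infinity, so timestamps
    # before the origin are still assigned to the correct bucket.
    k = (ts - origin) // freq
    return origin + k * freq


ts = datetime(2024, 1, 1, 0, 7)
ten_minutes = timedelta(minutes=10)

midnight_origin = datetime(2024, 1, 1)
shifted_origin = datetime(2024, 1, 1, 0, 3)
print(bucket_start(ts, ten_minutes, midnight_origin))  # 2024-01-01 00:00:00
print(bucket_start(ts, ten_minutes, shifted_origin))   # 2024-01-01 00:03:00
```

The same timestamp lands in a different bucket once the origin moves, which is why the origin matters when comparing resampled output between libraries.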
🐛 Fixes
- Fix decoding of fields with >2^16 blocks (#2089)
- Handle very old normalization metadata RangeIndexes (#2118)
- Finalize staged data memory use improvements (#2013)
- Fix handling of empty DF in pandas 1.0 (#2010)
- Fix string reference count leak (#1998)
- Fix version release not attaching symbols for debug (#2018)
- Delete staged segments after writing vref key (#2037)
- Improve the performance of update by parallelising reads. Implement internal async update method. (#2087)
- Performance regression when requesting a timestamp before the earliest version (#2076)
- Using compact incomplete on a library with dynamic schema with a named index can result in an unreadable index (#2116)
- Notimplemented handling (#2108)
- Fix race between list_versions and delete_snapshot on NFS (#2092)
Full Changelog: v5.1.3...v5.2.0
The wheels are on PyPI.
v5.1.3
What's Changed
- Bugfix 4897570890562900007: Fix decoding of fields with >2^16 blocks by @alexowens90 in #2088
- Handle very old normalization metadata RangeIndexes (#2118) by @alexowens90 in #2120
- 5.1.x Add exception when staging with different named indexes by @G-D-Petrov in #2126
- Update the version for upload and download artifact GH actions to v4 by @vasil-pashov in #2131
- Storage mover port (#2039) by @vasil-pashov in #2127
Full Changelog: v5.1.2...v5.1.3
The wheels are on PyPI.
v4.4.7
What's Changed
- Backport list versions and delete_snapshot race 8104588520 by @poodlewars in #2102
Full Changelog: v4.4.6...v4.4.7
The wheels are on PyPI.
v5.1.2
v5.1.1
🐛 Fixes
- Fix string reference count leak by @willdealtry in #1998
- Fix issues with backwards compatibility from 5.1 by @willdealtry in #2017
- Claim more disk space for wheel building by @phoebusm in #1994
- Make static analysis a cron job by @vasil-pashov in #2003
- more tests on read_batch() by @grusev in #1987
- fix handling of empty DF in pandas 1.0 by @grusev in #2010
- Fix static analysis cron workflow by @vasil-pashov in #2009
- Update BSL table with 5.1 by @IvoDD in #2007
- Test new manylinux by @G-D-Petrov in #2025
- Fix nfs test setup fail by @phoebusm in #2022
- Introduce the block version ref key by @poodlewars in #1969
Full Changelog: v5.1.0...v5.1.1
The wheels are on PyPI.
v5.1.0
🚀 Features
- Enhancement 1895: Fully parallelise processing in read_batch (#1950)
🐛 Bugfixes:
- Fix reference counting on python strings (#1999)
- Make snapshot names in run_scenario more unique (#1961)
- Fix Mac ARM Build (#1960)
- Fix delete snapshot so that it doesn't orphan data keys or delete the wrong key (#1973)
- Bugfix 1970: Give helpful error message if and/or/not operators are provided in QueryBuilder operations (#1976)
- Bugfix 127: Improve error message when recursively normalized metastruct is too big (#1981)
- Fix publish result curl fail (#1974)
- Bugfix/1841/maintain empty series names (#1983)
- Bugfix/1937/do not write append ref keys when staging incompletes (#1985)
- Bugfix 1306: Add nonreg test for update failing with 1-nanosecond different timestamps (#1978)
- Test meaningful error message when reading incompletes from non-existent symbol (#1991)
- Bugfix 1655: Disallow / character in mongo lib names (#1992)
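Some background on the and/or/not bugfix above: Python's and, or and not keywords cannot be overloaded. They only ask their operands for a truth value, so expression-building APIs have to use &, | and ~ instead. The sketch below shows one common way to turn that mistake into an explicit error. Expr is a hypothetical stand-in class, not ArcticDB's QueryBuilder, and the error text is invented for illustration.

```python
class Expr:
    """Hypothetical stand-in for a lazily evaluated filter expression."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __and__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.text} AND {other.text})")

    def __or__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.text} OR {other.text})")

    def __invert__(self) -> "Expr":
        return Expr(f"(NOT {self.text})")

    def __bool__(self) -> bool:
        # `and`, `or` and `not` call bool() on their operands, so raising here
        # turns a silent logic error into an explicit message.
        raise TypeError("Use &, | and ~ instead of and, or and not to combine expressions")


a, b = Expr("col1 > 0"), Expr("col2 < 5")
print((a & ~b).text)  # (col1 > 0 AND (NOT col2 < 5))

try:
    a and b
except TypeError as err:
    print(err)
```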
Full Changelog: v5.0.0...v5.1.0
The wheels are on PyPI.
v5.0.0
⚠️ API Changes
- date_range returned by get_info[_batch] was a datetime.datetime, now returns a pandas.Timestamp with nanosecond precision (#1461)
- staged argument removed from write_pickle_batch (#1642)
- Set row_count/rows to None in get_description/get_info and batch versions thereof if symbol is pickled (#1664)
- Do not fall back to iterating snapshots if version as_of int or timestamp not found in version chain in V2 API (#1672)
- Use float64 as the result type for all division operations in the processing pipeline (#1794)
- Make SymbolDescription return type match type hints (#1877)
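The date_range change matters because the standard library's datetime.datetime only stores microseconds, so a nanosecond-resolution index timestamp cannot round-trip through it. A minimal standard-library sketch of the precision loss follows. ns_to_datetime is a hypothetical helper and the timestamp is an arbitrary example value.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ns_to_datetime(ns_since_epoch: int) -> datetime:
    """Convert nanoseconds since the epoch to a datetime, truncating to microseconds."""
    return EPOCH + timedelta(microseconds=ns_since_epoch // 1_000)


ns = 1_700_000_000_123_456_789
dt = ns_to_datetime(ns)
print(dt.microsecond)  # 123456
print(ns % 1_000)      # 789 nanoseconds that the datetime cannot hold
```

Code that compared date_range values against datetime.datetime objects may need checking after upgrading, since the returned type is now pandas.Timestamp.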
🚀 Features
- Lazy dataframe implementation (#1703)
- Arbitrary clause ordering (#1860)
- Sort staged data before writing (#1869)
- Make library manager a LRU cache (#1930)
- Preserve nanoseconds and timezones in date range from get info (#1461)
- Do not fall back to iterating snapshots if version as_of int or timestamp not found in version chain in V2 API (#1672)
- Use float64 as the result type for all division operations in the processing pipeline (#1794)
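As background for the "LRU cache" item in the list above, a least-recently-used cache keeps a bounded number of entries and evicts the one that was accessed longest ago. The sketch below is a generic illustration built on OrderedDict. The class name, capacity and loader callback are assumptions made for the example and do not describe how ArcticDB's library manager is written.

```python
from collections import OrderedDict
from typing import Callable, Generic, Hashable, List, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class LRUCache(Generic[K, V]):
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, capacity: int, loader: Callable[[K], V]) -> None:
        self._capacity = capacity
        self._loader = loader
        self._entries: "OrderedDict[K, V]" = OrderedDict()

    def get(self, key: K) -> V:
        if key in self._entries:
            # Mark the entry as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        value = self._loader(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            # Drop the entry at the least recently used end.
            self._entries.popitem(last=False)
        return value

    def keys(self) -> List[K]:
        """Cached keys, least recently used first."""
        return list(self._entries)


cache = LRUCache(capacity=2, loader=lambda name: f"library:{name}")
for name in ["a", "b", "a", "c"]:
    cache.get(name)
print(cache.keys())  # ['a', 'c']
```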
🐛 Fixes
- Improve full build time by 30% by adding PCH (#1764)
- Fix typo in docs flow (#1952)
- Fix entt dep management for conda builds (#1955)
- Run the pull request on the PR sha with the pull_request_target event (#1954)
- Fix the timers macro, configure it with a cmake option (#1914)
- Small fixes for mac arm compilation and fix some tests (#1905)
- Fixes for sort and finalize (#1763)
- Bugfix 1552: Set row_count/rows to None in get_description/get_info and batch versions thereof if symbol is pickled (#1664)
- Bugfix 1641: Remove unused staged arg from write_pickle_batch (#1642)
- Refactor 1749: make processing tests great again (#1758)
- Always use strict weak ordering in std::sort comparators (#1747)
- Use EncodedFieldImpl.blocks() rather than casting member variable _blocks (#1745)
- Fix batch_restore_version failing to use correct next_version_id (#1823)
- Bugfix arcticdb-man 96: Improve error message with pd.Timedelta columns (#1874)
- Bugfix 1865: Make SymbolDescription return type match type hints (#1877)
- Bugfix 1652: Improve error message when object dtype columns contains Timestamps with mixed timezones (#1880)
- Bugfix 1841: Correctly roundtrip None and empty string pd.Series names (#1878)
- Fix library options compatibility across ArcticDB versions (#1862)
- Bugfix 1830: fix resampling with multiindex (#1873)
Uncategorized
- Extend tests for merge sort (#1708)
- Check if object is an instance of QueryBuilder when comparing for equality in QueryBuilder.eq (#1754)
- Add resampling offset (#1743)
- Refactor/1722/remove composite from processing pipeline (#1741)
- Fix Segment use-after-move when replicating to NFS (#1756)
- Throw exception if appending using sort_and_finalize will create unordered index (#1760)
- Improve logging for stress tests (#1759)
- Remove rocksdb (#1761)
- Update BSL table for v4.5.0 (#1768)
- Handle KeyNotFoundException in recurse_index_key (#1766)
- Add Storage API to check for existence of a key matching a predicate (#1762)
- Revert "Handle KeyNotFoundException in recurse_index_key" (#1776)
- Add 1 Billion Row Challenge Notebook (#1774)
- Remove KeyNotFoundException catch in CopyCompressedInterStoreTask (#1778)
- Limit 3 item ref key bypass to only when loading undeleted versions (#1775)
- append docstring improvement: added that append on a new symbol will create it (#1787)
- Parallelize string handling, enable python type handlers, prepare for Arrow (#1698)
- Add interface to pass a cached entry to tombstone all (#1793)
- Fix sort index (#1796)
- Fix Windows CI by adding the string header in decimal.cpp (#1801)
- Remove unused RuntimeConfig feature (#1803)
- Add unimplemented increment counter (#1804)
- Pre arrow refactor (#1805)
- Fix regression in bool pickling (#1810)
- Add API to remove Prometheus metrics (#1806)
- Fix symbol list bug on compaction (#1324) (#1798)
- Allow metrics to be registered twice (#1816)
- Docs 1809: lazy dataframe documentation and example notebook (#1815)
- Bugfix 1818: Fix QueryBuilder equality checks (#1819)
- Support reading from prefixes that include a dot (#1820)
- Fix compilation errors for clang 18 (#1813)
- Add StagedDataFinalizeMethod to docs (#1744)
- Add demo video link to README.md (#1837)
- More sort and finalize fixes (#1799)
- Fix conda build (#1854)
- Fix row slicing with sort_and_finalize throw on column slicing when the column group is larger than the segment column size (#1838)
- Don't override the https value for backwards compatibility (#1840)
- Support the ignore missing key read option in recurse_index_keys (#1844)
- Add flag to clear staged data on failure to both finalize functions (#1856)
- Change test for library emptiness to allow specifying key types to exclude (#1861)
- Fixing clean builds (#1868)
- Extend LMDB instance lifetime (#1879)
- Allow customizing the most useful AWS SDK settings. (#1875)
- Refactor 1833:use entt as ecs (#1834)
- Fix mono CI issue (#1917)
- Fix lib_tool.read_to_keys (#1832)
- Skip lmdb compat tests for versions <= 4.5.0 (#1923)
- Use relative paths for compat tests (#1924)
- ci: Use mamba 2.0 via mamba-org/[email protected] (#1855)
- Remove unneeded step (#1929)
- Adds a type change utility for append data keys in lib_tool (#1932)
- fix introduction help (#1933)
- Rewrite the mkdocs template so the links point to the correct version (#1942)
- Change the pull_request trigger to pull_request_target (#1947)
- Revert pull_request_target back to pull_request (#1949)
New Contributors
- @grusev made their first contribution in #1933
- @maxim-morozov made their first contribution in #1947
Full Changelog: v4.5.1...v5.0.0
The wheels are on PyPI.
v4.5.1
🐛 Fixes
- Fix bug where libraries created with 4.5.0 are not compatible with older versions by @G-D-Petrov in #1842
- Sort and finalize fixes by @vasil-pashov in #1859
- Extend LMDB instance lifetime (#1879) by @vasil-pashov in #1908
Full Changelog: v4.5.0...v4.5.1
The wheels are on PyPI.