Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable empty string attributes #1338

Merged
merged 2 commits into from
Jan 13, 2023

Conversation

franzpoeschel
Copy link
Contributor

@franzpoeschel franzpoeschel commented Nov 28, 2022

Close #1329.
I think the original issue why we had this restriction in the first place can be understood from the documentation of H5Tset_size in HDF5:

size must have a positive value, unless it is passed in as H5T_VARIABLE and the datatype is a string datatype.
[…]
String or character datatypes: The size set for a string datatype should include space for the null-terminator character, otherwise it will not be stored on (or retrieved from) disk. Adjusting the size of a string automatically sets the precision to 8*size.

  • HDF5 supports storing null-terminated strings as well as non-null-terminated strings, but recommends using null-terminated strings
  • Non-variable strings (which are used by openPMD) must have a positive size >0, meaning that empty strings must be null-terminated
  • openPMD so far used non-null-terminated strings, so storing empty strings was not possible

This commit switches to using null-terminated strings.

max_len = std::max(max_len, s.size());
std::unique_ptr<char[]> c_str(new char[max_len * vs.size()]);
max_len = std::max(max_len, s.size() + 1);
std::unique_ptr<char[]> c_str(new char[max_len * vs.size()]());
Copy link
Contributor Author

@franzpoeschel franzpoeschel Nov 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subtle bug fixed in HDF5IOHandler writeAttribute() method:
new char[…] does not initialize the memory, but strncopy will copy until either a NULL is found or the max_len has been reached, so the uninitialized memory is always guarded from actually being read into a std::string. Also, no segfault is triggered, since the memory left and right of the uninitialized memory is written and the virtual memory can detect no illegal access in that case.

Before this PR, a std::vector{"VECTOR", "OF", "STRINGS"} was converted to the following c_str buffer:

VECTOR0
OF0****
STRINGS

where 0 is a null terminator and * is uninitialized memory. The null terminator is missing for the longest string, all other strings have one null terminator, followed by uninitialized memory.

With this PR, it is converted to:

VECTOR00
OF000000
STRINGS0

Every line has at least one null terminator, and the rest of the line is also padded with nulls.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks a lot for the fix and explanation.

@franzpoeschel
Copy link
Contributor Author

Hmm, ADIOS1 seems to not like it. We are deprecating that anyway, but may I can find a solution even.

@franzpoeschel franzpoeschel force-pushed the enable-empty-string-attributes branch 2 times, most recently from 48b9266 to dec3c06 Compare December 6, 2022 15:37
@franzpoeschel franzpoeschel added this to the 0.15.0 milestone Dec 8, 2022
@franzpoeschel franzpoeschel force-pushed the enable-empty-string-attributes branch from dec3c06 to 2519fac Compare January 4, 2023 10:36
char const value[],
internal::SetAttributeMode setAttributeMode)
inline bool
Attributable::setAttribute(std::string const &key, char const value[])

Check notice

Code scanning / CodeQL

No raw arrays in interfaces

Raw arrays should not be used in interfaces. A container class should be used instead.
@franzpoeschel franzpoeschel force-pushed the enable-empty-string-attributes branch from 2519fac to 112a933 Compare January 4, 2023 16:00
@ax3l ax3l requested review from ax3l and eschnett January 13, 2023 17:49
@ax3l ax3l self-assigned this Jan 13, 2023
Copy link
Member

@ax3l ax3l left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks a lot!

@ax3l ax3l merged commit f9663b6 into openPMD:dev Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support empty string attributes
2 participants