Fully Support JoinedArrays #3452

Closed
ax3l opened this issue Jan 30, 2023 · 9 comments

@ax3l
Contributor

ax3l commented Jan 30, 2023

Why is this feature important?

Many particle simulations never need collective All2All MPI communication. But for WarpX (ECP) and PIConGPU (OLCF CAAR), we need to do an All2All to tell the I/O routines the global shape (and offsets) of global variables.

What is the potential impact of this feature in the community?

This would greatly reduce the client-side logic needed before one can do particle I/O operations in MPI-parallel contexts. Particles move, so block offsets and global sizes are not universally known.
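To make the client-side logic concrete, here is a minimal sketch of the bookkeeping that is needed today before a global array can be declared. It assumes one block per rank and that the per-rank particle counts have already been collected by some collective MPI step; that step is not shown, and the names `GlobalLayout` and `computeLayout` are hypothetical, not part of ADIOS2 or openPMD-api.

```cpp
#include <cstddef>
#include <vector>

// Global layout of a 1D variable assembled from one block per rank.
struct GlobalLayout
{
    std::size_t shape;                // total number of particles over all ranks
    std::vector<std::size_t> offsets; // offsets[r] = start index of rank r's block
};

// countsPerRank[r] is the number of particles rank r holds in this step.
// Assumption: in a real MPI code this vector would first have to be
// gathered from all ranks; here it is simply passed in.
GlobalLayout computeLayout(const std::vector<std::size_t> &countsPerRank)
{
    GlobalLayout layout{0, {}};
    layout.offsets.reserve(countsPerRank.size());
    for (std::size_t count : countsPerRank)
    {
        // Exclusive prefix sum: a block starts where the previous ones end.
        layout.offsets.push_back(layout.shape);
        layout.shape += count;
    }
    return layout;
}
```

Because particles move between ranks, this computation has to be repeated every output step. A joined array would let the writer skip it and pass only the local count.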

Is your feature request related to a problem? Please describe.

This feature is related to:

  • back-transformed (particle) I/O in WarpX
  • universal particle tracking workflows
  • general particle I/O for particle-based simulations

Describe the solution you'd like and potential required effort

We would like to:

Additional context

It is important that the block order is the same between different variables. We store particles in multiple 1D variables (x,y,z,px,py,pz,id,w,...) and they need to be ordered the same in the global order to not lose their relation.

Furthermore, read and write order of blocks must be the same/deterministic, to allow checkpoint-restart workflows.
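The ordering requirement can be illustrated with a small model. This is only a sketch of the desired semantics, not of how ADIOS2 joins blocks internally; `joinBlocks` and the `order` argument are hypothetical names introduced for the illustration.

```cpp
#include <cstddef>
#include <vector>

// Concatenate blocks in the given block order. If every variable is joined
// with the same `order`, element i of each joined variable still belongs to
// the same particle.
template <typename T>
std::vector<T> joinBlocks(const std::vector<std::vector<T>> &blocks,
                          const std::vector<std::size_t> &order)
{
    std::vector<T> joined;
    for (std::size_t b : order)
    {
        const std::vector<T> &block = blocks.at(b); // throws on a bad index
        joined.insert(joined.end(), block.begin(), block.end());
    }
    return joined;
}
```

Joining `x` and `id` with the same `order` keeps each position paired with its particle id, whatever that order is. Joining them with different orders would silently mix up particles, which is why the order must be identical across variables and reproducible between write and read.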

Upstream task tracking in openPMD-api: openPMD/openPMD-api#1374

Read Overhead

ADIOS2 already stores global block meta-data in a single meta-data file, so file open overhead is minimal. The cost of processing that meta-data is linear.

@ax3l
Contributor Author

ax3l commented Jan 30, 2023

attn @pnorbert @williamfgc @eisenhauer @guj @franzpoeschel

@pnorbert
Contributor

pnorbert commented Feb 3, 2023

PR #3466 adds support for this in BP4.
See the /examples/basics/joinedArray example:
auto vTable = io.DefineVariable<double>("table", {adios2::JoinedDim, Ncols}, {}, {1, Ncols});
Note the empty vector for Start and the constant adios2::JoinedDim in one of the dimensions of Shape.
When writing multiple blocks per process (per step), use SetSelection to set the Count as usual, but use an empty vector {} for the Start:
vTable.SetSelection({{}, {Nrows, Ncols}});

@eisenhauer
Member

PR #3488 adds Joined array support to BP5 as well...

@franzpoeschel
Contributor

To make sure that I understand this correctly: There must be at most one joined dimension, right?
When using SetSelection, must the extents of the non-joined dimensions be equal to the global extents (i.e. there is no sub-chunking beyond the joining)?

@eisenhauer
Member

Absolutely just one joined dimension. All the code depends upon that.

WRT sub-chunking, presumably you'd have to use a non-NULL Start value in order to specify the offset of each block in the non-joined dimension, which would require some kind of funkier logic (maybe involving bin packing?) to see how these multiple sub-chunks make up a larger chunk which would then be a component in the joined array. I don't have a clear view of how that would work. While it's not enforced, I think the only reasonably consistent semantic would be for all the blocks written to have the same Count as the Shape in the non-joined dimensions. Maybe we should check that and throw an exception if it's not true, but we're not doing that at this point (at least not in BP5).
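The check discussed here could look roughly like the following. It is a sketch of the proposed semantics only, not ADIOS2 code: `kJoinedDim` is a hypothetical stand-in for adios2::JoinedDim (its real value is not assumed), and `joinedShape` is an invented helper that also returns the shape a reader would see under these rules.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

using Dims = std::vector<std::size_t>;

// Hypothetical sentinel marking the joined dimension in a Shape.
constexpr std::size_t kJoinedDim = std::numeric_limits<std::size_t>::max();

// Validate the blocks against `shape` and return the joined global shape.
// Throws std::invalid_argument if there is not exactly one joined dimension
// or if a block's Count differs from Shape in a non-joined dimension.
Dims joinedShape(const Dims &shape, const std::vector<Dims> &blockCounts)
{
    const std::size_t none = shape.size();
    std::size_t joinedAxis = none;
    for (std::size_t d = 0; d < shape.size(); ++d)
    {
        if (shape[d] != kJoinedDim)
            continue;
        if (joinedAxis != none)
            throw std::invalid_argument("more than one joined dimension");
        joinedAxis = d;
    }
    if (joinedAxis == none)
        throw std::invalid_argument("no joined dimension");

    Dims result = shape;
    result[joinedAxis] = 0;
    for (const Dims &count : blockCounts)
    {
        if (count.size() != shape.size())
            throw std::invalid_argument("block rank differs from shape rank");
        for (std::size_t d = 0; d < shape.size(); ++d)
        {
            if (d == joinedAxis)
                result[d] += count[d]; // blocks stack along the joined axis
            else if (count[d] != shape[d])
                throw std::invalid_argument(
                    "Count differs from Shape in a non-joined dimension");
        }
    }
    return result;
}
```

For example, two blocks with Count {2, 4} and {5, 4} under Shape {kJoinedDim, 4} would join to {7, 4}, while a block with Count {2, 3} would be rejected.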

@franzpoeschel
Contributor

Ok, thanks for the clarification.
I don't see a reason for using sub-chunking (for our most common use case, this will concern 1D arrays anyway). I mostly wanted to be sure precisely which ADIOS2 feature we will end up supporting.

@eisenhauer
Member

However, @pnorbert, we do need to write documentation for this new Shape abstraction and get that in to release 2.9...

@franzpoeschel
Contributor

franzpoeschel commented Feb 27, 2023

WIP for support of this in openPMD openPMD/openPMD-api#1382, usage example https://github.com/openPMD/openPMD-api/blob/ee4512fb01ac80da10c1bc2966a7c6aca7103de4/test/SerialIOTest.cpp#L7245

@pnorbert
Contributor

PR #3554 adds documentation
