Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for opening an iteration #1239

Merged
merged 6 commits into from
May 4, 2022
Merged

Conversation

franzpoeschel
Copy link
Contributor

@franzpoeschel franzpoeschel commented Apr 1, 2022

Commit isolated from #1237

  1. When opening a new step in the Streaming API, don't flush the Series before doing that. It seems that flushing at this stage seems to confuse HDF5 and lead to rare deadlocks in parallel contexts, seen in PIConGPU. This PR is able to fix that behavior in the observed instance.
  2. When reading a variable-based Series, make sure to open an IO step before doing anything. Otherwise, attributes from later iterations might leak into the early iteration (RandomAccess mode in BP4).

TODO:

  • Testing

@ax3l
Copy link
Member

ax3l commented Apr 2, 2022

restarting CI

@ax3l ax3l closed this Apr 2, 2022
@ax3l ax3l reopened this Apr 2, 2022
@ax3l
Copy link
Member

ax3l commented Apr 3, 2022

It looks like NVHPC compilers (formerly PGI) fail on the SerialIOTest for 2 test right now.

@franzpoeschel
Copy link
Contributor Author

It looks like NVHPC compilers (formerly PGI) fail on the SerialIOTest for 2 test right now.

I can reproduce this locally, will have a look

@franzpoeschel
Copy link
Contributor Author

The NVIDIA compilers have a lot of test failures for me locally..

franzpoeschel added a commit to franzpoeschel/openPMD-api that referenced this pull request Apr 5, 2022
Don't flush when opening an iteration
@ax3l
Copy link
Member

ax3l commented Apr 6, 2022

discussed today: likely a glitch that the read toggles into write mode at some point.
will be addressed with a separate PR.

@franzpoeschel
Copy link
Contributor Author

When defining:

namespace std
{
inline string to_string(openPMD::FlushLevel const &level)
{
    switch (level)
    {
    case openPMD::FlushLevel::UserFlush:
        return "UserFlush";
    case openPMD::FlushLevel::InternalFlush:
        return "InternalFlush";
    case openPMD::FlushLevel::SkeletonOnly:
        return "SkeletonOnly";
    case openPMD::FlushLevel::CreateOrOpenFiles:
        return "CreateOrOpenFiles";
    }
}
} // namespace std

and adding to Iteration::flushGroupBased the lines

    std::cout << "Flushlevel is: " << std::endl;
    std::cout << std::to_string(flushParams.flushLevel) << std::endl;

the result is this

Flushlevel is: 
 _ xPpphh!@_!_pp؅ptime
                      P
                       `\pdte
                             !?
                               "p
imeUnitSI

         ?
          1usage informationqPI vectorЏ`ofstringsÝ =p+вBpвBp0P-l
                                                                --list-tests!"Bp-tp
                                                                                   --list-tags"Bp0P0-s  --success1"Bp0P-bp--break"Bp P0-e   --nothrowA"Bp(tabs, newlines)0P-i`
                                          --invisibles"Bp-o--outq"Bp(defaults to con0Pp-r0 
--reporter"Bp-n --name"Bp P@ -a --abort1"Bp P -x`--abortx"Bp-w!--warnq"Bp Pp!-d "
                                                                                 --durations`t durations for tests taking at least the given number of seconds%`"↑↑└␋┼↑␍┤⎼▒├␋⎺┼1◆B⎻▮P"↑°P#
                                                      ↑↑␋┼⎻┤├↑°␋┌␊◆B⎻▮P#↑##1◆B⎻⎻$B⎻@P#↑␌◆$      ↑↑⎽␊␌├␋⎺┼◆B⎻$┤├ ┴␊⎼␉⎺@P $↑┴$
                                                                                                                            ↑↑┴␊⎼␉⎺⎽
␋├≤D▮▮◆#◆B⎻⎽↑⎺┼┌≤!%@0%`Bp 0 &e order (defaults to decP0%--orderP@ Fecific seed for random numbers@`BPP˞P˞pPd@dH'``˞˞˞pP@'˞̞̞pЇ8( ̞""P̞̞̞pВ(̞͞͞p() ͞@͞͞͞p0 )͞  ͞ΞΞpp`filename8*output filenameΞPΞPΞp0 *name`Ξ%%Ξ+name(+
suite name Ϟ ϞpƝ+0ϞPϞϞϞp ĝĝ+
                            no. failuresϞϞООpĝp,
                                                warning name,enable warningsОPОPОp,yes|no`ООООpp`-secondsII0ўpўpўp`Pfilenameў""ўpP.ҞҞpҞpҞp.p..
L3pppPd@d3p@pP@p4pЇ4""0pppВ`5p5 ``p0 P6p  pp`filename6output filename00p0 @7name @%%pp7nameP7
suite namepƝ08880ppp ĝĝ8
                        no. failurespĝ 9
                                        warning name@9enable warnings00p9yes|no1@`pp:seconds1II`Pfilename2`;3@3@`A`Ap.p.x;
                                                                                                                          section na'time'|numberC&&CDDpPm@mH>yes|no DAborted (core dumped)

:D

I'll try passing by value instead of reference

@franzpoeschel
Copy link
Contributor Author

I think that the last push should fix it. Looks like a compiler issue to me, but it works now.

However, I noticed that builds with MPI or ADIOS2 have lots of failure with NVC++, we should maybe try to extend our tests there and see if it can be fixed. The thing seems to do lots of very funny optimization that may or may not result in correct binaries.

@franzpoeschel
Copy link
Contributor Author

NVHPC is now running fine again :)

@ax3l
Copy link
Member

ax3l commented Apr 6, 2022

Looks like a compiler issue to me, but it works now.

Actually, in C++, adding anything in the namespace std that is not a (class) template specialization is undefined behavior, defined by the standard. (re: #1239 (comment))

To fix the UB in the code, rely on ADL and define the to_string overload in the openPMD namespace:

It is undefined behavior to add declarations or definitions to namespace std or to any namespace nested within std, with a few exceptions noted below. [...]

@franzpoeschel
Copy link
Contributor Author

Looks like a compiler issue to me, but it works now.

Actually, in C++, adding anything in the namespace std that is not a (class) template specialization is undefined behavior, defined by the standard. (re: #1239 (comment))

To fix the UB in the code, rely on ADL and define the to_string overload in the openPMD namespace:

* https://stackoverflow.com/a/50555639/2719194

* https://en.cppreference.com/w/cpp/language/extending_std

It is undefined behavior to add declarations or definitions to namespace std or to any namespace nested within std, with a few exceptions noted below. [...]

Ah yeah, true

On the other hand, it was just some temporary debugging output and the memory issue that I saw came from somewhere else

@franzpoeschel
Copy link
Contributor Author

I think that the last push should fix it. Looks like a compiler issue to me, but it works now.

However, I noticed that builds with MPI or ADIOS2 have lots of failure with NVC++, we should maybe try to extend our tests there and see if it can be fixed. The thing seems to do lots of very funny optimization that may or may not result in correct binaries.

The Python bindings don't compile at all

@ax3l ax3l closed this Apr 11, 2022
@ax3l ax3l reopened this Apr 11, 2022
@ax3l ax3l closed this Apr 12, 2022
@ax3l ax3l reopened this Apr 12, 2022
@ax3l ax3l closed this Apr 15, 2022
@ax3l ax3l reopened this Apr 15, 2022
@ax3l
Copy link
Member

ax3l commented Apr 15, 2022

@franzpoeschel I think clang-tidy fails on this one?

@ax3l
Copy link
Member

ax3l commented Apr 29, 2022

restarting CI

@ax3l ax3l closed this Apr 29, 2022
@ax3l ax3l reopened this Apr 29, 2022
@ax3l
Copy link
Member

ax3l commented Apr 29, 2022

@franzpoeschel one tiny suggestion inline for your consideration

@ax3l
Copy link
Member

ax3l commented May 3, 2022

restarting CI since I merged a couple PRs in between

@ax3l ax3l closed this May 3, 2022
@ax3l ax3l reopened this May 3, 2022
@ax3l ax3l enabled auto-merge (squash) May 3, 2022 23:14
@ax3l ax3l merged commit 1f467a3 into openPMD:dev May 4, 2022
@ax3l ax3l modified the milestones: 0.14.5, 0.14.6 May 25, 2022
@ax3l
Copy link
Member

ax3l commented May 25, 2022

Similar to #1253 (comment) - tricky to bkacport to release-0.14.5

@ax3l ax3l modified the milestones: 0.14.6, 0.14.5 May 30, 2022
@ax3l
Copy link
Member

ax3l commented May 30, 2022

Backported via ax3l#5 to #1282 0.14.5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants