Minor optimizations on MooseVariableData #29697
base: next
Job Coverage, step Generate coverage on d7b226f wanted to post the following:
Framework coverage
Modules coverage: Coverage did not change
94117f9 to aca8170
aca8170 to f724f59
And remove a few harmless others
Avoid a few size computations
f724f59 to dd74a6c
Job Controlled app tests on dd74a6c: invalidated by @loganharbour
Job Disable HDF5 on dd74a6c: invalidated by @loganharbour
I'll test this on my M1 and see how it goes.
Looking at the test suite (the Framework 1 and 2 recipes and Modules), it seems we are running:
I tried on my Mac M1 with mpiexec -n 10 -i simple_diffusion.i -r 7. Not much difference in total execution time.
Not sure if the communication time matters here. I can try a serial run tomorrow.
Looking at more recipes (OpenMPI and ARM Mac), there seems to be zero change in runtime.
I ran it with perf_graph; memory usage seems to change with the new commit. I think that may be a direct effect of the changes.
For better or worse?
Profiling reveals a factor of 2 improvement in |
2048c97 to 8dbabe3
7463d7d to f1052da
I verified the inlining of the lambdas with our current GCC and clang. GCC is smart enough to see that the version with the lambda is exactly the same as the one without, and actually just produces a jump. Cool cool cool
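For anyone who wants to repeat that kind of check, here is a minimal sketch of the comparison, not the actual MooseVariableData code. The names scaleDirect, scaleLambda and fillWith are hypothetical stand-ins: one function writes the loop body directly, the other routes the same body through a templated helper that takes a lambda. Compiling both at -O2 and diffing the assembly should show whether the compiler treats them as identical; the exact output will depend on the compiler and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example, not MOOSE code.
// Direct version: the loop body is written out by hand.
inline void
scaleDirect(const std::vector<double> & in, std::vector<double> & out, double factor)
{
  const std::size_t n = in.size(); // size computed once, outside the loop
  out.resize(n);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = factor * in[i];
}

// Generic helper: the per-entry computation is passed in as a callable.
// Because Functor is a template parameter, the compiler sees the lambda's
// body at the call site and is free to inline it.
template <typename Functor>
inline void
fillWith(std::vector<double> & out, std::size_t n, Functor && f)
{
  out.resize(n);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = f(i);
}

// Lambda version: same arithmetic as scaleDirect, expressed through fillWith.
inline void
scaleLambda(const std::vector<double> & in, std::vector<double> & out, double factor)
{
  fillWith(out, in.size(), [&in, factor](std::size_t i) { return factor * in[i]; });
}
```

Both functions should produce the same values for the same input; whether they also produce the same machine code is what the assembly comparison is meant to answer.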
Locally (M1 Max) I observed an 8% speedup on simple_diffusion.i with -r 7.
I am hoping someone can reproduce this so we can merge.
I doubt the test suite will show a consistent improvement, but let's see with this PR's run.