Update RecursiveArrayTools.jl compatibility to version 3 #2194

Merged
merged 43 commits into from
Jan 28, 2025

huiyuxie
Member

@huiyuxie huiyuxie commented Dec 8, 2024

Supplementary fix for #2150 and JoshuaLampert/DispersiveShallowWater.jl#163 (merged).

Contributor

github-actions bot commented Dec 8, 2024

Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

Purpose and scope

  • The PR has a single goal that is clear from the PR title and/or description.
  • All code changes represent a single set of modifications that logically belong together.
  • No more than 500 lines of code are changed or there is no obvious way to split the PR into multiple PRs.

Code quality

  • The code can be understood easily.
  • Newly introduced names for variables etc. are self-descriptive and consistent with existing naming conventions.
  • There are no redundancies that can be removed by simple modularization/refactoring.
  • There are no leftover debug statements or commented code sections.
  • The code adheres to our conventions and style guide, and to the Julia guidelines.

Documentation

  • New functions and types are documented with a docstring or top-level comment.
  • Relevant publications are referenced in docstrings (see example for formatting).
  • Inline comments are used to document longer or unusual code sections.
  • Comments describe intent ("why?") and not just functionality ("what?").
  • If the PR introduces a significant change or new feature, it is documented in NEWS.md with its PR number.

Testing

  • The PR passes all tests.
  • New or modified lines of code are covered by tests.
  • New or modified tests run in less than 10 seconds.

Performance

  • There are no type instabilities or memory allocations in performance-critical parts.
  • If the PR intent is to improve performance, before/after time measurements are posted in the PR.

Verification

  • The correctness of the code was verified using appropriate tests.
  • If new equations/methods are added, a convergence test has been run and the results
    are posted in the PR.

Created with ❤️ by the Trixi.jl community.

@huiyuxie
Member Author

huiyuxie commented Dec 8, 2024

I made the initialization algorithm for DiscreteCallback default to nothing. If that is not what you intended to change with the new struct, this PR definitely won't help.

@huiyuxie huiyuxie requested a review from ranocha December 8, 2024 09:34
@huiyuxie
Member Author

huiyuxie commented Dec 8, 2024

This will make the current package compatible with Julia >= 1.10 - the CI tests related to Julia < 1.10 will definitely fail.

@huiyuxie
Member Author

huiyuxie commented Dec 8, 2024

The dependency management is really a mess here - for example, the configurations for the test env, docs env, and main project env never align. This definitely makes debugging the env configuration hard.

@huiyuxie huiyuxie requested a review from sloede December 8, 2024 10:49
@huiyuxie
Member Author

huiyuxie commented Dec 8, 2024

Review @ranocha @sloede

@sloede
Member

sloede commented Dec 8, 2024

This will make the current package compatible with Julia >= 1.10

I don't think we are ready to do this just yet - or did we make a decision about this that I forgot about, @ranocha?


codecov bot commented Dec 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.42%. Comparing base (f9f1a74) to head (c0f889f).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2194   +/-   ##
=======================================
  Coverage   96.42%   96.42%           
=======================================
  Files         489      489           
  Lines       39376    39376           
=======================================
  Hits        37966    37966           
  Misses       1410     1410           
Flag Coverage Δ
unittests 96.42% <100.00%> (ø)


@huiyuxie
Member Author

huiyuxie commented Dec 9, 2024

Then the issue #1789 will be delayed - when do you plan to upgrade to Julia 1.10?

@sloede
Member

sloede commented Dec 9, 2024

Then the issue #1789 will be delayed - when do you plan to upgrade to Julia 1.10?

Which version of RecursiveArrayTools.jl is really required? Just v3 (i.e., v3.0.0 would be sufficient) or a specific, later version? With, e.g., v3.2 we would still be able to keep Julia v1.9 compatibility.

@huiyuxie
Member Author

huiyuxie commented Dec 9, 2024

Good question 👍 let me check

@huiyuxie
Member Author

huiyuxie commented Dec 9, 2024

Which version of RecursiveArrayTools.jl is really required?

I have no idea, I just know that >= 2.38.10 does not work and >= 3 works. Can you answer Michael's question @jlchan - I don't know which version you actually need

@ranocha
Member

ranocha commented Dec 10, 2024

I would hesitate to require Julia 1.10 and newer. We have seen some performance impacts in HPC settings, e.g., JuliaLang/julia#55009, JuliaLang/julia#50985

@jlchan
Contributor

jlchan commented Dec 10, 2024

Which version of RecursiveArrayTools.jl is really required?

I have no idea, I just know that >= 2.38.10 does not work and >= 3 works. Can you answer Michael's question @jlchan - I don't know which version you actually need

We'd need at least RecursiveArrayTools.jl 3.27.1, which is where @huiyuxie's PR fixing VectorOfArray was implemented.

@sloede
Member

sloede commented Dec 10, 2024

Which version of RecursiveArrayTools.jl is really required?

I have no idea, I just know that >= 2.38.10 does not work and >= 3 works. Can you answer Michael's question @jlchan - I don't know which version you actually need

We'd need at least RecursiveArrayTools.jl 3.27.1, which is where @huiyuxie's PR fixing VectorOfArray was implemented.

Too bad. That version already requires Julia v1.10 😞

@huiyuxie
Member Author

I would hesitate to require Julia 1.10 and newer. We have seen some performance impacts in HPC settings, e.g., JuliaLang/julia#55009, JuliaLang/julia#50985

Does that mean you prefer not to upgrade to version 1.10 until both of these issues are resolved?

@ranocha
Member

ranocha commented Dec 11, 2024

That's something we need to discuss.

@JoshuaLampert
Member

This is a bad situation. IMHO, we cannot wait until these two issues are resolved before we fix the incompatibility with newer versions of the whole SciML stack. Fixing this feels more and more urgent, and it also looks like the two Julia issues will not be fixed anytime soon.
Is there any chance to keep the compat bound for Julia at v1.8 but also allow both old and new versions of RecursiveArrayTools.jl (something like `RecursiveArrayTools = "2.38.10, 3"`)? Then one would need to implement different behavior depending on the version of RecursiveArrayTools.jl, i.e., for RecursiveArrayTools.jl < v3.27.1 we use the current implementation and for RecursiveArrayTools.jl >= v3.27.1 we implement the new one. In that case, the old implementation would be used for Julia < v1.10 and the new one for Julia >= v1.10. However, whether that is practically viable is another question. Do you see any major problem that rules this solution out?
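
A minimal sketch of the version-dependent dispatch proposed above, assuming the installed RecursiveArrayTools.jl version is available as a `VersionNumber`. The function name `select_implementation` and the returned symbols are hypothetical and not part of Trixi.jl; only the v3.27.1 threshold is taken from this thread.

```julia
# Hypothetical helper (not Trixi.jl API): pick a code path based on the
# installed RecursiveArrayTools.jl version, using the threshold named in
# this discussion (v3.27.1).
function select_implementation(rat_version::VersionNumber)
    if rat_version >= v"3.27.1"
        return :new      # code path written against the fixed VectorOfArray
    else
        return :current  # code path as it exists before this PR
    end
end

# In a package, the version could be queried with
# `pkgversion(RecursiveArrayTools)`. This is an assumption about how one
# might wire it up; `pkgversion` requires Julia >= 1.9, so a Julia v1.8
# compat bound would need a different mechanism.
for v in (v"2.38.10", v"3.2.0", v"3.27.1")
    println(v, " => ", select_implementation(v))
end
```

Whether two code paths are maintainable in practice is exactly the open question raised in the comment; the sketch only shows that the version check itself is cheap.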

@ranocha
Member

ranocha commented Jan 15, 2025

Based on the results

MPI                                                                  |   76     8     84  23m41.3s
  TreeMesh MPI                                                       |   20           20   6m17.3s
  P4estMesh MPI 2D                                                   |   14           14   2m21.1s
  T8codeMesh MPI 2D                                                  |   14           14   1m41.7s
  P4estMesh MPI 3D                                                   |   17     5     22   8m34.6s
    Examples 3D                                                      |   17     5     22   8m34.6s
      elixir_advection_basic.jl                                      |    2            2   1m06.6s
      elixir_advection_amr.jl                                        |    1     1      2     56.7s
      elixir_advection_amr_unstructured_curved.jl                    |    1     1      2   1m12.4s
      elixir_advection_restart.jl                                    |    2            2      1.7s
      elixir_advection_cubed_sphere.jl                               |    2            2      6.4s
      elixir_euler_source_terms_nonconforming_unstructured_curved.jl |    1     1      2     49.3s
      elixir_euler_source_terms_nonperiodic.jl                       |    2            2     42.0s
      elixir_euler_ec.jl                                             |    2            2     59.6s
      elixir_euler_source_terms_nonperiodic_hohqmesh.jl              |    2            2     46.7s
      elixir_mhd_alfven_wave_nonconforming.jl                        |    1     1      2   1m45.7s
  T8codeMesh MPI 3D                                                  |   11     3     14   4m45.1s
    Examples 3D                                                      |   11     3     14   4m45.1s
      elixir_advection_basic.jl                                      |    2            2     45.1s
      elixir_advection_amr.jl                                        |    1     1      2     50.9s
      elixir_advection_amr_unstructured_curved.jl                    |    1     1      2     54.4s
      elixir_advection_restart.jl                                    |    2            2      1.6s
      elixir_euler_source_terms_nonconforming_unstructured_curved.jl |    1     1      2     42.9s
      elixir_euler_source_terms_nonperiodic.jl                       |    2            2     34.2s
      elixir_euler_ec.jl                                             |    2            2     52.3s

I guess the issue is somewhere in the MPI mortars with p4est/t8code meshes.

@benegee
Contributor

benegee commented Jan 15, 2025

The Makie visualization throws an error

Could be Makie.cameracontrols(ax.scene).controls.up_key now.

@ranocha
Member

ranocha commented Jan 15, 2025

Could you please check that and push it to this branch?

@benegee
Contributor

benegee commented Jan 16, 2025

So far I found that running this example

module TestAllocations
using Trixi
trixi_include(@__MODULE__,
              joinpath(pwd(), "../examples/t8code_3d_dgsem/elixir_advection_amr.jl"))
@unpack mesh, equations, solver, cache = semi
@show @allocated Trixi.start_mpi_send!(cache.mpi_cache, mesh, equations, solver, cache)
end

on 2 MPI ranks results in O(1e6) @allocated output on one of the ranks, while the other has O(1000). When running with 1 rank only, or when running with julia 1.11, or when using another example like elixir_advection_basic.jl, all values are in the O(1000) range.

@ranocha
Member

ranocha commented Jan 17, 2025

@vchuravy Does the result from @benegee trigger an idea of what's wrong here?

@benegee
Contributor

benegee commented Jan 20, 2025

This might be a standalone, more minimal failing test when comparing Julia 1.10 and 1.11. Could someone verify?

module TestAllocations

function viewassign!(target, source)
    @views target[1:4] .= vec(source[1, :, 1, :, 1, 1])
    return nothing
end

mock_target = Array{Float64}(undef, 30)
mock_source = fill(1.0, (1,1,1,4,4,1))

viewassign!(mock_target, mock_source)
@show @allocated viewassign!(mock_target, mock_source)

end

@ranocha
Member

ranocha commented Jan 21, 2025

I get the following results:

Julia 1.10:

julia> @allocated viewassign!(mock_target, mock_source)
1344

Julia 1.11:

julia> @allocated viewassign!(mock_target, mock_source)
0

However, I also get the 1344 bytes of allocations on Julia 1.9... Thus, if this is the issue, I wonder why it worked before and works on main...
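
One way to rule out measurement effects from non-constant globals is to take the measurement inside a function after a warm-up call. The following is only a sketch under that assumption: `measure_allocations` is a hypothetical helper name, and nothing here claims that the reported numbers would change.

```julia
# Same kernel as in the minimal example above.
function viewassign!(target, source)
    @views target[1:4] .= vec(source[1, :, 1, :, 1, 1])
    return nothing
end

# Hypothetical helper: measure inside a function so that the arguments have
# concrete types at the call site, and warm up first so that compilation is
# not counted.
function measure_allocations(target, source)
    viewassign!(target, source)                    # warm-up call
    return @allocated viewassign!(target, source)  # bytes allocated
end

mock_target = Array{Float64}(undef, 30)
mock_source = fill(1.0, (1, 1, 1, 4, 4, 1))

bytes = measure_allocations(mock_target, mock_source)
@show bytes
```

Note that `@allocated` reports bytes, not the number of allocations, so the value is best compared across Julia versions on the same machine.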

@benegee
Contributor

benegee commented Jan 21, 2025

Thanks for checking!

This is what stripping down the code led me to, but apparently it is a wrong track. It is not even related to the changes in this PR anymore.

@ranocha ranocha removed the breaking label Jan 28, 2025
Member

@ranocha ranocha left a comment

Finally ready to be merged when CI passes

7 participants