Merge pull request #54 from AFD-Illinois/fix/compiles-and-docs
Touch-ups before formal release
bprather authored Dec 15, 2023
2 parents 3deb4a7 + ed68b68 commit 61299ab
Showing 19 changed files with 233 additions and 263 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -24,6 +24,7 @@ kharma_parsed_*.par
log_*.txt
bondi_analytic_*.txt
atmosphere_soln_*.txt
shock_soln_*.txt

# Editor documents
.project
5 changes: 1 addition & 4 deletions CMakeLists.txt
@@ -38,6 +38,7 @@ set(Kokkos_ENABLE_CUDA_CONSTEXPR ON CACHE BOOL "KHARMA Override")
set(Kokkos_ENABLE_HWLOC OFF CACHE BOOL "KHARMA Override") # Possible speed improvement?
set(Kokkos_ENABLE_AGGRESSIVE_VECTORIZATION ON CACHE BOOL "KHARMA Override")

# For including the current git revision in the exe
include(GetGitRevisionDescription)
get_git_head_revision(GIT_REFSPEC GIT_SHA1)
git_describe_working_tree(GIT_VERSION --tags)
@@ -53,10 +54,6 @@ else()
include_directories(SYSTEM ${MPI_INCLUDE_PATH})
endif()

# OpenMP is usually used host-side. We're letting Parthenon/Kokkos
# find it though, as sometimes we require disabling it fully
#find_package(OpenMP REQUIRED)

# Build Parthenon
add_subdirectory(external/parthenon)
include_directories(external/parthenon/src)
11 changes: 0 additions & 11 deletions Dockerfile

This file was deleted.

25 changes: 9 additions & 16 deletions README.md
@@ -1,7 +1,7 @@
# KHARMA
KHARMA is an implementation of the HARM scheme for general relativistic magnetohydrodynamics (GRMHD) in C++. It is based on the Parthenon AMR infrastructure, using Kokkos for parallelism and GPU support. It is composed of modular "packages," which in theory make it easy to add or swap components representing different physical processes.

The project is capable of the same GRMHD functions found in e.g. [iharm3d](https://github.com/AFD-Illinois/iharm3d). Support for adaptive mesh refinement is planned, but not yet working for runs involving magnetic field transport.
KHARMA is capable of closely matching other HARM implementations, e.g. [iharm3d](https://github.com/AFD-Illinois/iharm3d). However, it also extends the scheme with additional options for magnetic field transport, reconstruction, etc. Notably, it implements a split face-centered CT scheme, allowing static and adaptive mesh refinement.

## Prerequisites
KHARMA requires that the system have a C++17-compliant compiler, MPI, and parallel HDF5. All other dependencies are included as submodules, and can be checked out with `git` by running
@@ -18,29 +18,22 @@ Old submodules are a common cause of compile errors!
## Compiling
On directly supported systems, or systems with standard install locations, you may be able to run:
```bash
./make.sh clean
./make.sh clean [cuda hip sycl]
```
And (possibly) the following to compile for GPU with CUDA:
```bash
./make.sh clean cuda
```
after a *successful* compile, subsequent invocations can omit `clean`.
after a *successful* compile, subsequent invocations can omit `clean`. If this command fails on supported machines (those with a file in `machines/`), please open an issue. Broken builds aren't uncommon, as HPC machines change software all the time.

If (when) these fail, take a look at the [wiki page](https://github.com/AFD-Illinois/kharma/wiki/Building-KHARMA), and the `make.sh` source code. At worst this should involve running something like
```bash
PREFIX_PATH="/absolute/path/to/phdf5;/absolute/path/to/cuda" HOST_ARCH=CPUVER DEVICE_ARCH=GPUVER ./make.sh clean cuda
```
Where `CPUVER` and `GPUVER` are the strings used by Kokkos to denote a particular architecture & set of compile flags (Note `make.sh` needs only the portion of the flag *after* `Kokkos_ARCH_`).
If running KHARMA on a new machine (or repairing the build on an old one), take a look at the [wiki page](https://github.com/AFD-Illinois/kharma/wiki/Building-KHARMA) describing the build system.

## Running
Run a particular problem with e.g.
```bash
$ ./kharma.host -i pars/orszag_tang.par
$ ./kharma.host -i pars/tests/orszag_tang.par
```
note that *all* options are set at runtime. The single KHARMA binary can run any of the parameter files in `pars/`, and indeed this is checked as part of the regression tests. Note that you can still disable some sub-systems manually at compile time; in that case, the accompanying problems will crash.

KHARMA benefits from certain runtime environment variables and CPU pinning, included in a short wrapper script `run.sh`. Note that some MPI implementations require that KHARMA be run using `mpirun`, even for a single process, and may cause errors or hangs otherwise.
KHARMA benefits from certain runtime environment variables and CPU pinning, included in a short wrapper script `run.sh`. This script is provided mostly as an optional convenience, and an example of how to construct your own batch scripts for running KHARMA in production. Other example batch scripts are in the `scripts/batch/` folder.
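As a rough sketch of such a launch (assuming `run.sh` simply forwards its arguments to the KHARMA binary — this diff does not show the script itself), a two-rank MPI run might look like:
```bash
# Hypothetical invocation: run.sh is assumed to pass its arguments through to kharma.host
mpirun -n 2 ./run.sh -i pars/tests/orszag_tang.par
```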

Except for performance tuning, KHARMA has no compile time parameters: all of the parameters specifying a simulation are listed in the input "deck" `problem_name.par`. Several sample inputs corresponding to standard tests and astrophysical systems are included in `pars/`. Further information can be found on the [wiki page](https://github.com/AFD-Illinois/kharma/wiki/Running-KHARMA).
Further information can be found on the [wiki page](https://github.com/AFD-Illinois/kharma/wiki/Running-KHARMA).

## Hacking
KHARMA has some preliminary documentation for developers, hosted in its GitHub [wiki](https://github.com/AFD-Illinois/kharma/wiki).
@@ -50,4 +43,4 @@ KHARMA is made available under the BSD 3-clause license included in each file an

This repository also carries a substantial portion of the [Kokkos Kernels](https://github.com/kokkos/kokkos-kernels), in the directory `kharma/implicit/kokkos-kernels-pivoted`, which is provided under the license included in that directory.

Submodules of this repository, [Parthenon](https://github.com/parthenon-hpc-lab/parthenon) and [mpark::variant](https://github.com/mpark/variant) are made available under their own licenses.
Submodules of this repository are subject to their own licenses.
22 changes: 12 additions & 10 deletions kharma/CMakeLists.txt
@@ -92,11 +92,16 @@ else()
endif()

# OPTIONS
# These are almost universally performance trade-offs
# These are almost universally performance trade-offs,
# or disabling features that need dependencies.
# TODO is there any way to make compile options less painful in CMake?
option(FUSE_FLUX_KERNELS "Bundle the usual four flux calculation kernels (floors,R,L,apply) into one" ON)
option(FUSE_FLOOR_KERNELS "Bundle applying the floors and ceilings into one kernel" ON)
option(FAST_CARTESIAN "Break operation in curved spacetimes to make Cartesian Minkowski space computations faster" OFF)
option(KHARMA_DISABLE_IMPLICIT "Disable the implicit solver, which requires bundled kokkos-kernels. Default false" OFF)
option(KHARMA_DISABLE_CLEANUP "Disable the magnetic field cleanup module, which requires recent Parthenon. Default false" OFF)
option(KHARMA_TRACE "Compile with tracing: print entry and exit of important functions. Default false" OFF)

if(FUSE_FLUX_KERNELS)
target_compile_definitions(${EXE_NAME} PUBLIC FUSE_FLUX_KERNELS=1)
else()
@@ -108,13 +113,11 @@ else()
target_compile_definitions(${EXE_NAME} PUBLIC FUSE_FLOOR_KERNELS=0)
endif()
if(FAST_CARTESIAN)
message("Compiling for Cartesian coordinates only. GRMHD will be disabled!")
target_compile_definitions(${EXE_NAME} PUBLIC FAST_CARTESIAN=1)
else()
target_compile_definitions(${EXE_NAME} PUBLIC FAST_CARTESIAN=0)
endif()
option(KHARMA_DISABLE_IMPLICIT "Disable the implicit solver, which requires bundled kokkos-kernels. Default false" OFF)
option(KHARMA_DISABLE_CLEANUP "Disable the magnetic field cleanup module, which requires recent Parthenon. Default false" OFF)
option(KHARMA_TRACE "Compile with tracing: print entry and exit of important functions" OFF)
if(KHARMA_DISABLE_IMPLICIT)
message("Compiling without the implicit solver. Extended GRMHD will be disabled!")
target_compile_definitions(${EXE_NAME} PUBLIC DISABLE_IMPLICIT=1)
@@ -134,14 +137,13 @@ if(KHARMA_TRACE)
else()
target_compile_definitions(${EXE_NAME} PUBLIC TRACE=0)
endif()
option(KHARMA_DISABLE_IMPLICIT "Compile the implicit solver, requiring kokkos-kernels. Default true" OFF)
option(KHARMA_TRACE "Compile with tracing: print entry and exit of important functions" OFF)
if(KHARMA_DISABLE_IMPLICIT)
message("Compiling without the implicit solver. Extended GRMHD will be disabled!")
target_compile_definitions(${EXE_NAME} PUBLIC ENABLE_IMPLICIT=0)
if(KHARMA_DISABLE_MPI)
message("Compiling without MPI!")
target_compile_definitions(${EXE_NAME} PUBLIC ENABLE_MPI=0)
else()
target_compile_definitions(${EXE_NAME} PUBLIC ENABLE_IMPLICIT=1)
target_compile_definitions(${EXE_NAME} PUBLIC ENABLE_MPI=1)
endif()

# FLAGS
if(CMAKE_BUILD_TYPE)
if(${CMAKE_BUILD_TYPE} STREQUAL "Debug")
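As an illustration of the options declared above, a hedged sketch of a manual configure step (KHARMA is normally built through `make.sh`, and the exact CMake invocation it uses is not shown in this diff):
```bash
# Hypothetical configure line; the -D flag names match the option() declarations in kharma/CMakeLists.txt
cmake -DFUSE_FLUX_KERNELS=ON -DKHARMA_DISABLE_IMPLICIT=ON -DKHARMA_TRACE=OFF /path/to/kharma
```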
5 changes: 3 additions & 2 deletions kharma/boundaries/boundaries.cpp
@@ -475,7 +475,7 @@ TaskStatus KBoundaries::FixFlux(MeshData<Real> *md)

if (bdir > ndim) continue;

// Set ranges based
// Set ranges for entire width. Probably not needed for fluxes but won't hurt
IndexRange ib = ibe, jb = jbe, kb = kbe;
// Range for inner_x1 bounds is first face only, etc.
if (bdir == 1) {
@@ -520,7 +520,8 @@ TaskStatus KBoundaries::FixFlux(MeshData<Real> *md)
"zero_flux_" + bname, 0, F.GetDim(4) - 1, kb.s, kb.e, jb.s, jb.s, ib.s, ib.e,
KOKKOS_LAMBDA(const int &p, const int &k, const int &j, const int &i) {
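// Zero the flux through this boundary face so no conserved quantities cross the domain edge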
F.flux(bdir, p, k, j, i) = 0.;
});
}
);
}
}
}
4 changes: 4 additions & 0 deletions kharma/decs.hpp
@@ -155,7 +155,11 @@ inline int MPIRank()
}
inline int MPIBarrier()
{
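// Guarded so builds compiled with ENABLE_MPI=0 still work; the stub is a successful no-op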
#if ENABLE_MPI
return MPI_Barrier(MPI_COMM_WORLD);
#else
return 0;
#endif
}

// A few generic "NDArray" overloads for readability.
17 changes: 9 additions & 8 deletions kharma/floors/floors_functions.hpp
@@ -260,31 +260,32 @@ KOKKOS_INLINE_FUNCTION int apply_floors(const GRCoordinates& G, const VariablePa
// Update the conserved variables
Flux::p_to_u(G, P, m_p, emhd_params, gam, k, j, i, U, m_u, loc);
} else {
// Add the material in the normal observer frame, by:
// Adding the floors to the primitive variables
// Add the material in the normal observer frame.
// 1. Calculate how much material we're adding.
// This is an estimate, as it's what we'd have to do in fluid frame
const Real rho_add = m::max(0., rhoflr_max - rho);
const Real u_add = m::max(0., uflr_max - u);
const Real uvec[NVEC] = {0}, B[NVEC] = {0};

// Calculating the corresponding conserved variables
// 2. Calculate the increase in conserved mass/energy corresponding to the new material.
Real rho_ut, T[GR_DIM];
GRMHD::p_to_u_mhd(G, rho_add, u_add, uvec, B, gam, k, j, i, rho_ut, T, loc);

// Add new conserved mass/energy to the current "conserved" state,
// and to the local primitives as a guess
// 3. Add new conserved mass/energy to the current "conserved" state.
// Also add to the local primitives as a guess
P(m_p.RHO, k, j, i) += rho_add;
P(m_p.UU, k, j, i) += u_add;
// Add any velocity here
U(m_u.RHO, k, j, i) += rho_ut;
U(m_u.UU, k, j, i) += T[0]; // Note this shouldn't be a single loop: m_u.U1 != m_u.UU + 1 necessarily
U(m_u.UU, k, j, i) += T[0]; // Note that m_u.U1 != m_u.UU + 1 necessarily
U(m_u.U1, k, j, i) += T[1];
U(m_u.U2, k, j, i) += T[2];
U(m_u.U3, k, j, i) += T[3];

// Recover primitive variables from conserved versions
// TODO selector here when we get more
// TODO selector here when we get more options
Inverter::Status pflag = Inverter::u_to_p<Inverter::Type::onedw>(G, U, m_u, gam, k, j, i, P, m_p, loc);
// If that fails, we've effectively already applied the floors in fluid-frame to the prims,
// 4. If the inversion fails, we've effectively already applied the floors in fluid-frame to the prims,
// so we just formalize that
if (Inverter::failed(pflag)) {
Flux::p_to_u(G, P, m_p, emhd_params, gam, k, j, i, U, m_u, loc);
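In equations, step 1 above adds material according to the maximal floor values (matching the `m::max` calls in the diff):

$$
\Delta\rho = \max\left(0,\ \rho_{\rm flr,max} - \rho\right), \qquad
\Delta u = \max\left(0,\ u_{\rm flr,max} - u\right),
$$

injected with zero velocity and zero magnetic field in the normal-observer frame, so that step 2 reduces to converting $(\Delta\rho,\ \Delta u)$ at rest into increments of the conserved mass and energy-momentum.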
131 changes: 0 additions & 131 deletions machines/bp.sh

This file was deleted.

