Generate database step taking too long to complete in specfem 4.1.1 GPU version #1742

Closed
padesh opened this issue Oct 1, 2024 · 7 comments

@padesh

padesh commented Oct 1, 2024

Hi everyone,

I recently updated SPECFEM3D Cartesian from 4.0.0 to 4.1.1 (GPU version). In the new version, the database generation step takes much longer than in the previous version. I noticed that most of the time is spent in this step:

` ...setting up mesh adjacency

  mesh adjacency:
  total number of elements in this slice  =       126360

  maximum number of neighbors allowed     =          300
  minimum array memory required per slice =    145.089569     (MB)

  using kd-tree search radius             =    1050.00000    
           10  % - elapsed time:   59.2969856     s
           20  % - elapsed time:   118.541382     s
           30  % - elapsed time:   177.748108     s
           40  % - elapsed time:   236.961090     s
           50  % - elapsed time:   296.169128     s
           60  % - elapsed time:   355.404327     s
           70  % - elapsed time:   414.638702     s
           80  % - elapsed time:   473.861664     s
           90  % - elapsed time:   533.240967     s
          100  % - elapsed time:   592.426575     s

  maximum search elements                                      =          328
  maximum of actual search elements (after distance criterion) =          327

  estimated maximum element size            =    300.000000    
  maximum distance between neighbor centers =    984.885803    

  maximum neighbors found per element =          124
      (maximum neighbor of neighbors) =           98
  total number of neighbors           =     14527224

  Elapsed time for detection of neighbors in seconds =    593.135193    `

and this was not there in previous versions.

The same model run with the CPU version takes about 10 s for database generation on comparable compute.

Any advice on how to speed it up?

Thanks.

@danielpeter
Member

right, this new adjacency array was added to improve the accuracy of source and receiver locations. the setup takes place in the mesher (xgenerate_databases), as you noted. since the mesh setup only needs to run once, no matter how many different source/receiver setups you use, this is a one-time cost.

however, it takes quite a bit of time to set up, especially if there are a lot of elements per slice, as in your case. let me see if we can speed it up further for such large mesh slices...

in the meantime, you can try running the simulation with a higher NPROC to cut down the number of elements per slice and improve the parallelization of this mesh adjacency setup.
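To get a rough feel for what a higher NPROC might buy, here is a back-of-the-envelope sketch using the numbers from the log above. It assumes an even partition of elements across slices and a constant per-element search cost; both are simplifying assumptions, and the function names are made up for illustration. Real timings will depend on the mesh and the partitioner.

```python
# Numbers copied from the log in the original report.
ELAPSED_S = 592.426575       # elapsed time at 100 %
ELEMENTS_PER_SLICE = 126360  # "total number of elements in this slice"


def cost_per_element(elapsed_s: float, n_elements: int) -> float:
    """Average neighbor-search time per element, in seconds."""
    return elapsed_s / n_elements


def projected_setup_time(elapsed_s: float, n_elements: int, nproc_factor: int) -> float:
    """Projected adjacency setup time per slice if NPROC is multiplied by nproc_factor.

    Assumption (not from SPECFEM): elements are split evenly and the
    per-element cost stays the same as in the measured run.
    """
    elements_after = n_elements / nproc_factor
    return cost_per_element(elapsed_s, n_elements) * elements_after


if __name__ == "__main__":
    for factor in (1, 2, 4, 8):
        t = projected_setup_time(ELAPSED_S, ELEMENTS_PER_SLICE, factor)
        print(f"NPROC x{factor}: ~{t:.0f} s per slice")
```

Since the slices are processed in parallel, the per-slice time is what determines the wall-clock time of this step under these assumptions.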

@danielpeter
Member

this has been addressed by PR #1743 - if you can, update to the devel branch version and see if the meshing is faster now for your setup.

@code-cullison

code-cullison commented Oct 1, 2024

@danielpeter: I had the same problem, thanks for fixing this. Maybe I've misunderstood, but are the source and receiver locations not independent of the databases/mesh (as long as the sources and receivers are in the spatial domain of the model)? In other words, if I change my source/receiver locations, should I run xgenerate_databases again?

On a side note, I've noticed that the solver will change my receiver locations to a position outside (above) my mesh -- I think this is related to Issue #1621. For example, my mesh in Z goes from 0 m down to -255 m (~5 m cell size). When I set my receiver burial/depth to -PI (-3.14... m), the solver changes their location to a depth/burial of 2e-5 (positive Z-coord). If I set the receiver depths to 0 m or -5 m (a multiple of the cell size), then the solver doesn't change the z-coord. This doesn't happen with the source (CMTSOLUTION).

@padesh
Author

padesh commented Oct 1, 2024

@danielpeter

Yes, it is much faster now.

`  mesh adjacency:
  total number of elements in this slice          =       118584

  maximum number of neighbors allowed             =          300
  minimum array memory required per slice         =    136.160980     (MB)

  maximum number of elements per shared node      =            8
  node-to-element array memory required per slice =    235.257721     (MB)

           10  % - elapsed time:  0.596405625     s
           20  % - elapsed time:   1.14786458     s
           30  % - elapsed time:   1.70189250     s
           40  % - elapsed time:   2.25737548     s
           50  % - elapsed time:   2.82469821     s
           60  % - elapsed time:   3.39398766     s
           70  % - elapsed time:   3.96057320     s
           80  % - elapsed time:   4.51502466     s
           90  % - elapsed time:   5.06956339     s
          100  % - elapsed time:   5.54031801     s

  maximum neighbors found per element =           26
      (maximum neighbor of neighbors) =           98
  total number of neighbors           =     13616280

  Elapsed time for detection of neighbors in seconds =    6.20918417   `

Thanks for your prompt response and fix.

-Adesh
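For readers wondering why the new log reports a node-to-element array instead of a kd-tree search radius: one common way to find element neighbors is to invert the element-to-node connectivity and collect all elements that share a node. The sketch below illustrates that general idea on a toy mesh. It is only an illustration under that assumption, not the code from PR #1743, and all function names are hypothetical.

```python
from collections import defaultdict


def build_node_to_elements(elements):
    """Map each global node id to the list of elements that use it."""
    node_to_elems = defaultdict(list)
    for elem_id, nodes in enumerate(elements):
        for node in nodes:
            node_to_elems[node].append(elem_id)
    return node_to_elems


def build_adjacency(elements):
    """Return, for each element, the sorted ids of elements sharing at least one node."""
    node_to_elems = build_node_to_elements(elements)
    adjacency = []
    for elem_id, nodes in enumerate(elements):
        neighbors = set()
        for node in nodes:
            neighbors.update(node_to_elems[node])
        neighbors.discard(elem_id)  # an element is not its own neighbor
        adjacency.append(sorted(neighbors))
    return adjacency


if __name__ == "__main__":
    # Hypothetical strip of three quads, each sharing an edge with the next one.
    elements = [(0, 1, 5, 4), (1, 2, 6, 5), (2, 3, 7, 6)]
    print(build_adjacency(elements))  # [[1], [0, 2], [1]]
```

In a regular hexahedral mesh, an interior corner node is shared by 8 elements and an interior element has 26 node-sharing neighbors, which is consistent with the "maximum number of elements per shared node = 8" and "maximum neighbors found per element = 26" lines in the log above.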

@danielpeter
Member

great, glad it works - thanks for the feedback :)

@danielpeter
Member

> @danielpeter: I had the same problem, thanks for fixing this. Maybe I've misunderstood, but are the source and receiver locations not independent of the databases/mesh (as long as the sources and receivers are in the spatial domain of the model)? In other words, if I change my source/receiver locations, should I run xgenerate_databases again?

sorry, i should have phrased that sentence better. you need to run the mesher only once. there is no need to rerun the mesher when you change source/receiver positions. that's the whole point of separating mesher & solver for these SPECFEM simulations.

> On a side note, I've noticed that the solver will change my receiver locations to a position outside (above) my mesh -- I think this is related to Issue #1621. For example, my mesh in Z goes from 0 m down to -255 m (~5 m cell size). When I set my receiver burial/depth to -PI (-3.14... m), the solver changes their location to a depth/burial of 2e-5 (positive Z-coord). If I set the receiver depths to 0 m or -5 m (a multiple of the cell size), then the solver doesn't change the z-coord. This doesn't happen with the source (CMTSOLUTION).

this sounds like a confusion about the input format of the receiver locations, in particular about the burial depth. details are described here in the wiki: https://github.com/SPECFEM/specfem3d/wiki/05_running_the_solver

note that burial depth is given in [m] and indicates depth, which is measured in the negative Z-direction. therefore, if you set -PI as the depth, the solver tries to locate the receiver above your surface. since you want the receiver buried below the surface, you would use +PI as the burial depth.
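The sign convention described here can be illustrated with a tiny sketch. It only models the statement above (burial depth is positive downward from the surface); the function name and the flat surface at z = 0 are assumptions for illustration, and it does not reproduce how the solver actually locates receivers or how other Par_file settings change the interpretation.

```python
import math


def receiver_z(surface_z: float, burial_depth_m: float) -> float:
    """Z-coordinate of a receiver buried burial_depth_m below surface_z.

    Burial depth is counted positive downward, so it is subtracted
    from the surface elevation.
    """
    return surface_z - burial_depth_m


if __name__ == "__main__":
    surface = 0.0  # top of the mesh in the example from the question
    print(receiver_z(surface, math.pi))   # negative z: below the surface
    print(receiver_z(surface, -math.pi))  # positive z: above the surface
```

With a negative burial depth the requested position ends up above the mesh, which matches the behavior reported in the question.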

@padesh padesh closed this as completed Oct 3, 2024
@code-cullison

Thanks, @danielpeter. I have set 'USE_SOURCES_RECEIVERS_Z = .true.' and the mesh has a negative z-axis range (0, -255), so I thought that z1,z2 would also need to be negative.
