MPI_INIT fails on all ranks but rank 0 when using mpi4py #17

Open
EdCaunt opened this issue Feb 19, 2024 · 1 comment


EdCaunt commented Feb 19, 2024

Starting a Python shell with tmpi 2 python and running from mpi4py import MPI results in a failed MPI_INIT on all ranks but the first (for any number of ranks, as far as I can tell), with the following message:

Python 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from mpi4py import MPI
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_mpi_instance_init failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
[dyn3168-24:00000] *** An error occurred in MPI_Init_thread
[dyn3168-24:00000] *** reported by process [2283732993,1]
[dyn3168-24:00000] *** on a NULL communicator
[dyn3168-24:00000] *** Unknown error
[dyn3168-24:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[dyn3168-24:00000] ***    and MPI will try to terminate your MPI job as well)

Pane is dead (status 14, Mon Feb 19 14:17:33 2024)

This is on macOS with Open MPI 5.0.2 installed via Homebrew. A script containing this import works fine when run with mpiexec -n 2 --oversubscribe python script.py.
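
For anyone trying to reproduce this, here is a minimal sketch of what such a script.py might look like. It is not the original script. It prints the launcher-related environment variables before the import, so the output under tmpi and under mpiexec can be compared even when MPI_INIT aborts the process. The helper name collect_mpi_env and the choice of the OMPI_ and PMIX_ prefixes are assumptions made for illustration, not something confirmed to be related to the failure.

```python
import os
import sys

# Assumption: variables with these prefixes are the ones set by the Open MPI
# launcher; this list is illustrative and may be incomplete.
MPI_ENV_PREFIXES = ("OMPI_", "PMIX_")


def collect_mpi_env(environ):
    """Return the launcher-related variables from a mapping, sorted by name."""
    return {
        name: environ[name]
        for name in sorted(environ)
        if name.startswith(MPI_ENV_PREFIXES)
    }


def main():
    # Print the environment first: if MPI_INIT fails, the process aborts
    # during the import below and nothing after it would run.
    for name, value in collect_mpi_env(os.environ).items():
        print(f"{name}={value}", file=sys.stderr, flush=True)

    try:
        # With mpi4py's default settings, MPI is initialised at import time.
        from mpi4py import MPI
    except ImportError as exc:
        print(f"mpi4py could not be imported: {exc}", file=sys.stderr)
        return 1

    comm = MPI.COMM_WORLD
    print(f"rank {comm.Get_rank()} of {comm.Get_size()} initialised")
    return 0


if __name__ == "__main__":
    main()
```

Running it once with mpiexec -n 2 --oversubscribe python script.py and once with tmpi 2 python script.py, then diffing the stderr output per rank, would show whether the two launch paths hand different settings to the ranks.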

Any idea why this might be happening? The behaviour seems to be TMPI-specific. It worked initially after install, but then started throwing this error, and reinstalling both Open MPI and TMPI hasn't fixed the issue, as far as I can tell.

@kristiansordal

Having the same issue with a C project. It works fine with mpirun -n 4 xterm -e lldb <program>, as well as when running without a debugger attached. Please let me know if you ever find a solution.
