fewer MPI processes can also generate a segmentation fault
Expected behavior
The simulation crashes, and the error output indicates that a segmentation fault has occurred.
The stderr dump from minimal.py on NESTv3.6 with 32 MPI processes is shown below:
[jsfc114:24182:0:24182] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7473)
==== backtrace (tid: 24182) ====
0 0x000000000003e6f0 __GI___sigaction() :0
1 0x0000000000655387 nest::Connector<nest::static_synapse<nest::TargetIdentifierPtrRport> >::send() ???:0
2 0x000000000043cfd6 nest::EventDeliveryManager::deliver_events_<nest::SpikeData>() event_delivery_manager.cpp:0
3 0x000000000043f29f nest::EventDeliveryManager::deliver_events() ???:0
4 0x000000000040abaa nest::SimulationManager::update_() simulation_manager.cpp:0
5 0x00000000000156e6 GOMP_parallel() /dev/shm/swmanage/jusuf/GCCcore/12.3.0/system-system/gcc-12.3.0/stage3_obj/x86_64-pc-linux-gnu/libgomp/../../../libgomp/parallel.c:178
6 0x00000000000156e6 GOMP_parallel_end() /dev/shm/swmanage/jusuf/GCCcore/12.3.0/system-system/gcc-12.3.0/stage3_obj/x86_64-pc-linux-gnu/libgomp/../../../libgomp/parallel.c:140
7 0x00000000000156e6 GOMP_parallel() /dev/shm/swmanage/jusuf/GCCcore/12.3.0/system-system/gcc-12.3.0/stage3_obj/x86_64-pc-linux-gnu/libgomp/../../../libgomp/parallel.c:179
8 0x000000000040c067 nest::SimulationManager::update_() ???:0
9 0x000000000040c96c nest::SimulationManager::call_update_() ???:0
10 0x0000000000411129 nest::SimulationManager::run() ???:0
11 0x00000000003f5d7d nest::run() ???:0
12 0x00000000003f5e51 nest::simulate() ???:0
13 0x00000000003b1836 nest::NestModule::SimulateFunction::execute() ???:0
14 0x00000000000bac21 SLIInterpreter::execute_() interpret.cc:0
15 0x0000000000030d04 __pyx_pw_12pynestkernel_10NESTEngine_9run() pynestkernel.cxx:0
16 0x00000000001d5e9c _PyEval_EvalFrameDefault() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/ceval.c:5225
17 0x00000000001d5e9c _PyEval_EvalFrameDefault() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/ceval.c:5226
18 0x00000000001ce50a _PyEval_EvalFrame() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/./Include/internal/pycore_ceval.h:73
19 0x00000000001ce50a _PyEval_Vector() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/ceval.c:6443
20 0x00000000001d6c3a _PyEval_EvalFrameDefault() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/ceval.c:5380
21 0x00000000001ce50a _PyEval_EvalFrame() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/./Include/internal/pycore_ceval.h:73
22 0x00000000001ce50a _PyEval_Vector() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/ceval.c:6443
23 0x00000000002562e1 PyEval_EvalCode() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/ceval.c:1154
24 0x0000000000273443 run_eval_code_obj() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/pythonrun.c:1714
25 0x000000000026fbaa run_mod() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/pythonrun.c:1735
26 0x00000000002851e1 pyrun_file() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/pythonrun.c:1630
27 0x0000000000284054 _PyRun_SimpleFileObject() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/pythonrun.c:440
28 0x0000000000283c24 _PyRun_AnyFileObject() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Python/pythonrun.c:79
29 0x000000000027df4c pymain_run_file_obj() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Modules/main.c:360
30 0x000000000027df4c pymain_run_file() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Modules/main.c:379
31 0x000000000027df4c pymain_run_python() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Modules/main.c:601
32 0x000000000027df4c Py_RunMain() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Modules/main.c:680
33 0x0000000000246c67 Py_BytesMain() /dev/shm/swmanage/jusuf/Python/3.11.3/GCCcore-12.3.0/Python-3.11.3/Modules/main.c:734
34 0x0000000000029590 __libc_start_call_main() ???:0
35 0x0000000000029640 __libc_start_main_alias_2() :0
36 0x000000000040106e _start() ???:0
=================================
<PSP:r0000028:Backtrace after SIGSEGV (Invalid memory reference):>
<PSP:r0000028:# 0: /p/software/jusuf/stages/2024/software/pscom/5-default-GCCcore-12.3.0/lib/libpscom.so.2(+0xb4e4) [0x1529ccad14e4]>
<PSP:r0000028:# 1: /usr/lib64/libc.so.6(+0x3e6f0) [0x152a4963e6f0]>
<PSP:r0000028:# 2: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(_ZN4nest9ConnectorINS_14static_synapseINS_24TargetIdentifierPtrRportEEEE4sendEmmRKSt6vectorIPNS_14ConnectorModelESaIS7_EERNS_5EventE+0x87) [0x152a3bc69387]>
<PSP:r0000028:# 3: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(+0x43cfd6) [0x152a3ba50fd6]>
<PSP:r0000028:# 4: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(_ZN4nest20EventDeliveryManager14deliver_eventsEm+0x6f) [0x152a3ba5329f]>
<PSP:r0000028:# 5: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(+0x40abaa) [0x152a3ba1ebaa]>
<PSP:r0000028:# 6: /p/software/jusuf/stages/2024/software/GCCcore/12.3.0/lib64/libgomp.so.1(GOMP_parallel+0x46) [0x152a406b06e6]>
<PSP:r0000028:# 7: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(_ZN4nest17SimulationManager7update_Ev+0x197) [0x152a3ba20067]>
<PSP:r0000028:# 8: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(_ZN4nest17SimulationManager12call_update_Ev+0x5dc) [0x152a3ba2096c]>
<PSP:r0000028:# 9: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(_ZN4nest17SimulationManager3runERKNS_4TimeE+0x339) [0x152a3ba25129]>
<PSP:r0000028:#10: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(_ZN4nest3runERKd+0x9d) [0x152a3ba09d7d]>
<PSP:r0000028:#11: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(_ZN4nest8simulateERKd+0x11) [0x152a3ba09e51]>
<PSP:r0000028:#12: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libnest.so.3(_ZNK4nest10NestModule16SimulateFunction7executeEP14SLIInterpreter+0x36) [0x152a3b9c5836]>
<PSP:r0000028:#13: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/../../../nest/libsli.so.3(_ZN14SLIInterpreter8execute_Em+0x201) [0x152a3b041c21]>
<PSP:r0000028:#14: /p/software/jusuf/stages/2024/software/nest-simulator/3.6-gpsmpi-2023a/lib/python3.11/site-packages/nest/pynestkernel.so(+0x30d04) [0x152a3c1cfd04]>
<PSP:r0000028:#15: /p/software/jusuf/stages/2024/software/Python/3.11.3-GCCcore-12.3.0/lib/libpython3.11.so.1.0(_PyEval_EvalFrameDefault+0x41bc) [0x152a49c2ce9c]>
<PSP:r0000028:#16: /p/software/jusuf/stages/2024/software/Python/3.11.3-GCCcore-12.3.0/lib/libpython3.11.so.1.0(+0x1ce50a) [0x152a49c2550a]>
<PSP:r0000028:#17: /p/software/jusuf/stages/2024/software/Python/3.11.3-GCCcore-12.3.0/lib/libpython3.11.so.1.0(_PyEval_EvalFrameDefault+0x4f5a) [0x152a49c2dc3a]>
<PSP:r0000028:#18: /p/software/jusuf/stages/2024/software/Python/3.11.3-GCCcore-12.3.0/lib/libpython3.11.so.1.0(+0x1ce50a) [0x152a49c2550a]>
<PSP:r0000028:#19: /p/software/jusuf/stages/2024/software/Python/3.11.3-GCCcore-12.3.0/lib/libpython3.11.so.1.0(PyEval_EvalCode+0xa1) [0x152a49cad2e1]>
readFromPMIClient: lost connection to the PMI client
kvsprovider[23316]: releaseMySelf: wrong message type 3 (PSP_CD_CLIENTREFUSED)
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
readFromPMIClient: lost connection to the PMI client
srun: error: jsfc114: tasks 0-27,29-31: Terminated
srun: error: jsfc114: task 28: Exited with exit code 1
srun: Force Terminated StepId=659919.0
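To check which process actually crashed, the `<PSP:rNNNNNNN:...>` prefixes in the backtrace can be compared with the srun summary lines. The following is a minimal sketch, not part of NEST or Slurm: the helper name `crashed_ranks` is hypothetical, and it assumes that the number in the `<PSP:r...>` prefix is the MPI rank and that the stderr dump is available as a string.

```python
import re


def crashed_ranks(stderr_text):
    """Return (ranks seen in <PSP:r...> prefixes, tasks srun reports as exited non-zero)."""
    # Assumption: the digits after "<PSP:r" are the rank of the reporting process.
    psp = {int(m) for m in re.findall(r"<PSP:r(\d+):", stderr_text)}

    exited = set()
    pattern = r"srun: error: \S+: tasks? ([\d,\-]+): Exited with exit code [1-9]\d*"
    for spec in re.findall(pattern, stderr_text):
        # A task spec looks like "28" or "0-27,29-31"; expand each range.
        for part in spec.split(","):
            lo, _, hi = part.partition("-")
            exited.update(range(int(lo), int(hi or lo) + 1))
    return psp, exited


# Three lines copied from the dump above.
sample = """<PSP:r0000028:Backtrace after SIGSEGV (Invalid memory reference):>
srun: error: jsfc114: tasks 0-27,29-31: Terminated
srun: error: jsfc114: task 28: Exited with exit code 1
"""
print(crashed_ranks(sample))
```

For the dump above, both sets contain only 28, which suggests that a single process hit the segmentation fault and the remaining tasks were terminated by srun afterwards.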
Desktop/Environment (please complete the following information):
OS: Linux 5.4.0-204-generic x86_64; HPCs (NEMO, JUSUF)
This issue has been opened in reference to a mailing list post. Details about the original post can be found here
Using structural plasticity (SP) with MPI-based simulations leads to spontaneous crashes in NESTv3.6 onward
To Reproduce
Steps to reproduce the behavior:
Desktop/Environment (please complete the following information):
Python: 3.8.10, 3.9.7 :: Intel Corporation, 3.12.3