[OMPD] The LLVM OMPD ompd_get_task_function() entry_point address return value is 0 for the outermost implicit task of a parallel region #69

Open
jdelsign opened this issue Mar 29, 2019 · 0 comments

Using the attached test program compiled with CLANG/LLVM trunk 9 as of a week or two ago, and using OpenMP runtime and OMPD libraries built from the "ompd-devices" branch, I ran the following test under TotalView (which has support for LLVM's OMPD library):

  • Set environment variable OMP_OMPD=on
  • Set environment variable OMP_NUM_THREADS=1
  • Set a breakpoint inside the parallel region in "case 0" in function "f" at line 109. (Note that this line actually falls inside the outlined function named "omp_outlined._debug__.10".)
  • Run to the breakpoint.
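
For readers without the attachment, the shape of the code being debugged can be sketched roughly as follows. This is a hypothetical stand-in, not the attached tx_omp_parallel_nested.c: the name f_sketch and the counter are assumptions made for illustration. It only shows a function with an outermost parallel region and one nested region, whose bodies the compiler moves into outlined functions.

```c
/* Hypothetical stand-in for the attached test's "f"; not the original code.
   Returns how many times the innermost region body ran, summed over threads. */
int f_sketch(void)
{
    int inner_visits = 0;

    /* Outermost parallel region inside the function. The compiler outlines
       this body, so a breakpoint here lands in an outlined frame. */
    #pragma omp parallel reduction(+ : inner_visits)
    {
        /* Nested parallel region, outlined separately. */
        #pragma omp parallel reduction(+ : inner_visits)
        {
            inner_visits += 1;
        }
    }
    return inner_visits;
}
```

Built with `clang -g -fopenmp` and run with OMP_NUM_THREADS=1, a call to f_sketch() returns 1, since each region then has a single thread.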

Here is the (unfiltered) stack trace for the one and only thread in the program at this point:

d1.<> w
>  0 omp_outlined._debug__.10 PC=0x004015b8, FP=0x7fffffffd540 [/home/jdelsign/ms2-demo/tx_omp_parallel_nested.c#109]
   1 omp_outlined..12 PC=0x00401628, FP=0x7fffffffd570 [/home/jdelsign/ms2-demo/tx_omp_parallel_nested.c#108]
   2 __kmp_invoke_microtask PC=0x7ffff7b9a1e1, FP=0x7fffffffd610 [/.../libomp.so]
   3 __kmp_fork_call  PC=0x7ffff7b01b84, FP=0x7fffffffdac0 [/.../kmp_runtime.cpp#1909]
   4 __kmpc_fork_call PC=0x7ffff7aec09a, FP=0x7fffffffdcb0 [/.../kmp_csupport.cpp#299]
   5 f                PC=0x004012f1, FP=0x7fffffffdd20 [/home/jdelsign/ms2-demo/tx_omp_parallel_nested.c#107]
   6 main             PC=0x00401c35, FP=0x7fffffffdd90 [/home/jdelsign/ms2-demo/tx_omp_parallel_nested.c#224]
   7 __libc_start_main PC=0x7ffff72bf443, FP=0x7fffffffde50 [/lib64/libc.so.6]
   8 _start           PC=0x00400ac4, FP=0x7fffffffde58 [/amd/home/jdelsign/ms2-demo/tx_omp_parallel_nested]
d1.<>

If I use TotalView to call into the LLVM OMPD library to unwind the parallel regions, and get the task handles and task functions for each one, I see the following TotalView debug output for the OMPD DLL calls it makes.

  • Get the current parallel handle (0x3b225d0) for the thread (0x3c6a7e0):
OMPD DLL: ompd_get_curr_parallel_handle(thread_handle=0x3c6a7e0)->rc_ok: parallel_handle=0x3b225d0
  • Get the task handle (0x3d641a0) for the implicit task associated with the parallel region (0x3b225d0):
OMPD DLL: ompd_get_task_in_parallel(parallel_handle=0x3b225d0, thread_num=0)->rc_ok: task_handle=0x3d641a0
  • Get the task function for the task handle (0x3d641a0):
OMPD DLL: ompd_get_task_function(task_handle=0x3d641a0)->rc_ok: entry_point={segment=0,address=0x0}

NOTICE: The entry_point address returned is 0! Why? I would have expected the address of "omp_outlined..12", which is 0x00401600. In fact, the compiler passes "omp_outlined..12" as the "microtask" argument to "__kmpc_fork_call", which in turn passes it to "__kmp_fork_call".

  • Release the task handle (0x3d641a0):
OMPD DLL: ompd_rel_task_handle(task_handle=0x3d641a0)->rc_ok
  • Get the enclosing parallel handle (0x3d641a0) and get its task handle and function:
OMPD DLL: ompd_get_enclosing_parallel_handle(parallel_handle=0x3b225d0)->rc_ok: enclosing_parallel_handle=0x3d641a0
OMPD DLL: ompd_get_task_in_parallel(parallel_handle=0x3d641a0, thread_num=0)->rc_ok: task_handle=0x3d960f0
OMPD DLL: ompd_get_task_function(task_handle=0x3d960f0)->rc_ok: entry_point={segment=0,address=0x0}
OMPD DLL: ompd_rel_task_handle(task_handle=0x3d960f0)->rc_ok

NOTICE: Again, 0 is returned. This one seems valid because I suspect that it's related to the "initial parallel region", and that does not correspond to any user code.

  • Get the next enclosing handle, which I assume fails because we are at the root (though the OMP 5.0 spec doesn't say what's supposed to happen in that case):
OMPD DLL: ompd_get_enclosing_parallel_handle(parallel_handle=0x3d641a0)->rc_unsupported
OMPD DLL: ompd_rel_parallel_handle(parallel_handle=0x3d641a0)->rc_ok
OMPD DLL: ompd_rel_parallel_handle(parallel_handle=0x3b225d0)->rc_ok
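
The unwinding sequence in the log above can be summarized as a loop. The sketch below is only a model of that loop: walk_ops, walk_regions, and the mock_* helpers are hypothetical names introduced for illustration, and the callbacks use plain void pointers rather than the real OMPD prototypes from omp-tools.h. It also releases each parallel handle as it goes, whereas the log releases them at the end.

```c
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL callback table: each member mirrors one OMPD call from the
   log. These are not the real OMPD prototypes (see omp-tools.h).
   Every callback returns 0 on success, standing in for rc_ok. */
typedef struct {
    int (*task_in_parallel)(void *parallel, int thread_num, void **task);
    int (*task_function)(void *task, uint64_t *entry_address);
    int (*rel_task)(void *task);
    int (*enclosing_parallel)(void *parallel, void **enclosing);
    int (*rel_parallel)(void *parallel);
} walk_ops;

/* Walk outward from the innermost parallel region. For each region, record
   the entry address of thread 0's implicit task (0 if none is reported).
   Stores at most max_entries addresses; returns the number of regions seen. */
size_t walk_regions(const walk_ops *ops, void *innermost,
                    uint64_t *entries, size_t max_entries)
{
    size_t visited = 0;
    void *parallel = innermost;

    while (parallel != NULL) {
        void *task = NULL;
        uint64_t address = 0;

        if (ops->task_in_parallel(parallel, 0, &task) == 0) {
            if (ops->task_function(task, &address) != 0)
                address = 0;
            ops->rel_task(task);
        }
        if (visited < max_entries)
            entries[visited] = address;
        visited++;

        /* A non-zero return code is treated as "no enclosing region". */
        void *enclosing = NULL;
        int rc = ops->enclosing_parallel(parallel, &enclosing);
        ops->rel_parallel(parallel);
        parallel = (rc == 0) ? enclosing : NULL;
    }
    return visited;
}

/* Mock backend for trying the loop without a debugger. In this mock a
   region doubles as its own implicit task handle. */
typedef struct mock_region {
    uint64_t entry;              /* entry address reported for the task */
    struct mock_region *parent;  /* enclosing region, NULL at the root */
} mock_region;

static int mock_task_in_parallel(void *parallel, int thread_num, void **task)
{
    (void)thread_num;
    *task = parallel;
    return 0;
}

static int mock_task_function(void *task, uint64_t *entry_address)
{
    *entry_address = ((mock_region *)task)->entry;
    return 0;
}

static int mock_release(void *handle)
{
    (void)handle;
    return 0;
}

static int mock_enclosing_parallel(void *parallel, void **enclosing)
{
    mock_region *region = parallel;
    if (region->parent == NULL)
        return 1;                /* stands in for the non-ok code at the root */
    *enclosing = region->parent;
    return 0;
}

const walk_ops mock_ops = {
    mock_task_in_parallel, mock_task_function, mock_release,
    mock_enclosing_parallel, mock_release
};
```

Fed two mock regions that both report address 0, walk_regions visits two regions and records 0 for each, which is the pattern the log shows. Under the expectation stated above, the first recorded address would instead be that of "omp_outlined..12".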

If I allow the program to run into more deeply nested parallel regions, I do see valid results returned from ompd_get_task_function() for the innermost parallel regions, but the DLL continues to return 0 for the outermost parallel region inside of "f".

tx_omp_parallel_nested.c.txt
