cu2cl runs with error on ubuntu 16.04 and cuda 10.2 #4

Open
gajendraks opened this issue Feb 20, 2020 · 9 comments
Labels
modern Clang: Issue likely related to old Clang dependency
modern CUDA: Issue likely related to old CUDA assumptions

Comments

@gajendraks

Hi
We are using ubuntu 16.04 and cuda 10.2 with GeForce GTX 730.
We are trying to run your tool.

~/team5/cuda_samp/matrixadd$ cu2cl-tool matrixadd.cu -- -DGPU_ON -I /usr/local/cuda-10.2/targets/x86_64-linux/include -I . -I /usr/lib/gcc/x86_64-linux-gnu/5/include -I /usr/local/include -I /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed -I /usr/include/x86_64-linux-gnu -I /usr/include/linux &> output.txt

Our limits.h is in /usr/include/linux, so we included that path.
The output file below contains the errors generated by the tool.
output.txt

Here is the matrixadd.cu we used.
matrix_add.txt (GitHub does not allow uploading .cu files)

NVCC - VERSION
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

Could you help us debug this?
Thank you

@psath changed the title from "cu2cl fails to compile on ubuntu 16.04 and cuda 10.2" to "cu2cl runs with error on ubuntu 16.04 and cuda 10.2" on Feb 20, 2020
@psath
Contributor

psath commented Feb 20, 2020

Changed the title: the tool is running, so it compiled successfully.

Please note the last CUDA release CU2CL was tested with was 7.5, yours is significantly newer and things have likely changed in their runtime implementation that conflict with CU2CL or Clang 3.4. You are in untested waters, but we're happy to help if we can.

Of the three errors in the attached output file, none are related to limits.h.
Your CUDA file does not use the GPU_ON macro anywhere, there is no reason to define it (it is included in the README only as an example).
The only one that might be related to CU2CL's function is the last, on line 20249.
(The first on L4382 is due to a missing type that is typically defined in the ctime header. If CUDA depends on it I am not sure why it's not including it.
The second on L7536 is a redefinition issue; the headers you are including are either not properly guarded or are somehow out of order. Rather than manually including all these search paths you should probably be using the --import-gcc-paths option that --help would have told you about.)
The third is genuinely something we haven't seen before. The execution configuration syntax (<<<grid, block>>>) is being mapped, during the Clang parsing/semantic analysis that runs before CU2CL, to what the syntax effectively used to do behind the scenes: a call to cudaConfigureCall. However, this function was deprecated in CUDA 7.0 and has likely been removed from the 10.2 API, causing the error you see here.

My theory is that the old Clang 3.4 may be mapping the syntax to this call transparently, despite it no longer being available, but I will need to dump an AST to test it. Our sponsorship has moved on to other projects, so I cannot give this priority attention, but I will try to get your code on some of our older CUDA installs this week or early next and try to diagnose further. Updating to a more modern Clang foundation is planned, but incomplete.

In the meantime, if you are able to install the headers from an older CUDA runtime (<=7.5) and fix the first two errors, you will likely have more success. All of these are occurring before CU2CL actually kicks in though, and the debug statements indicate CU2CL is running on the (incomplete) AST, so I am surprised it's not producing at least partial output.
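One way to check the cudaConfigureCall theory above is to search the CUDA include directory for the symbol. The following is a minimal sketch, not part of CU2CL: the helper name declares_symbol is hypothetical, and the include path is the one from the original command in this thread, so adjust it to your installation.

```shell
# Hypothetical helper: succeeds if any file under directory $1 mentions symbol $2.
# A missing or unreadable directory is treated the same as "not found".
declares_symbol() {
  grep -rqw -- "$2" "$1" 2>/dev/null
}

# Include directory taken from the original command line in this thread.
cuda_inc="/usr/local/cuda-10.2/targets/x86_64-linux/include"

if declares_symbol "$cuda_inc" cudaConfigureCall; then
  echo "cudaConfigureCall is still declared under $cuda_inc"
else
  echo "cudaConfigureCall not found under $cuda_inc"
fi
```

Running the same check against an older CUDA include directory, if one is installed, would show whether the declaration differs between releases.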

@psath added the "modern Clang" and "modern CUDA" labels on Feb 20, 2020
@gajendraks
Author

Thanks @psath for the reply.

  • Yes, GPU_ON is not required when compiling.
  • clock_t works fine when used in normal programs.
  • After adding -import-gcc-paths, I started getting errors that some header files (stddef.h) were not found. They exist under /usr/include/linux, but the tool is searching /usr/include. I think my glibc is not configured properly.

We are using Clang 3.8 and CUDA 10.2; we will try running the tool with Clang 3.4 and older CUDA headers (<=7.5).

Could you help us debug the translation (attached below)?
Here is the output of cu2cl on matrixadd:
matrixadd.zip

Thank you🙂

@psath
Contributor

psath commented Feb 21, 2020

The output.txt in that zip file still shows "GCC include directory import is disabled". Please provide the command line call you used and an updated log. It would likely be helpful to add the verbose -v argument to Clang to see the header search order.
Options to CU2CL (like --import-gcc-paths) need to go before the double-dash, and arguments to Clang (like -v and CC macros) need to go after, which is standard for libTooling.
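The ordering rule above can be sketched as follows. This is only an illustration: the variable names are made up for the example, the file name is the one from this thread, and the snippet prints the command line instead of running it.

```shell
# Source file from this thread; the variable names below are illustrative only.
src="matrixadd.cu"

# Options parsed by cu2cl-tool itself go BEFORE the double-dash.
tool_opts="--import-gcc-paths"

# Arguments forwarded to Clang (verbosity, -I paths, -D macros) go AFTER it.
clang_args="-v"

cmd="cu2cl-tool $src $tool_opts -- $clang_args"
echo "$cmd"
```

Run the printed line to invoke the tool. With -v after the double-dash, Clang should report its header search order in the log.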

We do not have the resources to provide pro bono translation services. Other than the missing launch, which should be resolved by using older CUDA headers, and mapping the two cudaEventCreate calls to cl_events, the translation looks largely complete. It should not take significant effort to get running.

@gajendraks
Author

Thanks @psath

I didn't update the output.txt in the archive attached above, but yes, I made a mistake by passing -import-gcc-paths after the double-dash.

Here is the command we used this time
cu2cl-tool matrixadd.cu -import-gcc-paths -- -v &> output.txt

Output: matrixadd.zip
This time I didn't get any fatal errors (about missing GCC header files). 🙂

No problem @psath, we will try running with older CUDA headers, map the two cudaEventCreate calls to cl_events, and run it.

If possible, could you explain the processing errors (CU2CL DEBUG) at the end of output.txt, and also how to dump the generated AST?

Thank you🙂

@psath
Contributor

psath commented Feb 21, 2020

Glad the GCC path import resolved the first two errors!

The CU2CL DEBUG flags honestly should be silenced, but I don't see any of particular concern in your output. They are related to propagating certain translations across the control flow graph (in particular cl_mems), i.e. if dev_a, b, and c had been declared elsewhere and passed in as function parameters, they would be mapped to cl_mems in the caller/callees, as appropriate. It's tricky to get right, especially across object file boundaries (separate ASTs), but supports compile-time type enforcement rather than relying on a bunch of explicit casts to cl_mem.

To give too much technical detail, I suspect the extra chatter for "Rejected propagations" you are seeing is because the original declarations of those device buffers share the same type (i.e. are one line of int*s separated by commas; I forget the Clang internal terminology for this multiple-declaration syntax), and the CFG traversal is checking all their references once for each variable. There is room for internal optimization on that front if/when CU2CL gets back under active development.

Clang has a standard mechanism to dump ASTs: https://clang.llvm.org/docs/IntroductionToTheClangAST.html. If you use different major revisions of Clang with the same CUDA headers, or different CUDA headers with the same Clang major revision, there are likely to be differences in the internal representation of the AST that matter to CU2CL's AST traversal and rewriting.
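As a minimal sketch of the dump mechanism described on that page, the snippet below writes a tiny C++ file and dumps its AST with the -Xclang -ast-dump -fsyntax-only flags. The file name ast_demo.cc and its contents are made up for the example. Dumping a .cu file would presumably also need the CUDA include paths used elsewhere in this thread; that part is an assumption and is not shown here.

```shell
# Write a tiny translation unit to inspect (hypothetical file name and contents).
cat > ast_demo.cc <<'EOF'
int add_one(int x) { return x + 1; }
EOF

# Dump the AST without generating code, if a clang binary is on PATH.
if command -v clang >/dev/null 2>&1; then
  clang -Xclang -ast-dump -fsyntax-only ast_demo.cc
else
  echo "clang not found on PATH; skipping AST dump"
fi
```

Comparing dumps of the same file produced by two Clang versions is one way to see the representation differences mentioned above.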

@gajendraks
Author

Hi @psath, sorry for the late reply.

In the meantime we set up an environment with CUDA 7.5 and Clang 3.4 on Ubuntu 14.04.

Command executed: cu2cl-tool cube_arr_ele.cu -import-gcc-paths -- -I /usr/local/cuda-7.5/targets/x86_64-linux/include -v
The output is in matrixadd.zip

The translation is now more correct than before, as the work-group (local/global) variables have been assigned.

I am including that path because I was getting an error on the "cuda_runtime.h" include, even with -import-gcc-paths.

Sorry, I understood little of the first paragraph of your last reply. Thanks for the AST dump pointer.
I will start looking into your internal implementation and Clang, so that I am in a better position to proceed. It would be great if you could point me to resources on this.

Should I start by learning a recent Clang version, or 3.4 (which would help in understanding cu2cl)?

If I have any questions about Clang or the cu2cl implementation, I will post them here.
Thank you 🙂

@psath
Contributor

psath commented Feb 27, 2020

The attached translation has generated the kernel launch, as it should, so the old CUDA installation helped. You should be able to fix up the unimplemented mappings (cudaEvent* functions) and compile it in OpenCL. There are no errors generated, so the translation is as complete as it is going to be before you do the manual finishing work.

I don't understand your end goal. Unless you are planning to make changes to CU2CL (you are welcome to fork under the existing license and submit pull requests), there is no need to "learn clang", which is a massive API.

This issue should be left specifically to the errors you are getting with modern (10.2+) CUDA headers. The workaround of installing CUDA 7.5 has been shown to work. I will leave the issue open but commentary should be specific to the issue and workaround, not wider discussion around the construction of the tool. If/when CU2CL is updated to handle modern headers the issue will be marked as "resolved by <commit SHA>" and closed and you may open a new one if problems remain.

@psath closed this as completed on Feb 27, 2020
@psath reopened this on Feb 27, 2020
@gajendraks
Author

Thank you @psath.
My end goal is to make changes to the cu2cl tool.

One more question, about running the tool on the CUDA 7.5 sample asyncAPI.
Folder: asyncAPI.zip
I changed <helper_cuda.h> to "helper_cuda.h" and <helper_functions.h> to "helper_functions.h".

/usr/local/cuda-7.5/samples/common/inc/ contains helper_cuda.h and helper_functions.h.

Command I ran
cu2cl-tool asyncAPI.cu -import-gcc-paths -- -I /usr/local/cuda-7.5/samples/common/inc/ -I /usr/local/cuda-7.5/targets/x86_64-linux/include/ -DGPU_ON -v >& output.txt

I see the following in output.txt:
[Screenshot 2020-03-23 at 5 06 22 PM]

I can understand the unsupported CUDA calls reported from asyncAPI.cu, but why is it also listing unsupported CUDA calls from helper_cuda.h?

Am I running the tool the wrong way again?

Could you please check the attached zip file (asyncAPI) above and tell me what is going on, and whether it is translating properly?
Thank you 🙂

@psath
Contributor

psath commented Mar 24, 2020

Those are unimplemented function mappings, which are unrelated to the CUDA 10.2 compatibility this GitHub issue is restricted to. Do not derail it or it will be closed. At the time CU2CL was originally created those were not critical functions to support for our applications.

You are welcome to open a separate issue for any unsupported functions you need. If your end goal is to add to CU2CL, a good place to start would be implementing them and submitting a pull request.
