cu2cl runs with error on ubuntu 16.04 and cuda 10.2 #4
Changed title: the tool is running, therefore it compiled successfully. Please note the last CUDA release CU2CL was tested with was 7.5. Yours is significantly newer, and things have likely changed in the runtime implementation that conflict with CU2CL or Clang 3.4. You are in untested waters, but we're happy to help if we can. Of the three errors in the attached output file, none are related to limits.h. My theory is that the old Clang 3.4 may be mapping the syntax to this call transparently, despite it no longer being available, but I will need to dump an AST to test it. Our sponsorship has moved on to other projects, so I cannot give this priority attention, but I will try to get your code on some of our older CUDA installs this week or early next and try to diagnose further. Updating to a more modern Clang foundation is planned, but incomplete. In the meantime, if you are able to install the headers from an older CUDA runtime (<=7.5) and fix the first two errors, you will likely have more success. All of these errors occur before CU2CL actually kicks in, though, and the debug statements indicate CU2CL is running on the (incomplete) AST, so I am surprised it's not producing at least partial output.
Thanks @psath for the reply.
We are using Clang 3.8 and CUDA 10.2; we will try running the tool with Clang 3.4 and CUDA headers (<=7.5). Could you help us debug the translation (file attached below)? Thank you🙂
The output.txt in that zip file still shows "GCC include directory import is disabled". Please provide the command line call you used and an updated log. It would likely be helpful to add the verbose -v argument to Clang to see the header search order. We do not have the resources to provide pro bono translation services. Other than the missing launch which should be resolved by using older CUDA headers, and mapping the two cudaCreateEvents to cl_events, the translation looks largely complete. It should not take significant effort to get running.
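For readers following along, the header search order mentioned above can be inspected with Clang's `-v` flag. The sketch below wraps that in a small helper. `show_search_order` is a hypothetical name (not part of CU2CL or Clang), and it assumes a `clang` binary on PATH. The example `-I` path in the usage comment is only illustrative.

```shell
# Hypothetical helper: print the include search list a Clang binary would use.
# Usage: show_search_order [compiler] [extra flags such as -I <dir>]
show_search_order() {
    if [ "$#" -gt 0 ]; then
        compiler=$1
        shift
    else
        compiler=clang
    fi

    if ! command -v "$compiler" >/dev/null 2>&1; then
        echo "$compiler not found on PATH" >&2
        return 1
    fi

    # -v writes the search list to stderr; -E on empty stdin avoids compiling anything.
    "$compiler" -E -x c++ -v "$@" - </dev/null 2>&1 |
        sed -n '/search starts here:/,/End of search list\./p'
}

# Example (the CUDA include path is an assumption; substitute your own):
#   show_search_order clang -I /usr/local/cuda-7.5/include
```

Directories are listed in the order they are searched, so a system header found in an unexpected directory shows up as an earlier entry than the intended one.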
Thanks @psath. I didn't update the output.txt in the folder attached above, but yes, I made a mistake by giving
Here is the command we used this time:
Output: matrixadd.zip
No problem @psath, we will try running with the older CUDA headers and mapping the two cudaCreateEvents to cl_events. If possible
Thank you🙂
Glad the GCC path import resolved the first two errors! The CU2CL DEBUG flags honestly should be silenced, but I don't see any of particular concern in your output. They are related to propagating certain translations across the control flow graph (in particular cl_mems), i.e. if dev_a, b, and c had been declared elsewhere and passed in as function parameters, they would be mapped to cl_mems in the caller/callees, as appropriate. It's tricky to get right, especially across object file boundaries (separate ASTs), but supports compile-time type enforcement rather than relying on a bunch of explicit casts to cl_mem. To give too much technical detail, I suspect the extra chatter for "Rejected propagations" you are seeing is because the original declarations of those device buffers share the same type (i.e. are one line of int*s separated by commas; I forget the Clang internal terminology for this multiple-declaration syntax), and the CFG traversal is checking all their references once for each variable... room for internal optimization on that front if/when CU2CL gets back under active development. Clang has a standard mechanism to dump ASTs: https://clang.llvm.org/docs/IntroductionToTheClangAST.html. If you use different major revisions of Clang with the same CUDA headers, or different CUDA headers with the same Clang major revision, there are likely to be differences in the internal representation of the AST that matter to CU2CL's AST traversal and rewriting.
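As a rough illustration of the AST dump mechanism linked above, the sketch below writes a tiny plain-C stand-in for the comma-separated `int*` declaration and dumps its AST. The file name, the function name `host_setup`, and the use of plain C instead of the real matrixadd.cu are all assumptions made to keep the example self-contained. The output format differs between Clang versions.

```shell
# Create a minimal stand-in source file (hypothetical; not the original matrixadd.cu).
src="$(mktemp -d)/decls.c"
cat > "$src" <<'EOF'
void host_setup(void) {
    int *dev_a, *dev_b, *dev_c;  /* three pointers declared on one line */
    (void)dev_a; (void)dev_b; (void)dev_c;
}
EOF

if command -v clang >/dev/null 2>&1; then
    # -Xclang -ast-dump prints the AST; -fsyntax-only skips code generation.
    # Keep only the declaration nodes to make the grouping visible.
    clang -Xclang -ast-dump -fsyntax-only "$src" | grep -E 'DeclStmt|VarDecl' || true
else
    echo "clang not found on PATH; skipping AST dump" >&2
fi
```

With a reasonably recent Clang, the three variables should appear as separate declaration nodes under a single declaration statement, which would be consistent with the traversal visiting the same line once per variable.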
Hi @psath, sorry for the late reply. In the meantime we set up an environment with CUDA 7.5 and Clang 3.4 on Ubuntu 14.04. Command executed: Now the translation is more correct than before, as the workgroup (local/global) variables have been assigned. Here, I am including that path as I was getting include Sorry, I understood little of the first paragraph in your last reply. And thanks for the AST dump. Should I start learning a recent Clang version or 3.4 (which would help in better understanding CU2CL)? If I have any questions about Clang or the CU2CL implementation, I will post them here.
The attached translation has generated the kernel launch, as it should, so the old CUDA installation helped. You should be able to fix up the unimplemented mappings (cudaEvent* functions) and compile it in OpenCL. There are no errors generated, so the translation is as complete as it is going to be before you do the manual finishing work. I don't understand your end goal. Unless you are planning to make changes to CU2CL (you are welcome to fork under the existing license and submit pull requests), there is no need to "learn clang", which is a massive API. This issue should be left specifically to the errors you are getting with modern (10.2+) CUDA headers. The workaround of installing CUDA 7.5 has been shown to work. I will leave the issue open, but commentary should be specific to the issue and workaround, not wider discussion around the construction of the tool. If/when CU2CL is updated to handle modern headers, the issue will be marked as "resolved by <commit SHA>" and closed, and you may open a new one if problems remain.
Thank you @psath. One more question.
Command I ran: From the above I can see unsupported CUDA calls from Am I running the tool the wrong way again? Could you please check the zip file attached above (asyncAPI) and tell me what's going on, and whether it is translating properly?
Those are unimplemented function mappings, which are unrelated to the CUDA 10.2 compatibility this GitHub issue is restricted to. Do not derail it or it will be closed. At the time CU2CL was originally created, those were not critical functions to support for our applications. You are welcome to open a separate issue for any unsupported functions you need. If your end goal is to add to CU2CL, a good place to start would be implementing them and submitting a pull request.
Hi
We are using ubuntu 16.04 and cuda 10.2 with GeForce GTX 730.
We are trying to run your tool.
~/team5/cuda_samp/matrixadd$ cu2cl-tool matrixadd.cu -- -DGPU_ON -I /usr/local/cuda-10.2/targets/x86_64-linux/include -I . -I /usr/lib/gcc/x86_64-linux-gnu/5/include -I /usr/local/include -I /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed -I /usr/include/x86_64-linux-gnu -I /usr/include/linux &> output.txt
As our limits.h was in /usr/include/linux, we included that path.
The output file below has the errors generated by the tool.
output.txt
Here is the matrixadd.cu we used.
matrix_add.txt (as GitHub didn't allow uploading a .cu file)
NVCC - VERSION
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
Could you help us debug this?
Thank you