Make discovery mechanism for cuda/_include directory compatible with pip install --editable #2846
Conversation
🟩 CI finished in 15m 57s: Pass: 100%/1 | Total: 15m 57s | Avg: 15m 57s | Max: 15m 57s

Modifications in project or dependencies? +/- python (CCCL Infrastructure, libcu++, CUB, Thrust, CUDA Experimental, CCCL C Parallel Library, and Catch2Helper unchanged)

🏃 Runner counts (total jobs: 1)

# | Runner
---|---
1 | linux-amd64-gpu-v100-latest-1
-include_path = importlib.resources.files('cuda').joinpath('_include')
+include_path = importlib.resources.files(
+    'cuda.parallel').parent.joinpath('_include')
Q: where would this new path be located? site-packages/cuda/parallel/_include or site-packages/cuda/_include? Asking because the plan (#2281) was to share the headers between cuda.parallel and cuda.cooperative, and eventually ship a standalone CCCL wheel that's header-only (but not using, e.g., https://pypi.org/project/nvidia-cuda-cccl-cu12/).
> Q: where would this new path be located?

It's not a new path; only the discovery mechanism changes in this PR.
> Asking because the plan (#2281) was to share the headers between cuda.parallel and cuda.cooperative
Yes, in the regular wheel (without --editable) the directory is (e.g. in my CCCL Dev Container and a devenv virtual environment):

/home/coder/cccl/python/devenv/lib/python3.12/site-packages/cuda/_include
That is shared between cuda.parallel and cuda.cooperative.
But this intermediate artifact from running pip3 install .[test] also exists:

/home/coder/cccl/python/cuda_parallel/cuda/_include/
When using --editable, pip does not make a copy of _include in site-packages/cuda/. But site-packages/cuda does exist, therefore

importlib.resources.files('cuda').joinpath('_include')

produces site-packages/cuda/_include, even though that directory does not exist.
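This failure mode is easy to reproduce without CUDA installed at all; a minimal sketch, using the stdlib `json` package as a stand-in for the `cuda` namespace package, shows that `joinpath()` composes the path lexically and never checks that the target exists:

```python
import importlib.resources

# files() anchors at the package that was actually imported;
# joinpath() then appends a name WITHOUT checking that it exists.
candidate = importlib.resources.files("json").joinpath("_include")

print(candidate)           # some path ending in json/_include
print(candidate.is_dir())  # False: no such directory was ever created
```

So the old code happily returned site-packages/cuda/_include whether or not anything had been installed there.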
When using

importlib.resources.files('cuda.parallel').parent.joinpath('_include')

the Python import machinery is forced to resolve cuda.parallel, which does not exist in site-packages/cuda if --editable is used. The --editable feature only drops some "dynamic import forwarding code" (my hand-wavy description) into site-packages/cuda, which finds cuda.parallel in the source directory. Conveniently, _include happens to be in the right place there, too, so the puzzle pieces nicely fall into place.
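The `.parent` trick can likewise be sketched without CUDA installed; here the stdlib subpackage `email.mime` stands in for `cuda.parallel`, and `_include` is just the directory name this PR looks for:

```python
import importlib.resources

# Resolving a real subpackage forces the import machinery to locate it
# where it actually lives (the source tree, for --editable installs);
# .parent then yields the directory containing the parent package.
pkg_dir = importlib.resources.files("email.mime").parent
print(pkg_dir.name)  # "email": the directory that contains the subpackage

# Headers are then found relative to that real location:
headers = pkg_dir.joinpath("_include")  # does not exist for email, of course
```

The key difference from the old code is that the path is anchored at a package that was genuinely imported, not at a directory that merely happens to exist in site-packages.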
It took me almost a couple hours yesterday to figure this out btw. 😅
Caveat: I've not tried to pip install cuda.cooperative into the same devenv, but I don't think this PR changes anything if --editable is not used.
@@ -112,6 +112,7 @@ def build_extension(self, ext):
     extras_require={
         "test": [
             "pytest",
+            "pytest-xdist",
Note: be careful if we have large-size tests, because GPUs can easily go OOM if tests are running in parallel. In cuQuantum we had this enabled, but at QA time we found it often caused false positives on small GPUs like the T4.
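One way to hedge against that (an assumption on my part, not something this PR does) is pytest-xdist's `xdist_group` mark: together with `--dist loadgroup`, all tests sharing a group name are pinned to one worker, so the memory-hungry ones never run concurrently while the rest of the suite still fans out:

```python
import pytest

# With `pytest -n 16 --dist loadgroup`, every test in the "large_gpu"
# group is scheduled onto the same worker, serializing the big device
# allocations; the hypothetical test bodies below are placeholders.
@pytest.mark.xdist_group(name="large_gpu")
def test_large_reduction():
    ...

@pytest.mark.xdist_group(name="large_gpu")
def test_large_scan():
    ...
```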
Thanks for the warning. In automated testing (e.g. GitHub Actions) I'd add the -n option only when we really need it. In interactive testing it makes a big difference; I posted the timings in the PR description. For cuda_cooperative, pytest -n 16 finishes after 27s, while it takes 322s with only one worker.
🟩 CI finished in 15m 12s: Pass: 100%/1 | Total: 15m 12s | Avg: 15m 12s | Max: 15m 12s

Modifications in project or dependencies? +/- python (CCCL Infrastructure, libcu++, CUB, Thrust, CUDA Experimental, CCCL C Parallel Library, and Catch2Helper unchanged)

🏃 Runner counts (total jobs: 1)

# | Runner
---|---
1 | linux-amd64-gpu-v100-latest-1
…h `pip install --editable` (NVIDIA#2846)

* Make discovery mechanism for cuda/_include directory compatible with `pip install --editable`
* Add pytest-xdist to test requirements.
* Apply the same `.parent` trick to cuda_cooperative, as suggested by @gevtushenko
Description

Also add pytest-xdist (pytest plugin for distributed testing) to test requirements. This speeds up iterative development:

cuda_cooperative:

* pytest default (one worker): 322.17s
* pytest -n 16: 27.17s

cuda_parallel:

* pytest default (one worker): 33.88s
* pytest -n 16: 17.00s

pip install --editable documentation:

Tested interactively with:

Also resolving this warning when testing cuda_cooperative: