Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try new build process - Do not merge! #38

Closed
wants to merge 8 commits into from

Conversation

sameeul
Copy link
Contributor

@sameeul sameeul commented Feb 8, 2024

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-webservices
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.

Here's what I've got...

For recipe:

  • Failed to even lint the recipe, probably because of a conda-smithy bug 😢. This likely indicates a problem in your meta.yaml, though. To get a traceback to help figure out what's going on, install conda-smithy and run conda smithy recipe-lint . from the recipe directory.

@sameeul
Copy link
Contributor Author

sameeul commented Feb 8, 2024

@conda-forge-admin, please rerender

Copy link
Contributor

github-actions bot commented Feb 8, 2024

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you but ran into some issues. Please check the output logs of the latest rerendering GitHub actions workflow run for errors. You can also ping conda-forge/core for further assistance or try re-rendering locally.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/nyxus-feedstock/actions/runs/7833572606.

@conda-forge-webservices
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@sameeul
Copy link
Contributor Author

sameeul commented Feb 8, 2024

@conda-forge-admin, please rerender

@sameeul
Copy link
Contributor Author

sameeul commented Feb 8, 2024

@conda-forge-admin, please rerender

@@ -6,12 +6,12 @@ package:
version: {{ version }}

source:
url: https://github.com/PolusAI/{{ name }}/archive/refs/tags/{{ version }}.tar.gz
sha256: f3041bb73ccb6390df5c2cf6d8ac7d4f5d91b75a3f31dd7a0664763d964c5138
url: https://github.com/sameeul/nyxus/archive/refs/heads/conda_gpu_v3.zip
Copy link
Member

@jaimergp jaimergp Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not an official release? 🤔 A fork with patches? If this is temporary, ok, but this should become a list of patches underneath, committed to the feedstock, before merge.

Suggested change
url: https://github.com/sameeul/nyxus/archive/refs/heads/conda_gpu_v3.zip
url: https://github.com/sameeul/nyxus/archive/refs/heads/conda_gpu_v3.zip
patches:
- 0001-this-thing.patch
- 0002-that-other-thing.patch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaimergp : this is just to try some stuff before release. We will switch to the original repo when we do the release. We have made some changes to CUDA detection and want to make sure those work before doing a release in the upstream repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, since you are here, any idea why CUDA 12 CIs are missing cuda compiler, at least why CMake fails to find it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. If you are only trying some things out, it would be cool to trim the CI matrix a bit. This feedstock has 144 entries. You can add a couple of skip: true lines in meta.yaml and rerender. For example, maybe there's no need to try all the Python versions and you can debug the CUDA stuff with Python 3.9 only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are also discussing dropping CUDA 11.2 in the nearish future ( conda-forge/conda-forge-pinning-feedstock#5339 ). CUDA 11.8 supports the same hardware, uses the same package structure in conda-forge, and has seen wide adoption. So may be a better CUDA 11 choice going forward

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. That make sense. I don't think in the meta.yaml I have any specific CUDA version choice. The feedstock renderer generates the CI matrix for me. I am just confused that why CUDA 12 runs are not finding the compilers from CMake. When I look at the packages that get installed in CUDA 11.2/11.8, I see a compiler package getting installed. But I don't see that in CUDA 12 runs. So, I am trying to understand what has changed and what I need to do in my recipe to get the compilers detected by CMake in CUDA 12 runs.

Copy link
Member

@jakirkham jakirkham Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok. Added an example on how to skip a CUDA version below

#38 (comment)

Edit: Included an example of how we could limit this to CUDA 12 (as that seems to be of interest in this PR)

@sameeul sameeul changed the title Try new build process Try new build process - Do not merge! Feb 8, 2024
recipe/meta.yaml Outdated
patches:
- 0000-win_2022.patch
build:
number: 0
number: 1
skip: true # [py<36]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how we would skip a CUDA version (like 11.2)

Would also want to re-render after making this change to update the CI matrix accordingly

Suggested change
skip: true # [py<36]
skip: true # [py<36 or cuda_compiler_version == "11.2"]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if we want to only test CUDA 12 for a bit, would try something like this

Again would re-render after making a change like this to update CI

Suggested change
skip: true # [py<36]
skip: true # [py<36 or cuda_compiler_version != "12.0"]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Also, do I need anything else to get the CUDA 12 compiler detected. Please see the following output from CUDA 11.2 and CUDA 12 run. CMake fails to find the compiler in CUDA 12 case. Do I need to install any other package?

11.2

2024-02-08T18:04:44.5335279Z   -- Looking for a CUDA compiler
2024-02-08T18:04:46.7857213Z   -- Looking for a CUDA compiler - /home/conda/feedstock_root/build_artifacts/nyxus_1707415201231/_build_env/bin/nvcc
2024-02-08T18:04:46.8232304Z   -- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found version "11.2.152")

and 12.0

2024-02-08T18:15:45.8472644Z   -- Looking for a CUDA compiler
2024-02-08T18:15:47.1839760Z   -- Looking for a CUDA compiler - NOTFOUND
2024-02-08T18:15:47.2097249Z   -- Found CUDAToolkit: /home/conda/feedstock_root/build_artifacts/nyxus_1707415928413/_build_env/targets/x86_64-linux/include (found version "12.0.76")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please share a bit more about the changes that occurred in the build system?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the changes we made in CMakeLists.txt.

Previously, by default, USEGPU was set to OFF. CMake would look for CUDA compiler (by using check_language(CUDA) and CUDA toolkit (by using find_package(CUDAToolkit)). If both were true, it would set USEGPU to ON, which will ensure all the *cu files will be compiled. If the user passed -DUSEGPU=ON in CMAKE_ARGS and the aforementioned checks failed, it would just emit a warning saying it could not find CUDA , set USEGPU to OFF and move on.

We have modified the behavior the following way now: USEGPU is set to OFF be default. Now, CMake will look for CUDA only if the user sets USEGPU to ON. It will check for CUDA compiler and CUDATooklit (same way as mentioned earlier) and if any of these check fails, it will issue a FATAL message saying CUDA could not be found.

On the feedstock side, we modified build.sh and bld.bat to look for nvcc command (in this PR as test) and based on that pass -DUSEGPU=ON in the CMAKE_ARGS. So, for CUDA 11.2/11.8/12.0, the build.sh correctly identifies the presence of nvcc and adds -DUSEGPU=ON to CMAKE_ARGS. However, we see that for CUDA 12 builds, the check_language(CUDA) call fails to find the nvcc compiler. However, in 11.2 and 11.8, it works. So, I am trying to understand whether I need to add any other package/meta-package in the host section to make sure CMake finds CUDA.

It is also working as expected for Windows builds.

Also, when we first added CUDA 12 support, I verified from the logs that things were compiled correctly. However, I did not do that in our last release (~Dec 23). So, I can not say when this regression occurred, since the previous behavior was to emit a warning and keep moving.

Hope this answers your question.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like I had to add - cuda-version {{ cuda_compiler_version }} # [cuda_compiler_version != "None"] to the host section to make CMake find the CUDA compiler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok. That makes more sense

Yeah cuda-version needs to be constrained in requirements/host to align it with requirements/build so they have the same CUDA compiler version

We have had some discussion about adding functionality to conda & conda-build to automate this

@sameeul
Copy link
Contributor Author

sameeul commented Feb 9, 2024

@conda-forge-admin, please rerender

@sameeul sameeul closed this Feb 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants