[CI] Various CI fixes #11196
* Fix dmlc#10752
* [CI] Replace Mambaforge -> Miniforge3
* Fix formatting
* Fix tests with the latest scikit-learn.
* dask.
* Remove scikit-learn pin
---------
Co-authored-by: Philip Hyunsu Cho <[email protected]>
Seeing this error on CI:
/home/runner/work/xgboost/xgboost/dmlc-core/include/dmlc/omp.h:11:10: error: 'omp.h' file not found with <angled> include; use "quotes" instead
   11 | #include <omp.h>
      |          ^~~~~~~
      |          "omp.h"
It does appear to find OpenMP earlier in the log:
So is this just a matter of using
Let me backport #10987 and see if that fixes the error.
Thanks Hyunsu! 🙏
Also there were some interesting mypy errors on CI:
xgboost/dask/__init__.py:1651: error: Function does not return a value (it only ever returns None) [func-returns-value]
xgboost/dask/__init__.py:1661: error: Argument "data" to "predict" has incompatible type "None"; expected "Union[DaskDMatrix, Union[Array, DataFrame]]" [arg-type]
xgboost/dask/__init__.py:1692: error: Function does not return a value (it only ever returns None) [func-returns-value]
xgboost/dask/__init__.py:1701: error: Argument "data" to "predict" has incompatible type "None"; expected "Union[DaskDMatrix, Union[Array, DataFrame]]" [arg-type]
Found 4 errors in 1 file (checked 40 source files)
...
/home/runner/work/xgboost/xgboost/tests/test_distributed/test_gpu_with_dask/test_gpu_with_dask.py:656: error: Function does not return a value (it only ever returns None) [func-returns-value]
/home/runner/work/xgboost/xgboost/tests/test_distributed/test_gpu_with_dask/test_gpu_with_dask.py:661: error: Argument 3 to "predict" has incompatible type "None"; expected "Union[DaskDMatrix, Union[Array, DataFrame]]" [arg-type]
Found 2 errors in 1 file (checked 1 source file)
mypy 1.11.2 (compiled: yes)
AFAICT the relevant functions return
Also worth noting the test it is referencing uses
xgboost/tests/test_distributed/test_gpu_with_dask/test_gpu_with_dask.py
Lines 654 to 661 in a406528
So maybe it is just confused about
We can ignore errors from MyPy, since they are likely due to changes in the latest MyPy.
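For anyone reading along, here is a minimal sketch of the pattern that produces this pair of mypy errors. It is not the code from `xgboost/dask/__init__.py`; `train_in_place` and this `predict` are hypothetical stand-ins, and the exact wording of the message can differ between mypy versions.

```python
from typing import List


def train_in_place(weights: List[float]) -> None:
    """Hypothetical helper: mutates its argument and returns nothing."""
    for i in range(len(weights)):
        weights[i] *= 0.5


def predict(data: List[float]) -> float:
    """Hypothetical consumer that expects a real list, never None."""
    return sum(data)


weights = [1.0, 2.0]

# mypy reports [func-returns-value] here: the callee is annotated `-> None`,
# so using its result as a value is almost certainly a mistake.
result = train_in_place(weights)

# Passing `result` along would then be reported as [arg-type], because mypy
# infers its type as None. It is left commented out since it would also fail
# at runtime:
# predict(data=result)

total = predict(data=weights)
print(result, total)
```

The code runs fine; both diagnostics are purely static, which is consistent with these errors appearing only after a mypy upgrade rather than after a behaviour change.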
Thanks Hyunsu! 🙏 Guessing we can ignore the C++ lints as well?
Also seeing this comm cleanup error in another CI job:
[22:30:40] WARNING: /home/runner/work/xgboost/xgboost/src/collective/comm.cc:358: The communicator is being destroyed without a call to shutdown first. This can lead to undefined behaviour.
[22:30:40] WARNING: /home/runner/work/xgboost/xgboost/src/collective/socket.cc:143: socket.cc(186): Failed to connect to:192.168.122.156:14345 Error:
- [socket.h:357|22:30:40]: Socket error. system error:Connection refused
[22:30:40] WARNING: /home/runner/work/xgboost/xgboost/src/collective/socket.cc:150: Retrying connection to 192.168.122.156 for the 1 time.
[22:30:41] WARNING: /home/runner/work/xgboost/xgboost/src/collective/socket.cc:143: socket.cc(162): Failed to connect to:192.168.122.156:14345 Error:
- [socket.cc:161|22:30:41]: connect failed. system error:Connection refused
[22:30:41] WARNING: /home/runner/work/xgboost/xgboost/src/collective/comm.cc:362:
- [comm.cc:40|22:30:41]: Failed to connect to the tracker.
- [socket.cc:196|22:30:41]: Failed to connect to 192.168.122.156:14345
- [socket.cc:161|22:30:41]: connect failed. Connection refused
Terminating due to uncaught exception 0x11b7421a000 of type dmlc::Error
Abort trap (core dumped)
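The log above shows one retry against the tracker address before the process gives up with an uncaught error. As a rough illustration of that connect-and-retry pattern only, here is a small Python sketch. It is not XGBoost's C++ implementation; `connect_with_retry`, its parameters, and the loopback address are assumptions made for the example.

```python
import socket
import time
from typing import Optional


def connect_with_retry(
    host: str, port: int, retries: int = 2, delay: float = 0.1
) -> socket.socket:
    """Open a TCP connection, retrying a fixed number of times on failure."""
    last_error: Optional[OSError] = None
    for attempt in range(retries + 1):
        try:
            return socket.create_connection((host, port), timeout=1.0)
        except OSError as err:  # includes ConnectionRefusedError
            last_error = err
            if attempt < retries:
                time.sleep(delay)
    # Raise a catchable error instead of letting the process abort.
    raise ConnectionError(f"Failed to connect to {host}:{port}") from last_error


# Reserve a local port, then release it so that nothing is listening on it.
with socket.socket() as probe:
    probe.bind(("127.0.0.1", 0))
    closed_port = probe.getsockname()[1]

failure_message = None
try:
    connect_with_retry("127.0.0.1", closed_port, retries=1, delay=0.01)
except ConnectionError as err:
    failure_message = str(err)

print(failure_message)
```

The point of the sketch is the last step: once the retries are exhausted, the caller gets an exception it can handle, rather than the abort seen in the CI job.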
To avoid CI failures on FreeBSD.
The errors from the FreeBSD job were fixed in #10756. Let me backport the fix.
The timeout error from a Dask test in https://buildkite.com/xgboost/xgboost-ci-multi-gpu/builds/7734#0194bfca-26b2-464c-9450-ae01f6369bc4 appears to be specific to the Dask version used in an old version of RAPIDS (RAPIDS 24.06, Dask 2024.5.1). I will address the failure in a follow-up PR, which will upgrade the CUDA version (to 12.8) as well as the RAPIDS version (to 24.12).
Thanks Hyunsu! 🙏