NVIDIA NVML breaks startup in Docker without an NVIDIA GPU #557
Comments
I tested this: tinygrad.Device.DEFAULT returns the value "GPU". When I delete the `Device.DEFAULT == "GPU"` check from the NVIDIA case, exo works. I don't know whether it will work properly with oneAPI (Intel GPU).
You'll need to install the prerequisites listed in the README under "For Linux with NVIDIA GPU support (Linux-only, skip if not using Linux or NVIDIA)".
I don't use an NVIDIA GPU. I use an Intel GPU, but the code still enters the NVIDIA case, which is incorrect. This is a bug and needs a patch.
My mistake: what I deleted was `Device.DEFAULT == "GPU"`, not `Device.DEFAULT == "NV"`.
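The comments above suggest that a backend name alone does not identify the GPU vendor: `Device.DEFAULT` reports "GPU" on an Intel machine, yet the NVIDIA case is still taken. Below is a minimal sketch of a stricter check. The function name `should_use_nvml`, the `nvml_ok` flag, and the inclusion of "CUDA" in the set are hypothetical and not taken from exo's source. The idea is only that the NVIDIA path should need both a plausible backend name and a working NVML.

```python
# Hypothetical helper, not exo's actual code: decide whether to take the
# NVIDIA-specific path from the backend name plus an NVML availability flag.

# Assumption: backend names that *may* correspond to an NVIDIA device.
# "GPU" and "NV" come from the thread; "CUDA" is an assumed addition.
# "GPU" can also be reported on non-NVIDIA hardware (e.g. an Intel GPU),
# so the name by itself is not sufficient evidence.
MAYBE_NVIDIA_BACKENDS = {"CUDA", "NV", "GPU"}


def should_use_nvml(backend: str, nvml_ok: bool) -> bool:
    """Return True only when the backend could be NVIDIA and NVML works."""
    return backend in MAYBE_NVIDIA_BACKENDS and nvml_ok


if __name__ == "__main__":
    # Intel GPU in Docker: backend is "GPU" but NVML is missing.
    print(should_use_nvml("GPU", nvml_ok=False))
    # NVIDIA host with the driver library installed.
    print(should_use_nvml("NV", nvml_ok=True))
```

With a check like this, deleting the `"GPU"` comparison would not be necessary, because a missing NVML library would route the node away from the NVIDIA case.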
I have no NVIDIA GPU and I use Docker to run exo.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Selected inference engine: None
/ _ \ / / _ \
| /> < (_) |
_/_/____/
Detected system: Linux
Inference engine name after selection: tinygrad
Using inference engine: TinygradDynamicShardInferenceEngine with shard downloader: HFShardDownloader
[58906]
Chat interface started:
ChatGPT API endpoint served at:
Traceback (most recent call last):
File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 2248, in _LoadNvmlLibrary
nvmlLib = CDLL("libnvidia-ml.so.1")
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/ctypes/__init__.py", line 379, in __init__
self._handle = _dlopen(self._name, mode)
^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/exo/.venv/bin/exo", line 5, in <module>
from exo.main import run
File "/exo/exo/main.py", line 131, in <module>
node = Node(
^^^^^
File "/exo/exo/orchestration/node.py", line 40, in __init__
self.device_capabilities = device_capabilities()
^^^^^^^^^^^^^^^^^^^^^
File "/exo/exo/topology/device_capabilities.py", line 151, in device_capabilities
return linux_device_capabilities()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/exo/exo/topology/device_capabilities.py", line 189, in linux_device_capabilities
pynvml.nvmlInit()
File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 2220, in nvmlInit
nvmlInitWithFlags(0)
File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 2203, in nvmlInitWithFlags
_LoadNvmlLibrary()
File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 2250, in _LoadNvmlLibrary
_nvmlCheckReturn(NVML_ERROR_LIBRARY_NOT_FOUND)
File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 979, in _nvmlCheckReturn
raise NVMLError(ret)
pynvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found
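The traceback shows `pynvml.nvmlInit()` raising `NVMLError_LibraryNotFound` because `libnvidia-ml.so.1` is not present in the container. One possible way to avoid the crash is to probe NVML before taking the NVIDIA path. This is a sketch under assumptions, not the project's fix: the helper name `nvml_available` is hypothetical, and it relies only on `pynvml.nvmlInit`, `pynvml.nvmlShutdown`, and the `pynvml.NVMLError` base class.

```python
import importlib


def nvml_available() -> bool:
    """Return True only if pynvml imports and NVML initializes cleanly.

    Hypothetical helper for illustration; not part of exo or pynvml.
    """
    try:
        # pynvml may not be installed at all on non-NVIDIA machines.
        pynvml = importlib.import_module("pynvml")
    except ImportError:
        return False

    try:
        # Raises an NVMLError subclass (e.g. NVMLError_LibraryNotFound)
        # when libnvidia-ml.so.1 cannot be loaded.
        pynvml.nvmlInit()
    except (pynvml.NVMLError, OSError):
        return False

    # Initialization succeeded, so release the handle again.
    pynvml.nvmlShutdown()
    return True


if __name__ == "__main__":
    if nvml_available():
        print("NVML found: NVIDIA capability detection can run")
    else:
        print("NVML not found: skip the NVIDIA case")
```

A caller could use the result to fall back to a generic capability report instead of letting the exception stop the node at startup.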