NVIDIA NVML breaks startup in Docker without an NVIDIA GPU #557

Open
2jiangjiang opened this issue Dec 14, 2024 · 4 comments

Comments

@2jiangjiang

I have no NVIDIA GPU and I am running exo in Docker. Steps to reproduce:

  1. docker run ubuntu
  2. git clone exo
  3. apt install build-essential python3 python3-venv python3-pip libgl1-mesa-dev libglib2.0-0
  4. source install.sh
  5. The error below is reported

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Selected inference engine: None


/ _ \ / / _ \
| /> < (_) |
_
/_/____/

Detected system: Linux
Inference engine name after selection: tinygrad
Using inference engine: TinygradDynamicShardInferenceEngine with shard downloader: HFShardDownloader
[58906]
Chat interface started:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/exo/.venv/bin/exo", line 5, in <module>
    from exo.main import run
  File "/exo/exo/main.py", line 131, in <module>
    node = Node(
           ^^^^^
  File "/exo/exo/orchestration/node.py", line 40, in __init__
    self.device_capabilities = device_capabilities()
                               ^^^^^^^^^^^^^^^^^^^^^
  File "/exo/exo/topology/device_capabilities.py", line 151, in device_capabilities
    return linux_device_capabilities()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/exo/exo/topology/device_capabilities.py", line 189, in linux_device_capabilities
    pynvml.nvmlInit()
  File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 2220, in nvmlInit
    nvmlInitWithFlags(0)
  File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 2203, in nvmlInitWithFlags
    _LoadNvmlLibrary()
  File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 2250, in _LoadNvmlLibrary
    _nvmlCheckReturn(NVML_ERROR_LIBRARY_NOT_FOUND)
  File "/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 979, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found
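
The traceback shows pynvml.nvmlInit() raising because the NVML shared library is absent from the container. One possible way to keep startup from crashing is to treat that failure as "no NVIDIA GPU" instead of letting it propagate. The following is only a sketch under that assumption, not exo's actual code: the helper name probe_nvidia_gpu_name is hypothetical, and only pynvml calls that appear in the traceback or in the public pynvml API are used.

```python
def probe_nvidia_gpu_name():
    """Return the name of the first NVIDIA GPU, or None if NVML is unusable.

    Hypothetical helper for illustration; not part of exo.
    """
    try:
        import pynvml  # may not be installed at all
    except ImportError:
        return None

    try:
        # Raises NVMLError_LibraryNotFound (a subclass of NVMLError)
        # when libnvidia-ml is missing, as in the traceback above.
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return None

    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        name = pynvml.nvmlDeviceGetName(handle)
        # Older pynvml releases return bytes, newer ones return str.
        return name.decode() if isinstance(name, bytes) else name
    except pynvml.NVMLError:
        # NVML loaded but no device could be queried.
        return None
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    print("NVIDIA GPU:", probe_nvidia_gpu_name())
```

On a machine without the NVIDIA driver this returns None, and the caller can fall back to a generic (non-NVIDIA) capabilities path.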

@2jiangjiang
Author

2jiangjiang commented Dec 14, 2024

I checked, and tinygrad.Device.DEFAULT returns "GPU". When I delete the Device.DEFAULT == "GPU" check from the NVIDIA case, exo works. I don't know whether it works properly with oneAPI (Intel GPU).

@AlexCheema
Contributor

AlexCheema commented Dec 14, 2024

I checked, and tinygrad.Device.DEFAULT returns "GPU". When I delete the Device.DEFAULT == "NV" check from the NVIDIA case, exo works. I don't know whether it works properly with oneAPI (Intel GPU).

You'll need to install the prerequisites listed in the README:

For Linux with NVIDIA GPU support (Linux-only, skip if not using Linux or NVIDIA):

@2jiangjiang
Author

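I checked, and tinygrad.Device.DEFAULT returns "GPU". When I delete the Device.DEFAULT == "NV" check from the NVIDIA case, exo works. I don't know whether it works properly with oneAPI (Intel GPU).

You'll need to install the prerequisites listed in the README:

For Linux with NVIDIA GPU support (Linux-only, skip if not using Linux or NVIDIA):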

I am not using an NVIDIA GPU; I am using an Intel GPU. The code nevertheless enters the NVIDIA case, which is incorrect, so this is a bug and needs a patch.

@2jiangjiang
Author

I checked, and tinygrad.Device.DEFAULT returns "GPU". When I delete the Device.DEFAULT == "NV" check from the NVIDIA case, exo works. I don't know whether it works properly with oneAPI (Intel GPU).

My mistake: the check I deleted was Device.DEFAULT == "GPU", not Device.DEFAULT == "NV".
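
To make the reported behaviour concrete: "GPU" is reported here on a machine with an Intel GPU, so that value alone does not seem to imply NVIDIA hardware, whereas "NV" (and, as far as I understand, "CUDA") does. A rough sketch of a stricter check might look like the following. The function name should_query_nvml, the NVIDIA_ONLY_BACKENDS set, and the nvml_available flag are all hypothetical names for illustration and do not come from exo or tinygrad.

```python
# Backend names assumed to imply NVIDIA hardware (an assumption, see above).
NVIDIA_ONLY_BACKENDS = {"CUDA", "NV"}


def should_query_nvml(default_device: str, nvml_available: bool) -> bool:
    """Decide whether to take the NVIDIA (NVML) capabilities path.

    default_device: the string reported by tinygrad's Device.DEFAULT.
    nvml_available: whether NVML could actually be initialised.
    """
    if default_device in NVIDIA_ONLY_BACKENDS:
        return True
    if default_device == "GPU":
        # Vendor-neutral value: only use NVML if it really loads.
        return nvml_available
    return False


if __name__ == "__main__":
    # The situation in this issue: Intel GPU, NVML library missing.
    print(should_query_nvml("GPU", nvml_available=False))
```

With this shape, an Intel GPU machine that reports "GPU" skips the NVML path instead of crashing, while a machine that reports "NV" still takes it.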
