
Unable to download the llama3.1 model #14

Open
jedld opened this issue Oct 7, 2024 · 4 comments

Comments

jedld commented Oct 7, 2024

When attempting to download "llama3.1" via the download new model UI, I'm getting:

It looks like "llama3.1" is not the right name.

This error does not happen for the other llama models.
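
One quick check (just a sketch, assuming the Ollama server inside the container is reachable on its default port) is to ask the server to pull the tag directly with the ollama Python client and see what it reports:

```python
# Hypothetical check, not part of jetson-copilot: pull the tag directly,
# bypassing the "download new model" UI, to see the server's actual error.
import ollama

try:
    ollama.pull("llama3.1")  # raises ResponseError if the server rejects the tag
except ollama.ResponseError as e:
    print(e.status_code, e.error)
```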

brkstyl commented Oct 11, 2024

I am also having a similar problem. I have tried to load additional models, from Gemma to StarCoder, and none of the models from the Ollama library will load; they all give the same error.

@andrebaumgartfht

We are having the same issue. Is there a specific version we should use? We are using JetPack 6.1 [L4T 36.4.0].
Thanks.

@dusty-nv (Member)

Hi @andrebaumgartfht, can you try dustynv/jetson-copilot:r36.4.0?

@andrebaumgartfht

Thank you for the build.

Did the following:
Created a documents folder in my local jetson-containers repo (mkdir -p ./data/documents/jetson) and added a PDF file to it, since an empty folder causes another exception (potentially because the container defaults to RAG only).

Then started the container using jetson-containers run dustynv/jetson-copilot:r36.4.0 bash -c '/start_ollama && streamlit run app.py'.

Loading and indexing the Jetson docs started ...
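
(For reference, the code path this step goes through is roughly the following llama_index + Ollama embedding setup; this is only a sketch, and the embedding model name and document path are assumptions, not copied from jetson-copilot's app.py.)

```python
# Rough sketch of what load_data() in app.py does; names are guesses.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding

# Embeddings are computed by the local Ollama server, which is where the
# traceback below ultimately fails (ollama/_client.py, embeddings()).
Settings.embed_model = OllamaEmbedding(
    model_name="mxbai-embed-large",      # assumed embedding model
    base_url="http://localhost:11434",
)

docs = SimpleDirectoryReader("./data/documents/jetson").load_data()
index = VectorStoreIndex.from_documents(docs, show_progress=True)
```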

Then it raised the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
result = func()
File "/usr/local/lib/python3.10/dist-packages/streamlit/runtime/scriptrunner/script_runner.py", line 579, in code_to_exec
exec(code, module.__dict__)
File "/opt/jetson-copilot/app.py", line 55, in
index = load_data()
File "/usr/local/lib/python3.10/dist-packages/streamlit/runtime/caching/cache_utils.py", line 212, in call
return self._get_or_create_cached_value(args, kwargs)
File "/usr/local/lib/python3.10/dist-packages/streamlit/runtime/caching/cache_utils.py", line 235, in _get_or_create_cached_value
return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
File "/usr/local/lib/python3.10/dist-packages/streamlit/runtime/caching/cache_utils.py", line 292, in _handle_cache_miss
computed_value = self._info.func(*func_args, **func_kwargs)
File "/opt/jetson-copilot/app.py", line 52, in load_data
index = VectorStoreIndex.from_documents(docs)
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/base.py", line 119, in from_documents
return cls(
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/vector_store/base.py", line 76, in init
super().init(
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/base.py", line 77, in init
index_struct = self.build_index_from_nodes(
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/vector_store/base.py", line 310, in build_index_from_nodes
return self._build_index_from_nodes(content_nodes, **insert_kwargs)
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/vector_store/base.py", line 279, in _build_index_from_nodes
self._add_nodes_to_index(
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/vector_store/base.py", line 232, in _add_nodes_to_index
nodes_batch = self._get_node_with_embedding(nodes_batch, show_progress)
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/vector_store/base.py", line 139, in _get_node_with_embedding
id_to_embed_map = embed_nodes(
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/utils.py", line 138, in embed_nodes
new_embeddings = embed_model.get_text_embedding_batch(
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/instrumentation/dispatcher.py", line 307, in wrapper
result = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/llama_index/core/base/embeddings/base.py", line 335, in get_text_embedding_batch
embeddings = self._get_text_embeddings(cur_batch)
File "/usr/local/lib/python3.10/dist-packages/llama_index/embeddings/ollama/base.py", line 75, in _get_text_embeddings
embeddings = self.get_general_text_embedding(text)
File "/usr/local/lib/python3.10/dist-packages/llama_index/embeddings/ollama/base.py", line 88, in get_general_text_embedding
result = self._client.embeddings(
File "/usr/local/lib/python3.10/dist-packages/ollama/_client.py", line 281, in embeddings
return self._request(
File "/usr/local/lib/python3.10/dist-packages/ollama/_client.py", line 75, in _request
raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: llama runner process has terminated: signal: aborted (core dumped) CUDA error: CUBLAS_STATUS_INTERNAL_ERROR
current device: 0, in function ggml_cuda_op_mul_mat_cublas at /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:1240
cublasGemmEx(ctx.cublas_handle(id), CUBLAS_OP_T, CUBLAS_OP_N, row_diff, src1_ncols, ne10, &alpha_f16, src0_ptr, CUDA_R_16F, ne00, src1_ptr, CUDA_R_16F, ne10, &beta_f16, dst_f16.get(), CUDA_R_16F, ldc, CUBLAS_COMPUTE_16F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:100: !"CUDA error"
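
Since the traceback dies inside the Ollama embeddings call rather than in llama_index itself, one way to isolate it is to hit that endpoint directly inside the container (again only a sketch; the embedding model name is an assumption, substitute whatever `ollama list` shows):

```python
# Hypothetical repro outside streamlit/llama_index: call the same
# embeddings endpoint the traceback fails in.
import ollama

client = ollama.Client(host="http://localhost:11434")
print(client.embeddings(model="mxbai-embed-large", prompt="hello world"))
```

If this reproduces the CUBLAS_STATUS_INTERNAL_ERROR, the problem is in the Ollama runner / CUDA stack inside the container rather than in jetson-copilot's Python code.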
