I have already downloaded the Phi-3 mini 4k instruct GGUF from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf
and placed it at /models/:

jawad@desktoper:/models$ ls
Phi-3-mini-4k-instruct-q4.gguf
Yet when I try to select that model with -m or --model, I get an error: the script just prints the same output as ./server.sh --help. Here is the complete output:

jawad@desktoper:~/gits/aici/rllm/rllm-llamacpp$ ./server.sh -m /home/jawad/models/Phi-3-mini-4k-instruct-q4.gguf
usage: server.sh [--loop] [--cuda] [--debug] [model_name] [rllm_args...]
model_name can a HuggingFace URL pointing to a .gguf file, or one of the following:
phi2 https://huggingface.co/TheBloke/phi-2-GGUF/blob/main/phi-2.Q8_0.gguf
orca https://huggingface.co/TheBloke/Orca-2-13B-GGUF/blob/main/orca-2-13b.Q8_0.gguf
mistral https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/blob/main/mistral-7b-instruct-v0.2.Q5_K_M.gguf
mixtral https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/blob/main/mixtral-8x7b-instruct-v0.1.Q6_K.gguf
code70 https://huggingface.co/TheBloke/CodeLlama-70B-Instruct-GGUF/blob/main/codellama-70b-instruct.Q5_K_M.gguf
Additionally, "server.sh build" will just build the server, and not run a model.
--cuda try to build llama.cpp against installed CUDA
--loop restart server when it crashes and store logs in ./logs
--debug don't build in --release mode
Try server.sh phi2 --help to see available rllm_args
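(For what it's worth, the usage text above suggests the model can also be passed positionally as a HuggingFace URL instead of with -m. I have not tried this, and the /blob/main/ URL form below is only copied from the listed examples:

./server.sh https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/blob/main/Phi-3-mini-4k-instruct-q4.gguf

but that would re-download the model, which is exactly what I am trying to avoid.)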
However, if I choose phi2 instead of the downloaded model, it works fine. Does aici not support Phi-3, or is this a bug, and how do I fix it? Would adding a case for phi3 after this block in server.sh (rllm-cuda/server.sh), with just the URL replaced, solve the problem?

63 phi2 )
64 ARGS="-m https://huggingface.co/TheBloke/phi-2-GGUF/blob/main/phi-2.Q8_0.gguf -t phi -w $EXPECTED/phi-2/cats.safetensors -s test_maxtol=0.8 -s test_avgtol=0.3"
65 ;;

But I don't want to download the model again, so how can I use a local model that is not in the list of predefined models (phi2, mistral, mixtral, etc.)?
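For concreteness, here is a minimal sketch of the kind of phi3 entry I mean, modelled on the phi2 case above. The phi3 label, the reuse of -t phi for Phi-3, and the assumption that -m also accepts a local .gguf path rather than a URL are all guesses on my part, not confirmed behaviour; the -w/-s expected-output options from the phi2 entry are left out because there is no corresponding phi-3 data under $EXPECTED.

  # Hypothetical case for rllm-cuda/server.sh, placed right after the phi2 entry (around line 65).
  # Assumes -m accepts a local file path and that the phi chat template also fits Phi-3.
  phi3 )
    ARGS="-m /home/jawad/models/Phi-3-mini-4k-instruct-q4.gguf -t phi"
    ;;

If -m only accepts URLs, the same entry could point at the HuggingFace URL instead, at the cost of re-downloading the file.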