
Releases: LlamaEdge/rag-api-server

LlamaEdge-RAG 0.6.1

24 May 09:56
Pre-release

Major changes:

  • Support embedding prompt template for embedding model

LlamaEdge-RAG 0.6.0

18 May 09:40

Major changes:

  • Improve the --batch-size CLI option: set batch sizes for the chat and embedding models, respectively
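
The note above says --batch-size now takes a value per model. A minimal sketch of a launch command, assuming the comma-separated two-value form and the usual wasmedge invocation; the model file names, preload aliases, and batch values are placeholders, not part of these release notes:

```python
# Assemble a rag-api-server launch command with separate batch sizes
# for the chat and embedding models (assumed "chat,embedding" order).
chat_batch, embed_batch = 512, 128

cmd = [
    "wasmedge",
    "--dir", ".:.",
    "--nn-preload", "default:GGML:AUTO:chat-model.gguf",          # placeholder
    "--nn-preload", "embedding:GGML:AUTO:embedding-model.gguf",   # placeholder
    "rag-api-server.wasm",
    "--model-name", "chat-model,embedding-model",                 # placeholder
    # As of 0.6.0, one batch size each for the chat and embedding models.
    "--batch-size", f"{chat_batch},{embed_batch}",
]
print(" ".join(cmd))
```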

LlamaEdge-RAG 0.5.3

14 May 15:35

Major changes:

  • Add user header in chat completion responses
  • Support PLUGIN_DEBUG wasm environment variable for debugging the low-level ggml plugin

LlamaEdge-RAG 0.5.2

13 May 03:07

Major changes:

  • Llama-api-server
    • Improve error responses
    • Add content-type: application/json header in responses

LlamaEdge-RAG 0.5.1

09 May 15:38

Major change:

  • Update /v1/embeddings endpoint to be compatible with OpenAI /v1/embeddings API
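
Since the endpoint is now OpenAI-compatible, a request body uses the same fields as OpenAI's embeddings API: "model" and "input" (a string or list of strings). A sketch of such a body; the model name is a placeholder:

```python
import json

# Build an OpenAI-style embeddings request body for POST /v1/embeddings.
payload = {
    "model": "nomic-embed-text",  # placeholder embedding model name
    "input": [
        "LlamaEdge is a RAG API server.",
        "It exposes an OpenAI-compatible /v1/embeddings endpoint.",
    ],
}
body = json.dumps(payload)
print(body)
```

A compatible response mirrors OpenAI's schema: a list object whose "data" entries each carry an "embedding" vector and an "index".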

LlamaEdge-RAG 0.5.0

09 May 11:44

Major change:

  • Update deps: llama-core v0.9.0, endpoints v0.8.0, and chat-prompts v0.7.1

LlamaEdge-RAG 0.4.0

30 Apr 11:16

Major changes:

  • New /v1/retrieve endpoint
  • New --rag-policy CLI option
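
The release notes do not document the /v1/retrieve request schema, so the shape below is an assumption modeled on the server's chat-style endpoints (a "messages" array whose latest user turn is the retrieval query); the field names and content are hypothetical:

```python
import json

# Hypothetical request body for POST /v1/retrieve. The "messages"
# field and its structure are assumptions, not a documented schema.
payload = {
    "messages": [
        {"role": "user", "content": "What is LlamaEdge?"}  # retrieval query
    ],
}
body = json.dumps(payload)
print(body)
```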

LlamaEdge-RAG 0.3.8

28 Apr 07:42

Major change:

  • Update the chat-prompts dep to v0.6.2

LlamaEdge-RAG 0.3.7

26 Apr 12:58

Major change:

  • Post-process the generation of phi-3-chat model in non-stream mode.

LlamaEdge-RAG 0.3.6

24 Apr 10:20

Major changes:

  • Post-process the generation of llama-2-chat and llama-3-chat models in non-stream mode.