Skip to content

LlamaEdge-RAG 0.10.0

Compare
Choose a tag to compare
@github-actions github-actions released this 08 Dec 14:09
· 43 commits to main since this release

Major changes:

  • Support multiple collections ( Fixes #28 )

    • Improve --qdrant-collection-name, --qdrant-limit, and --qdrant-score-threshold CLI options to support both single value and multiple comma-separated values, for example

      wasmedge --dir .:. \
      --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \
      --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5-f16.gguf \
      rag-api-server.wasm \
      ...
      --qdrant-url http://127.0.0.1:6333 \
      --qdrant-collection-name paris,paris2 \
      --qdrant-limit 2,3 \
      --qdrant-score-threshold 0.5,0.6 \
      ...
    • For the requests to both /v1/chat/completions and /v1/retrieve endpoints, url_vdb_server, collection_name, limit, and score_threshold fields support both single and multiple values. For example,

      • Multiple values

        curl --location 'http://localhost:8080/v1/retrieve' \
        --header 'Content-Type: application/json' \
        --data '{
            "messages": [
                ...
            ],
            ...,
            "url_vdb_server": "http://127.0.0.1:6333",
            "collection_name": ["paris","paris2"],
            "limit": [3,3],
            "score_threshold": [0.7,0.7],
            ...
        }'
      • Single value

          curl --location 'http://localhost:8080/v1/retrieve' \
          --header 'Content-Type: application/json' \
          --data '{
              "messages": [
                  ...
              ],
              ...,
              "url_vdb_server": "http://127.0.0.1:6333",
              "collection_name": ["paris"],
              "limit": [3],
              "score_threshold": [0.7],
              ...
          }'
  • Remove duplicated RAG search results ( Fixes #27 )

  • Upgrade dependencies:

    • llama-core v0.23.4
    • chat-prompts v0.18.1
    • endpoints v0.20.0