LlamaEdge-RAG 0.10.0
Major changes:
-
Support multiple collections ( Fixes #28 )
-
Improve
--qdrant-collection-name
,--qdrant-limit
, and--qdrant-score-threshold
CLI options to support both single value and multiple comma-separated values, for examplewasmedge --dir .:. \ --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \ --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5-f16.gguf \ rag-api-server.wasm \ ... --qdrant-url http://127.0.0.1:6333 \ --qdrant-collection-name paris,paris2 \ --qdrant-limit 2,3 \ --qdrant-score-threshold 0.5,0.6 \ ...
-
For the requests to both
/v1/chat/completions
and/v1/retrieve
endpoints,url_vdb_server
,collection_name
,limit
, andscore_threshold
fields support both single and multiple values. For example,-
Multiple values
curl --location 'http://localhost:8080/v1/retrieve' \ --header 'Content-Type: application/json' \ --data '{ "messages": [ ... ], ..., "url_vdb_server": "http://127.0.0.1:6333", "collection_name": ["paris","paris2"], "limit": [3,3], "score_threshold": [0.7,0.7], ... }'
-
Single value
curl --location 'http://localhost:8080/v1/retrieve' \ --header 'Content-Type: application/json' \ --data '{ "messages": [ ... ], ..., "url_vdb_server": "http://127.0.0.1:6333", "collection_name": ["paris"], "limit": [3], "score_threshold": [0.7], ... }'
-
-
-
Remove duplicated RAG search results ( Fixes #27 )
-
Upgrade dependencies:
llama-core v0.23.4
chat-prompts v0.18.1
endpoints v0.20.0