
How to use RAG with a Hugging Face transformer like BERT or T5, as those are also LLMs #8

Open
ramda1234786 opened this issue Oct 9, 2023 · 2 comments

Comments

@ramda1234786

How can I build a RAG pipeline with a model other than OpenAI, Cohere, or SageMaker?
Can I use a Hugging Face transformer such as BERT for predicting sentences, without any Hugging Face key?

I am using version 2.10.

If yes, how do I build an HTTP connector for it? Or how can I load the model directly?

I tried the request below.

Can I upload a model that is seq2seq and does not require an HTTP connector or any keys? Basically, I do not want to use API keys or buy the product until I am sure this solution works.

POST /_plugins/_ml/models/_upload

{
  "name": "huggingface/TheBloke/vicuna-13B-1.1-GPTQ",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
@HenryL27
Collaborator

HenryL27 commented Oct 9, 2023

That's a good question, and I'm not sure I have a complete answer for you. In my opinion, the resources for running OpenSearch and for running LLMs should be kept separate, and I don't think ML Commons supports running seq2seq models natively. So I'm fairly sure you need to host the model externally and connect to it via the Connector interface.
It might be worth checking out something like TorchServe or another open-source model-hosting option.
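For completeness: ML Commons can run some model types locally without any key, namely its pretrained text-embedding models (which cover the retrieval side of RAG, but not text generation). An untested sketch, using a model name from the OpenSearch pretrained-models list:

```
POST /_plugins/_ml/models/_upload
{
  "name": "huggingface/sentence-transformers/all-MiniLM-L6-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```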

As far as building connectors goes, you essentially need to tell OpenSearch how to construct the HTTP request. To hit the basic example from the TorchServe README, I would expect to be able to do something like

POST /_plugins/_ml/connectors/_create
{
    "name": "TorchServe bert connector",
    "description": "The connector to a locally hosted bert model",
    "version": 1,
    "protocol": "http",
    "parameters": {
        "endpoint": "127.0.0.1:8080",
        "model": "bert"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "GET",
            "url": "http://${parameters.endpoint}/predictions/${parameters.model}",
            "request_body": "${parameters.input}"
        }
    ]
}
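Once the connector exists, the usual remote-inference flow (again untested; the API shape is from the ML Commons docs, and the model name and IDs here are placeholders) would be to register a remote model against the connector, deploy it, and then call predict:

```
POST /_plugins/_ml/models/_register
{
    "name": "bert-via-torchserve",
    "function_name": "remote",
    "description": "Remote BERT model behind the TorchServe connector",
    "connector_id": "<connector_id from the _create response>"
}

POST /_plugins/_ml/models/<model_id>/_deploy

POST /_plugins/_ml/models/<model_id>/_predict
{
    "parameters": {
        "input": "hello world"
    }
}
```

The `<connector_id>` and `<model_id>` placeholders have to be filled in from the responses of the earlier calls.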

I haven't tested any of this, and I don't think connectors other than OpenAI, Cohere, and SageMaker have seen much testing in general.
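Incidentally, the `${parameters.*}` placeholders in the connector above are filled in from the connector's `parameters` map at request time. The substitution can be sketched in Python (a toy illustration of the templating behavior, not ML Commons' actual code):

```python
from string import Template

def expand(template: str, parameters: dict) -> str:
    """Expand ${parameters.key} placeholders from a parameters map."""
    # Rewrite ${parameters.key} to ${key} so string.Template can fill it in.
    flat = template.replace("${parameters.", "${")
    return Template(flat).substitute(parameters)

url = expand(
    "http://${parameters.endpoint}/predictions/${parameters.model}",
    {"endpoint": "127.0.0.1:8080", "model": "bert"},
)
print(url)  # http://127.0.0.1:8080/predictions/bert
```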

Hope this helps!

@ramda1234786
Author

Yes, the above helped me create the connector, and I have deployed the model as well. Thanks for the guidance.
But I am currently stuck with the issue below and am seeking help on the ML forum:

https://forum.opensearch.org/t/not-able-to-run-the-predict-apis-with-external-ml-models-like-hugging-face/16237
