Hey team, I'm using the multi-LoRA adapter deployment feature from the latest code. I have a couple of questions regarding the feature.
My questions are:
What is the maximum number of local adapters that we can deploy on an instance? Is there a limit imposed by the TGI engine, or does it depend on the instance capacity?
How are these adapters stored? Are they stored on disk or in GPU memory?
What happens if the requested adapter_id is not present in GPU memory? Will it be loaded from disk?
Can we explicitly specify the number of adapters that can live in GPU memory?