Starcoder2-15B model - [rank3]: AttributeError: 'TensorParallelColumnLinear' object has no attribute 'base_layer' #2881
Comments
Hi @ashwincv0112, thank you for opening this issue. It appears that the Starcoder2 modeling code has not been updated to handle multi-LoRA correctly. I've started a PR with changes to enable multi-LoRA; please see the PR for details: #2883. Thank you!
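To illustrate why this class of error appears: LoRA-aware code paths expect each targeted linear layer to have been wrapped so that the original weights live behind a `base_layer` attribute; if the modeling code hands the raw sharded layer to the LoRA path, the attribute access fails. The sketch below is illustrative only - the class names come from the traceback, but the wrapper shown is a hypothetical stand-in, not TGI's actual implementation.

```python
# Illustrative sketch (not TGI's actual classes) of the failure mode.
class TensorParallelColumnLinear:
    """Stand-in for the sharded linear layer named in the traceback."""
    def forward(self, x):
        return x  # placeholder computation

class LoraLinear:
    """Hypothetical LoRA wrapper: the LoRA path expects `.base_layer`."""
    def __init__(self, base_layer):
        self.base_layer = base_layer

def lora_forward(layer, x):
    # Multi-LoRA code reaches for `layer.base_layer`; a bare
    # TensorParallelColumnLinear has no such attribute, hence AttributeError.
    return layer.base_layer.forward(x)

wrapped = LoraLinear(TensorParallelColumnLinear())
print(lora_forward(wrapped, 3))  # works: the wrapper exposes base_layer

try:
    lora_forward(TensorParallelColumnLinear(), 3)
except AttributeError as e:
    print(e)  # the attribute is missing on the unwrapped layer
```

The PR referenced above fixes this on the modeling side, so that Starcoder2's layers are wrapped before the multi-LoRA path runs.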
Hi @drbh, thank you for the quick fix. Appreciate it.
Hi @ashwincv0112, @drbh - I'm getting the same error using Mixtral-8x7B-v0.1. I am also using custom LoRA adapters loaded from local storage. Is Mixtral-8x7B supported for multi-LoRA serving? What can I do to make this work?
Hi @ashwincv0112, the PR should be merged soon - just working on adding some tests today. Regarding the versioning once the changes are merged: the changes will not be contained in …
@vsoesanto, thank you for reporting this - Mixtral should support LoRA. Would you be able to share an example of the error message? Thank you!
The error message is identical to what @ashwincv0112 got above. My base model is Mixtral-8x7B-v0.1, and I am also using custom LoRA adapters loaded from local storage, with a docker run command similar to the one above and tgi==3.0.1. Looking at the file changed in your PR above, I don't see the same changes that were applied to Starcoder2 applied to the file for Mixtral - but maybe I'm not looking at the right file. Any insight would be very helpful.
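One way to narrow down an adapter/model mismatch like this is to check which modules the adapter actually targets. The `adapter_config.json` file and its `target_modules` key are standard PEFT conventions; the config contents below are illustrative placeholders, and which modules the server wraps for a given architecture would still need to be checked against the TGI source.

```python
import json
import os
import tempfile

def adapter_target_modules(adapter_dir):
    """Read a PEFT adapter_config.json and return its target_modules list."""
    with open(os.path.join(adapter_dir, "adapter_config.json")) as f:
        return json.load(f)["target_modules"]

# Demo with a temporary adapter directory (config contents are illustrative).
with tempfile.TemporaryDirectory() as d:
    cfg = {
        "base_model_name_or_path": "mistralai/Mixtral-8x7B-v0.1",
        "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    }
    with open(os.path.join(d, "adapter_config.json"), "w") as f:
        json.dump(cfg, f)
    mods = adapter_target_modules(d)

# A module the server does not wrap for this architecture is a likely
# source of an unwrapped-layer AttributeError.
print(sorted(mods))
```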
Have there been any updates on this? @drbh
System Info
Using the below TGI version:
ghcr.io/huggingface/text-generation-inference:3.0.1
Running on an AWS g5.12xlarge instance (which has 4 GPUs)
model used: bigcode/starcoder2-15b-instruct-v0.1
Deployment: Using docker
Information
Tasks
Reproduction
Please be informed that we are trying to deploy the starcoder2-15b-instruct model with custom fine-tuned LoRA adapters using TGI's multi-LoRA capability.
We are using AWS g5.12xlarge instance for this.
We have our base model and LoRA adapters saved in the data directory. We then ran the below docker command.
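For reference, a typical multi-LoRA TGI invocation with local weights looks roughly like the sketch below. All paths, adapter names, and the shard count are placeholders; check `text-generation-launcher --help` for the exact `--lora-adapters` syntax supported by your TGI version.

```shell
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v /path/to/data:/data \
  ghcr.io/huggingface/text-generation-inference:3.0.1 \
  --model-id /data/starcoder2-15b-instruct-v0.1 \
  --lora-adapters adapter1=/data/lora/adapter1,adapter2=/data/lora/adapter2 \
  --num-shard 4
```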
Requirement:
Base Model: bigcode/starcoder2-15b-instruct-v0.1
Custom LoRA adapters.
AWS g5.12xlarge instance.
On running the above docker command, we are getting the below error:
Also, one observation: in the below file we were able to see the Starcoder2-15b-instruct model mentioned, and it was our understanding that the model is supported for the multi-LoRA functionality in TGI.
https://github.com/huggingface/text-generation-inference/blob/main/server/text_generation_server/models/__init__.py
Please let us know if there are any gaps in our understanding.
If the Starcoder2-15B model is supported, could you help resolve the above issue?
Thanks.
Expected behavior
The model should be deployed along with the multi-lora TGI functionality.