
[FT] Custom model to TransformersModel #489

Open
Giuseppe5 opened this issue Jan 7, 2025 · 3 comments
Labels
feature request New feature/request

Comments

@Giuseppe5

Giuseppe5 commented Jan 7, 2025

Hello everyone,

First of all, thanks for the amazing work.

Issue encountered

I have been trying to use lighteval, but I'm facing an issue.

From my current understanding, it is only possible to pass a pretrained string to TransformersModelConfig, which means the model has to already be present on the Hub and can't be modified in any way before being used for eval.

I tried passing the model directly to Pipeline, but I get the following error:

AttributeError: 'TransformersModel' object has no attribute 'generation_config_dict'

Solution/Feature

Instead of passing a pretrained string to TransformersModelConfig, I was wondering if it's possible to pass a torch.nn.Module and use that for evaluation purposes.
The idea is to pass lighteval a transformed HF model (e.g., after applying quantization through third-party libraries).

Please let me know if I'm using the library wrong or misunderstanding something.

Many thanks,
Giuseppe

@Giuseppe5 added the feature request label Jan 7, 2025
@clefourrier
Member

Hi! Thanks for the kind words!

The easy way is simply to upload your quantized model to the Hub, or save it locally and load it from there.
You can also use lighteval programmatically (https://huggingface.co/docs/lighteval/using-the-python-api) and load a model directly in Python - you might need to inherit the LightevalModel abstract class :)
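For the first route, a minimal sketch of the "save locally, then evaluate" workflow (it uses a tiny randomly initialized GPT-2 so it runs without downloads — substitute your own transformed model; the local path is a placeholder):

```python
# Sketch: apply transformations to a model, save it locally, then point
# lighteval at the local path as if it were a Hub checkpoint.
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random model as a stand-in for a real checkpoint (no download needed).
config = GPT2Config(n_layer=1, n_embd=32, n_head=2, vocab_size=100)
model = GPT2LMHeadModel(config)

# ... apply quantization / other transformations here ...

# Save to disk (in practice, also save the tokenizer alongside it):
model.save_pretrained("./my-transformed-model")

# Then pass the local path as the pretrained string, e.g.
#   TransformersModelConfig(pretrained="./my-transformed-model", ...)
```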

@Giuseppe5
Author

Yes, I have been playing with it through the Python API, passing the model to Pipeline, but I get the error below:

AttributeError: 'TransformersModel' object has no attribute 'generation_config_dict'

This also happens when loading a model from HF and then passing it to the pipeline (no quantization or any other modification to a plain AutoModelForCausalLM).

@Giuseppe5
Author

After some experimenting, the issue seems to be due to a mismatch in how TransformersModel gets initialized when calling __init__ compared to what happens with from_model.
A few configuration attributes are missing in from_model, which cause the error, and there are some numerical discrepancies once that error is fixed.

Also, unrelated to this, I noticed some smaller issues around hyper-parameters like override_batch_size. If I just remove it, it defaults to None, and then there's an error when checking if override_batch_size > 0.
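To make that failure mode concrete, here is a stand-alone sketch of the comparison that breaks (plain Python, no lighteval needed) and the kind of guard that would avoid it:

```python
# When override_batch_size is omitted it ends up as None, and Python 3
# refuses to order None against an int:
override_batch_size = None
try:
    if override_batch_size > 0:
        pass
except TypeError as e:
    print(f"raises: {e}")

# A defensive check sidesteps the error:
if override_batch_size is not None and override_batch_size > 0:
    pass
```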

A similar issue seems to exist with the accelerator parameter. Relatedly, in the API example above, the accelerator variable is created but never used.

I can post a script to reproduce, but I basically copy-pasted the API tutorial you linked with some minor modifications here and there.
