
[FT] Custom model to TransformersModel #489

Open
Giuseppe5 opened this issue Jan 7, 2025 · 3 comments
Labels
feature request New feature/request

Comments

@Giuseppe5

Giuseppe5 commented Jan 7, 2025

Hello everyone,

First of all, thanks for the amazing work.

Issue encountered

I have been trying to use lighteval, but I'm facing an issue.

From my current understanding, it is only possible to pass a pretrained string to TransformersModelConfig, which means the model has to already be present on the Hub and can't be modified in any way before being used for eval.

I tried passing the model directly to Pipeline, but I get the following error:

AttributeError: 'TransformersModel' object has no attribute 'generation_config_dict'

Solution/Feature

Instead of passing a pretrained string to TransformersModelConfig, I was wondering if it's possible to pass a torch.nn.Module and use that for evaluation purposes.
The idea is to pass lighteval a transformed HF model (e.g., after applying quantization through third-party libraries).

Please let me know if I'm using the library wrong or misunderstanding something.

Many thanks,
Giuseppe

@Giuseppe5 added the feature request label Jan 7, 2025
@clefourrier
Member

Hi! Thanks for the kind words!

The easy way is simply to upload your quantized model to the Hub, or save it locally and load it from there.
You can also use lighteval programmatically (https://huggingface.co/docs/lighteval/using-the-python-api) and load a model directly in Python - you might need to inherit the LightevalModel abstract class :)
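For the first route, a minimal sketch of the "save locally, then evaluate" workflow (it uses a tiny randomly initialized GPT-2 so it runs without downloads — substitute your own transformed model; the local path is a placeholder):

```python
# Sketch: apply transformations to a model, save it locally, then point
# lighteval at the local path as if it were a Hub checkpoint.
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random model as a stand-in for a real checkpoint (no download needed).
config = GPT2Config(n_layer=1, n_embd=32, n_head=2, vocab_size=100)
model = GPT2LMHeadModel(config)

# ... apply quantization / other transformations here ...

# Save to disk (in practice, also save the tokenizer alongside it):
model.save_pretrained("./my-transformed-model")

# Then pass the local path as the pretrained string, e.g.
#   TransformersModelConfig(pretrained="./my-transformed-model", ...)
```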

@Giuseppe5
Author

Yes, I have been playing with it through the Python API, passing the model to Pipeline, but I get the error below:

AttributeError: 'TransformersModel' object has no attribute 'generation_config_dict'

This also happens when loading a model from HF and then passing it to the pipeline (no quantization or any other modification to a plain AutoModelForCausalLM).

@Giuseppe5
Author

After some experimenting, the issue seems to be due to a mismatch in how TransformersModel gets initialized when calling __init__ compared to what happens with from_model.
A few configuration attributes are missing in from_model, which cause the error, and there are some numerical discrepancies once that error is fixed.

Also, unrelated to this, I noticed some smaller issues around hyper-parameters like override_batch_size. If I just remove it, it defaults to None, and then there's an error when checking if override_batch_size > 0.
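To make that failure mode concrete, here is a stand-alone sketch of the comparison that breaks (plain Python, no lighteval needed) and the kind of guard that would avoid it:

```python
# When override_batch_size is omitted it ends up as None, and Python 3
# refuses to order None against an int:
override_batch_size = None
try:
    if override_batch_size > 0:
        pass
except TypeError as e:
    print(f"raises: {e}")

# A defensive check sidesteps the error:
if override_batch_size is not None and override_batch_size > 0:
    pass
```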

A similar issue seems to exist with the accelerator parameter. Relatedly, in the API example above, the accelerator variable is created but never used.

I can post a script to reproduce, but I basically copy-pasted the API tutorial you linked with some minor modifications here and there.
