Saving/Loading qlora adapters #278

Open
chrisi2045 opened this issue Nov 28, 2023 · 1 comment
@chrisi2045

Hello, my situation is as follows:

I implemented a QLoRA adapter to use with LLMs (currently bloom-560m). It works fine so far: after fine-tuning I get over 90% accuracy on my task. However, after saving and loading the adapter, the accuracy drops to slightly above chance.

I assume it is connected to the following behaviour, but I have no idea why it happens:
When the adapter is created during fine-tuning, model.print_trainable_parameters() prints:
trainable params: 788,480 || all params: 560,005,120 || trainable%: 0.1407987126974839
After saving and loading the adapter, the same call reports:
trainable params: 2,048 || all params: 560,005,120 || trainable%: 0.00036571094207138677
If I save the loaded adapter again (without further fine-tuning), it stays at this level after loading it once more.
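
If I understand the PEFT API correctly, part of the parameter-count change may simply be that PeftModel.from_pretrained freezes the LoRA weights by default (is_trainable defaults to False), so presumably only the classification head stays trainable. A minimal sketch of the two load modes (base_model stands for the quantized base model prepared as in the code below, and the path is the one I use below); whether this also explains the accuracy drop I don't know:

from peft import PeftModel

# Load for inference: LoRA weights are frozen (is_trainable defaults to False),
# so print_trainable_parameters() reports far fewer trainable parameters.
model = PeftModel.from_pretrained(base_model, "adapters/lora_adapte_v2")

# Load for continued training: keeps the LoRA weights trainable.
model = PeftModel.from_pretrained(base_model, "adapters/lora_adapte_v2", is_trainable=True)
model.print_trainable_parameters()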

Many thanks in advance,
Christian

Here is my code for creating the adapter...

# imports (n_labels and device are defined elsewhere in my script)
import torch
from transformers import (AutoConfig, AutoTokenizer,
                          AutoModelForSequenceClassification, BitsAndBytesConfig)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name_or_path = 'bigscience/bloom-560m'
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Get model configuration.
model_config = AutoConfig.from_pretrained(pretrained_model_name_or_path=model_name_or_path, num_labels=n_labels)

# Get model's tokenizer.
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name_or_path)
# default to left padding
tokenizer.padding_side = "left"
# Use the EOS token as the PAD token
tokenizer.pad_token = tokenizer.eos_token

# Get the actual model.
model = AutoModelForSequenceClassification.from_pretrained(
    pretrained_model_name_or_path=model_name_or_path,
    config=model_config, 
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    load_in_4bit=True,  # also implied by bnb_config
    quantization_config=bnb_config
)

model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

#PEFT
print("loading to LoRA")
peft_config = LoraConfig(task_type="SEQ_CLS", inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()


# resize model embedding to match new tokenizer and fix model padding token id
model.resize_token_embeddings(len(tokenizer))
model.config.pad_token_id = model.config.eos_token_id
model.to(device)

...and saving it.

global model
model.save_pretrained("adapters/lora_adapte_v2")
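
As far as I can tell, save_pretrained on the PEFT-wrapped model should only write the adapter weights and config, so a quick sanity check of the output folder (using the path above) looks like this:

import os

# Should list something like adapter_config.json and adapter_model.bin / adapter_model.safetensors
print(os.listdir("adapters/lora_adapte_v2"))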

And here is the code for loading it again:

model_name_or_path = 'bigscience/bloom-560m'
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

model_config = AutoConfig.from_pretrained(pretrained_model_name_or_path=model_name_or_path, num_labels=n_labels)
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name_or_path)
# default to left padding
tokenizer.padding_side = "left"
# Use the EOS token as the PAD token
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForSequenceClassification.from_pretrained(
    pretrained_model_name_or_path=model_name_or_path,
    config=model_config, 
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    load_in_4bit=True,  # also implied by bnb_config
    quantization_config=bnb_config
)
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

adapter = load_lora_name  # path of the saved adapter, e.g. "adapters/lora_adapte_v2"
model = PeftModel.from_pretrained(model, adapter)

model.print_trainable_parameters()
    
model.resize_token_embeddings(len(tokenizer))
model.config.pad_token_id = model.config.eos_token_id

model.to(device)
print('Model loaded to `%s`'%device)
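
One way to narrow this down would be to compare the LoRA tensors of the model before saving with those after reloading (a rough debugging sketch, not a fix; fine_tuned_model and reloaded_model are placeholders for the model object before save_pretrained and after PeftModel.from_pretrained):

import torch

# Collect the LoRA tensors from both models; if anything is missing or differs,
# the adapter is not round-tripping through save_pretrained / from_pretrained.
trained = {k: v.detach().float().cpu() for k, v in fine_tuned_model.state_dict().items() if "lora_" in k}
loaded = {k: v.detach().float().cpu() for k, v in reloaded_model.state_dict().items() if "lora_" in k}

for name, tensor in trained.items():
    if name not in loaded:
        print(f"missing after reload: {name}")
    elif not torch.allclose(tensor, loaded[name]):
        print(f"mismatch: {name}")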
@enricoliscio

Same problem here! Did you find a solution?
