Hello, my situation is as follows:
I implemented a QLoRA adapter to use with LLMs (currently bloom-560m). It works fine so far: after fine-tuning I get over 90% accuracy on my task. However, after saving and loading the adapter, the accuracy drops to slightly above guessing.
I assume this is connected to the following behaviour, but I am clueless as to why it happens:
At creation of the adapter during fine-tuning, model.print_trainable_parameters() prints: trainable params: 788,480 || all params: 560,005,120 || trainable%: 0.1407987126974839.
After saving and loading the adapter, the same function reports: trainable params: 2,048 || all params: 560,005,120 || trainable%: 0.00036571094207138677.
If I save the loaded adapter again (without further fine-tuning), it stays at this level after loading it again.
Many thanks in advance,
Christian
Here is my code for creating the adapter...
import torch
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name_or_path = 'bigscience/bloom-560m'
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Get model configuration.
model_config = AutoConfig.from_pretrained(
    pretrained_model_name_or_path=model_name_or_path, num_labels=n_labels
)

# Get model's tokenizer.
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name_or_path)

# Default to left padding.
tokenizer.padding_side = "left"
# Define PAD token = EOS token.
tokenizer.pad_token = tokenizer.eos_token

# Get the actual model. (quantization_config already requests 4-bit loading,
# so a separate load_in_4bit argument is redundant.)
model = AutoModelForSequenceClassification.from_pretrained(
    pretrained_model_name_or_path=model_name_or_path,
    config=model_config,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    quantization_config=bnb_config
)
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

# PEFT
print("loading to LoRA")
peft_config = LoraConfig(task_type="SEQ_CLS", inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

# Resize model embedding to match new tokenizer and fix model padding token id.
model.resize_token_embeddings(len(tokenizer))
model.config.pad_token_id = model.config.eos_token_id
model.to(device)
...and saving it.
And here is the code for loading it again: