Merge branch 'main' of into pr-527
tomaarsen committed Sep 12, 2024
2 parents 74bfc12 + e78bfa1 commit 7fbd89c
Showing 21 changed files with 63 additions and 62 deletions.
10 changes: 9 additions & 1 deletion .github/workflows/tests.yml
@@ -32,12 +32,20 @@ jobs:
with:
python-version: ${{ matrix.python-version }}

# Taken from https://github.com/actions/cache?tab=readme-ov-file#creating-a-cache-key
# Use date to invalidate cache every week
- name: Get date
id: get-date
run: |
echo "date=$(/bin/date -u "+%G%V")" >> $GITHUB_OUTPUT
shell: bash

- name: Try to load cached dependencies
uses: actions/cache@v3
id: restore-cache
with:
path: ${{ env.pythonLocation }}
key: python-dependencies-${{ matrix.os }}-${{ matrix.python-version }}-${{ matrix.requirements }}-${{ hashFiles('setup.py') }}-${{ env.pythonLocation }}
key: python-dependencies-${{ matrix.os }}-${{ steps.get-date.outputs.date }}-${{ matrix.python-version }}-${{ matrix.requirements }}-${{ hashFiles('setup.py') }}-${{ env.pythonLocation }}

- name: Install external dependencies on cache miss
run: |
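The new cache key folds in a weekly date stamp produced by `date -u "+%G%V"` (the ISO week-numbering year followed by the two-digit ISO week number), so the dependency cache is reused within a week and rebuilt once the week rolls over. A minimal Python sketch of the same stamp, for illustration only and not part of the workflow:

```python
from datetime import datetime, timezone

# Reproduce what `date -u "+%G%V"` prints: the ISO week-numbering year
# followed by the two-digit ISO week number, evaluated in UTC.
# Every run within the same ISO week yields the same stamp, so a cache key
# that embeds it is shared for a week and then naturally invalidated.
iso_year, iso_week, _ = datetime.now(timezone.utc).isocalendar()
print(f"{iso_year}{iso_week:02d}")  # e.g. "202437" during ISO week 37 of 2024
```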
2 changes: 1 addition & 1 deletion README.md
@@ -69,7 +69,7 @@ model = SetFitModel.from_pretrained(
args = TrainingArguments(
batch_size=16,
num_epochs=4,
evaluation_strategy="epoch",
eval_strategy="epoch",
save_strategy="epoch",
load_best_model_at_end=True,
)
2 changes: 1 addition & 1 deletion docs/source/en/how_to/absa.mdx
@@ -87,7 +87,7 @@ args = TrainingArguments(
num_epochs=5,
use_amp=True,
batch_size=128,
evaluation_strategy="steps",
eval_strategy="steps",
eval_steps=50,
save_steps=50,
load_best_model_at_end=True,
12 changes: 6 additions & 6 deletions docs/source/en/how_to/v1.0.0_migration_guide.mdx
@@ -42,7 +42,7 @@ This list contains new functionality that can be used starting from v1.0.0.
* [`AbsaTrainer`] and [`AbsaModel`] have been introduced for applying [SetFit for Aspect Based Sentiment Analysis](absa).
* [`Trainer`] now supports a `callbacks` argument for a list of [`transformers` `TrainerCallback` instances](https://huggingface.co/docs/transformers/main/en/main_classes/callback).
* By default, all installed callbacks integrated with `transformers` are supported, including [`TensorBoardCallback`](https://huggingface.co/docs/transformers/main/en/main_classes/callback#transformers.integrations.TensorBoardCallback), [`WandbCallback`](https://huggingface.co/docs/transformers/main/en/main_classes/callback#transformers.integrations.WandbCallback) to log training logs to [TensorBoard](https://www.tensorflow.org/tensorboard) and [W&B](https://wandb.ai), respectively.
* The [`Trainer`] will now print `embedding_loss` in the terminal, as well as `eval_embedding_loss` if `evaluation_strategy` is set to `"epoch"` or `"steps"` in [`TrainingArguments`].
* The [`Trainer`] will now print `embedding_loss` in the terminal, as well as `eval_embedding_loss` if `eval_strategy` is set to `"epoch"` or `"steps"` in [`TrainingArguments`].
* [`Trainer.evaluate`] now works with string labels.
* An updated contrastive pair sampler increases the variety of training pairs.
* [`TrainingArguments`] supports various new arguments:
@@ -65,14 +65,14 @@ This list contains new functionality that can be used starting from v1.0.0.

* `logging_first_step`: Whether to log and evaluate the first `global_step` or not.
* `logging_steps`: Number of update steps between two logs if `logging_strategy="steps"`.
* `evaluation_strategy`: The evaluation strategy to adopt during training. Possible values are:
* `eval_strategy`: The evaluation strategy to adopt during training. Possible values are:

- `"no"`: No evaluation is done during training.
- `"steps"`: Evaluation is done (and logged) every `eval_steps`.
- `"epoch"`: Evaluation is done at the end of each epoch.

* `eval_steps`: Number of update steps between two evaluations if `evaluation_strategy="steps"`. Will default to the same as `logging_steps` if not set.
* `eval_delay`: Number of epochs or steps to wait for before the first evaluation can be performed, depending on the `evaluation_strategy`.
* `eval_steps`: Number of update steps between two evaluations if `eval_strategy="steps"`. Will default to the same as `logging_steps` if not set.
* `eval_delay`: Number of epochs or steps to wait for before the first evaluation can be performed, depending on the `eval_strategy`.
* `eval_max_steps`: If set to a positive number, the total number of evaluation steps to perform. The evaluation may stop before reaching the set number of steps when all data is exhausted.
* `save_strategy`: The checkpoint save strategy to adopt during training. Possible values are:

Expand All @@ -81,12 +81,12 @@ This list contains new functionality that can be used starting from v1.0.0.
- `"steps"`: Save is done every `save_steps`.

* `save_steps`: Number of updates steps before two checkpoint saves if `save_strategy="steps"`.
* `save_total_limit`: If a value is passed, will limit the total amount of checkpoints. Deletes the older checkpoints in `output_dir`. Note, the best model is always preserved if the `evaluation_strategy` is not `"no"`.
* `save_total_limit`: If a value is passed, will limit the total amount of checkpoints. Deletes the older checkpoints in `output_dir`. Note, the best model is always preserved if the `eval_strategy` is not `"no"`.
* `load_best_model_at_end`: Whether or not to load the best model found during training at the end of training.

<Tip>

When set to `True`, the parameters `save_strategy` needs to be the same as `evaluation_strategy`, and in
When set to `True`, the parameters `save_strategy` needs to be the same as `eval_strategy`, and in
the case it is "steps", `save_steps` must be a round multiple of `eval_steps`.

</Tip>
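As a concrete illustration of the constraint in the tip above: the combination below is valid because `save_strategy` matches `eval_strategy` and `save_steps` is a round multiple of `eval_steps`; a mismatched pair would raise a `ValueError`, as the test suite in this commit also exercises.

```python
from setfit import TrainingArguments

# Valid: both strategies are "steps" and 100 is a round multiple of 50.
args = TrainingArguments(
    eval_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=100,
    load_best_model_at_end=True,
)

# Invalid: mismatched strategies ("steps" vs. "epoch") raise a ValueError.
# TrainingArguments(load_best_model_at_end=True, eval_strategy="steps", save_strategy="epoch")
```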
2 changes: 1 addition & 1 deletion docs/source/en/installation.mdx
@@ -1,7 +1,7 @@

# Installation

Before you start, you'll need to setup your environment and install the appropriate packages. 🤗 SetFit is tested on **Python 3.7+**.
Before you start, you'll need to setup your environment and install the appropriate packages. 🤗 SetFit is tested on **Python 3.8+**.

## pip

2 changes: 1 addition & 1 deletion scripts/perfect/README.md
@@ -7,7 +7,7 @@ Follow the steps below to run the baselines based on the `PERFECT` paper: [_PERF
To get started, first create a Python virtual environment, e.g. with `conda`:

```
conda create -n baselines-perfect python=3.7 && conda activate baselines-perfect
conda create -n baselines-perfect python=3.10 && conda activate baselines-perfect
```

Next, clone [our fork](https://github.com/SetFit/perfect) of the [`PERFECT` codebase](https://github.com/facebookresearch/perfect), and install the required dependencies:
2 changes: 1 addition & 1 deletion scripts/setfit/distillation_baseline.py
@@ -82,7 +82,7 @@ def standard_model_distillation(self, train_raw_student, x_test, y_test, num_cla
per_device_train_batch_size=self.batch_size,
per_device_eval_batch_size=self.batch_size,
num_train_epochs=self.num_epochs,
evaluation_strategy="no",
eval_strategy="no",
save_strategy="no",
load_best_model_at_end=False,
weight_decay=0.01,
6 changes: 3 additions & 3 deletions scripts/setfit/run_fewshot.py
@@ -59,7 +59,7 @@ def parse_args():
parser.add_argument("--override_results", default=False, action="store_true")
parser.add_argument("--keep_body_frozen", default=False, action="store_true")
parser.add_argument("--add_data_augmentation", default=False)
parser.add_argument("--evaluation_strategy", default=False)
parser.add_argument("--eval_strategy", default=False)

args = parser.parse_args()

@@ -149,8 +149,8 @@ def main():
num_epochs=args.num_epochs,
num_iterations=args.num_iterations,
)
if not args.evaluation_strategy:
trainer.args.evaluation_strategy = "no"
if not args.eval_strategy:
trainer.args.eval_strategy = "no"
if args.classifier == "pytorch":
trainer.freeze()
trainer.train()
2 changes: 1 addition & 1 deletion scripts/tfew/README.md
@@ -7,7 +7,7 @@ These scripts run the baselines based on the `T-Few` paper: [_Few-Shot Parameter
To run the scripts, first create a Python virtual environment, e.g. with `conda`:

```
conda create -n baselines-tfew python=3.7 && conda activate baselines-tfew
conda create -n baselines-tfew python=3.10 && conda activate baselines-tfew
```

Next, clone our `T-Few` fork, and install the required dependencies:
2 changes: 1 addition & 1 deletion scripts/transformers/run_fewshot.py
@@ -94,7 +94,7 @@ def compute_metrics(pred):
per_device_train_batch_size=batch_size,
per_device_eval_batch_size=batch_size,
weight_decay=0.01,
evaluation_strategy="epoch",
eval_strategy="epoch",
logging_steps=100,
save_strategy="no",
fp16=True,
2 changes: 1 addition & 1 deletion scripts/transformers/run_fewshot_multilingual.py
@@ -119,7 +119,7 @@ def compute_metrics(pred):
per_device_train_batch_size=batch_size,
per_device_eval_batch_size=batch_size,
weight_decay=0.01,
evaluation_strategy="epoch",
eval_strategy="epoch",
logging_steps=100,
save_strategy="no",
fp16=True,
2 changes: 1 addition & 1 deletion scripts/transformers/run_full.py
@@ -85,7 +85,7 @@ def compute_metrics(pred):
per_device_train_batch_size=batch_size,
per_device_eval_batch_size=batch_size,
weight_decay=0.001,
evaluation_strategy="epoch",
eval_strategy="epoch",
logging_steps=100,
metric_for_best_model=metric,
load_best_model_at_end=True,
2 changes: 1 addition & 1 deletion scripts/transformers/run_full_multilingual.py
@@ -104,7 +104,7 @@ def compute_metrics(pred):
per_device_train_batch_size=batch_size,
per_device_eval_batch_size=batch_size,
weight_decay=0.01,
evaluation_strategy="epoch",
eval_strategy="epoch",
logging_steps=100,
metric_for_best_model="eval_loss",
load_best_model_at_end=True,
2 changes: 1 addition & 1 deletion src/setfit/model_card.py
@@ -80,7 +80,7 @@ def on_train_begin(
"logging_strategy",
"logging_first_step",
"logging_steps",
"evaluation_strategy",
"eval_strategy",
"eval_steps",
"eval_delay",
"save_strategy",
9 changes: 1 addition & 8 deletions src/setfit/modeling.py
@@ -4,14 +4,7 @@
import warnings
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional, Set, Tuple, Union


# For Python 3.7 compatibility
try:
from typing import Literal
except ImportError:
from typing_extensions import Literal
from typing import Dict, List, Literal, Optional, Set, Tuple, Union

import joblib
import numpy as np
11 changes: 2 additions & 9 deletions src/setfit/trainer.py
@@ -4,7 +4,7 @@
import time
import warnings
from pathlib import Path
from typing import TYPE_CHECKING, Any, Callable, Dict, Iterable, List, Optional, Tuple, Union
from typing import TYPE_CHECKING, Any, Callable, Dict, Iterable, List, Literal, Optional, Tuple, Union

import evaluate
import torch
@@ -48,13 +48,6 @@
from .utils import BestRun, default_hp_space_optuna


# For Python 3.7 compatibility
try:
from typing import Literal
except ImportError:
from typing_extensions import Literal


if TYPE_CHECKING:
import optuna

@@ -443,7 +436,7 @@ def train_embeddings(
train_dataloader, loss_func, batch_size, num_unique_pairs = self.get_dataloader(
x_train, y_train, args=args, max_pairs=train_max_pairs
)
if x_eval is not None and args.evaluation_strategy != IntervalStrategy.NO:
if x_eval is not None and args.eval_strategy != IntervalStrategy.NO:
eval_max_pairs = -1 if args.eval_max_steps == -1 else args.eval_max_steps * args.embedding_batch_size
eval_dataloader, _, _, _ = self.get_dataloader(x_eval, y_eval, args=args, max_pairs=eval_max_pairs)
else:
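For context, `IntervalStrategy` is the string-backed enum from `transformers.trainer_utils`, which is why the check above can compare the coerced `eval_strategy` value against `IntervalStrategy.NO` and why plain strings like `"steps"` are accepted elsewhere. A small illustrative check, assuming a working `transformers` install:

```python
from transformers.trainer_utils import IntervalStrategy

# IntervalStrategy subclasses str, so members compare equal to their plain
# string values and strings can be coerced into members.
assert IntervalStrategy("steps") is IntervalStrategy.STEPS
assert IntervalStrategy.NO == "no"
print([s.value for s in IntervalStrategy])  # ['no', 'steps', 'epoch']
```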
37 changes: 22 additions & 15 deletions src/setfit/training_args.py
@@ -124,19 +124,19 @@ class TrainingArguments:
Whether to log and evaluate the first `global_step` or not.
logging_steps (`int`, defaults to 50):
Number of update steps between two logs if `logging_strategy="steps"`.
evaluation_strategy (`str` or [`~transformers.trainer_utils.IntervalStrategy`], *optional*, defaults to `"no"`):
eval_strategy (`str` or [`~transformers.trainer_utils.IntervalStrategy`], *optional*, defaults to `"no"`):
The evaluation strategy to adopt during training. Possible values are:
- `"no"`: No evaluation is done during training.
- `"steps"`: Evaluation is done (and logged) every `eval_steps`.
- `"epoch"`: Evaluation is done at the end of each epoch.
eval_steps (`int`, *optional*):
Number of update steps between two evaluations if `evaluation_strategy="steps"`. Will default to the same
Number of update steps between two evaluations if `eval_strategy="steps"`. Will default to the same
value as `logging_steps` if not set.
eval_delay (`float`, *optional*):
Number of epochs or steps to wait for before the first evaluation can be performed, depending on the
evaluation_strategy.
eval_strategy.
eval_max_steps (`int`, defaults to `-1`):
If set to a positive number, the total number of evaluation steps to perform. The evaluation may stop
before reaching the set number of steps when all data is exhausted.
@@ -151,13 +151,13 @@ class TrainingArguments:
Number of updates steps before two checkpoint saves if `save_strategy="steps"`.
save_total_limit (`int`, *optional*, defaults to `1`):
If a value is passed, will limit the total amount of checkpoints. Deletes the older checkpoints in
`output_dir`. Note, the best model is always preserved if the `evaluation_strategy` is not `"no"`.
`output_dir`. Note, the best model is always preserved if the `eval_strategy` is not `"no"`.
load_best_model_at_end (`bool`, *optional*, defaults to `False`):
Whether or not to load the best model found during training at the end of training.
<Tip>
When set to `True`, the parameters `save_strategy` needs to be the same as `evaluation_strategy`, and in
When set to `True`, the parameters `save_strategy` needs to be the same as `eval_strategy`, and in
the case it is "steps", `save_steps` must be a round multiple of `eval_steps`.
</Tip>
@@ -208,7 +208,8 @@ class TrainingArguments:
logging_first_step: bool = True
logging_steps: int = 50

evaluation_strategy: str = "no"
eval_strategy: str = "no"
evaluation_strategy: str = field(default="no", repr=False, init=False) # Softly deprecated
eval_steps: Optional[int] = None
eval_delay: int = 0
eval_max_steps: int = -1
@@ -251,30 +252,36 @@ def __post_init__(self) -> None:
self.logging_dir = default_logdir()

self.logging_strategy = IntervalStrategy(self.logging_strategy)
self.evaluation_strategy = IntervalStrategy(self.evaluation_strategy)
if self.evaluation_strategy and not self.eval_strategy:
logger.warning(
"The `evaluation_strategy` argument is deprecated and will be removed in a future version. "
"Please use `eval_strategy` instead."
)
self.eval_strategy = self.evaluation_strategy
self.eval_strategy = IntervalStrategy(self.eval_strategy)

if self.eval_steps is not None and self.evaluation_strategy == IntervalStrategy.NO:
logger.info('Using `evaluation_strategy="steps"` as `eval_steps` is defined.')
self.evaluation_strategy = IntervalStrategy.STEPS
if self.eval_steps is not None and self.eval_strategy == IntervalStrategy.NO:
logger.info('Using `eval_strategy="steps"` as `eval_steps` is defined.')
self.eval_strategy = IntervalStrategy.STEPS

# eval_steps has to be defined and non-zero, fallbacks to logging_steps if the latter is non-zero
if self.evaluation_strategy == IntervalStrategy.STEPS and (self.eval_steps is None or self.eval_steps == 0):
if self.eval_strategy == IntervalStrategy.STEPS and (self.eval_steps is None or self.eval_steps == 0):
if self.logging_steps > 0:
self.eval_steps = self.logging_steps
else:
raise ValueError(
f"evaluation strategy {self.evaluation_strategy} requires either non-zero `eval_steps` or"
f"evaluation strategy {self.eval_strategy} requires either non-zero `eval_steps` or"
" `logging_steps`"
)

# Sanity checks for load_best_model_at_end: we require save and eval strategies to be compatible.
if self.load_best_model_at_end:
if self.evaluation_strategy != self.save_strategy:
if self.eval_strategy != self.save_strategy:
raise ValueError(
"`load_best_model_at_end` requires the save and eval strategy to match, but found\n- Evaluation "
f"strategy: {self.evaluation_strategy}\n- Save strategy: {self.save_strategy}"
f"strategy: {self.eval_strategy}\n- Save strategy: {self.save_strategy}"
)
if self.evaluation_strategy == IntervalStrategy.STEPS and self.save_steps % self.eval_steps != 0:
if self.eval_strategy == IntervalStrategy.STEPS and self.save_steps % self.eval_steps != 0:
raise ValueError(
"`load_best_model_at_end` requires the saving steps to be a round multiple of the evaluation "
f"steps, but found {self.save_steps}, which is not a round multiple of {self.eval_steps}."
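The `__post_init__` changes above keep `evaluation_strategy` as a softly deprecated alias that is forwarded to `eval_strategy` with a warning. A generic, self-contained sketch of that kind of argument-rename shim (not SetFit's exact implementation, which hides the old field from `__init__`):

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleArguments:
    """Toy stand-in for a training-arguments dataclass during a rename."""

    eval_strategy: str = "no"                   # new, canonical name
    evaluation_strategy: Optional[str] = None   # deprecated alias; None means "not passed"

    def __post_init__(self) -> None:
        if self.evaluation_strategy is not None:
            warnings.warn(
                "`evaluation_strategy` is deprecated and will be removed in a "
                "future version. Please use `eval_strategy` instead.",
                FutureWarning,
            )
            # Forward the old value so downstream code only reads the new name.
            self.eval_strategy = self.evaluation_strategy


args = ExampleArguments(evaluation_strategy="steps")  # warns, then behaves like eval_strategy="steps"
assert args.eval_strategy == "steps"
```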
2 changes: 1 addition & 1 deletion tests/span/test_model_card.py
@@ -25,7 +25,7 @@ def test_model_card(absa_dataset: Dataset, tmp_path: Path) -> None:
eval_steps=1,
logging_steps=1,
max_steps=2,
evaluation_strategy="steps",
eval_strategy="steps",
)
trainer = AbsaTrainer(
model=model,
2 changes: 1 addition & 1 deletion tests/test_model_card.py
@@ -35,7 +35,7 @@ def test_model_card(tmp_path: Path) -> None:
eval_steps=1,
logging_steps=1,
max_steps=2,
evaluation_strategy="steps",
eval_strategy="steps",
)
trainer = Trainer(
model=model,
2 changes: 1 addition & 1 deletion tests/test_trainer.py
@@ -590,7 +590,7 @@ def test_train_load_best(model: SetFitModel, tmp_path: Path, caplog: LogCaptureF
output_dir=tmp_path,
save_steps=5,
eval_steps=5,
evaluation_strategy="steps",
eval_strategy="steps",
load_best_model_at_end=True,
num_epochs=5,
)
12 changes: 6 additions & 6 deletions tests/test_training_args.py
@@ -72,29 +72,29 @@ def test_report_to(self):

def test_eval_steps_without_eval_strat(self):
args = TrainingArguments(eval_steps=5)
self.assertEqual(args.evaluation_strategy, IntervalStrategy.STEPS)
self.assertEqual(args.eval_strategy, IntervalStrategy.STEPS)

def test_eval_strat_steps_without_eval_steps(self):
args = TrainingArguments(evaluation_strategy="steps")
args = TrainingArguments(eval_strategy="steps")
self.assertEqual(args.eval_steps, args.logging_steps)
with self.assertRaises(ValueError):
TrainingArguments(evaluation_strategy="steps", logging_steps=0, logging_strategy="no")
TrainingArguments(eval_strategy="steps", logging_steps=0, logging_strategy="no")

def test_load_best_model(self):
with self.assertRaises(ValueError):
TrainingArguments(load_best_model_at_end=True, evaluation_strategy="steps", save_strategy="epoch")
TrainingArguments(load_best_model_at_end=True, eval_strategy="steps", save_strategy="epoch")
with self.assertRaises(ValueError):
TrainingArguments(
load_best_model_at_end=True,
evaluation_strategy="steps",
eval_strategy="steps",
save_strategy="steps",
eval_steps=100,
save_steps=50,
)
# No error: save_steps is a round multiple of eval_steps
TrainingArguments(
load_best_model_at_end=True,
evaluation_strategy="steps",
eval_strategy="steps",
save_strategy="steps",
eval_steps=50,
save_steps=100,
