Trainer exits with PermissionError if cwd is not writeable #562

Open
bluestealth opened this issue Oct 2, 2024 · 0 comments

v1.1.0

This is similar to #559. I am running setfit in a container, and the exec starts in a location that is not writeable by the current user. This results in a PermissionError at runtime.

I am able to replicate this locally using the example, even if output_dir is set in TrainingArguments, by chowning the directory the script is executed from to another user.

bad_dir % python3 example.py
Using the latest cached version of the dataset since sst2 couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'default' at /Users/bluestealth/.cache/huggingface/datasets/sst2/default/0.0.0/8d51e7e4887a4caaa95b3fbebbf53c0490b58bbb (last modified on Tue Oct  1 18:57:42 2024).
/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:1617: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be deprecated in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
model_head.pkl not found on HuggingFace Hub, initialising classification head with random weights. You should TRAIN this model on a downstream task to use it for predictions and inference.
Applying column mapping to the training dataset
Applying column mapping to the evaluation dataset
Traceback (most recent call last):
  File "/Users/bluestealth/testing-setfit/bad_dir/example.py", line 27, in <module>
    trainer = Trainer(
              ^^^^^^^^
  File "/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/setfit/trainer.py", line 328, in __init__
    self.st_trainer = BCSentenceTransformersTrainer(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/setfit/trainer.py", line 48, in __init__
    super().__init__(model=setfit_model.model_body, **kwargs)
  File "/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/sentence_transformers/trainer.py", line 201, in __init__
    super().__init__(
  File "/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/transformers/trainer.py", line 611, in __init__
    os.makedirs(self.args.output_dir, exist_ok=True)
  File "<frozen os>", line 225, in makedirs
PermissionError: [Errno 13] Permission denied: 'tmp_trainer'

This happens because super().__init__() is called before the passed-in arguments are set.
Since no TrainingArguments are passed in at that point, output_dir defaults to "tmp_trainer" in the sentence transformers trainer. Then, when the sentence transformers trainer calls super().__init__(), the transformers trainer tries to create output_dir relative to the current working directory, causing the error above.
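
To make the call order easier to see, here is a minimal sketch using only the standard library. BaseTrainer and WrapperTrainer are hypothetical stand-ins I made up for illustration, not the actual setfit, sentence transformers, or transformers classes. The only thing it is meant to show is that the default "tmp_trainer" directory gets created under the cwd before the caller's output_dir is applied.

```python
import os
import tempfile


class BaseTrainer:
    """Hypothetical stand-in for a base trainer that creates output_dir eagerly."""

    def __init__(self, output_dir="tmp_trainer"):
        self.output_dir = output_dir
        # A relative path is resolved against the current working directory,
        # so this raises PermissionError when the cwd is not writeable.
        os.makedirs(self.output_dir, exist_ok=True)


class WrapperTrainer(BaseTrainer):
    """Hypothetical stand-in for a wrapper that applies user arguments too late."""

    def __init__(self, output_dir=None):
        # The caller's output_dir is not forwarded, so the default is used here.
        super().__init__()
        if output_dir is not None:
            # Applied only after the default directory was already created.
            self.output_dir = output_dir


def demo():
    """Return (default_dir_created_in_cwd, user_output_dir_applied)."""
    previous_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as workdir, tempfile.TemporaryDirectory() as wanted:
        os.chdir(workdir)
        try:
            trainer = WrapperTrainer(output_dir=wanted)
            created = os.path.isdir(os.path.join(workdir, "tmp_trainer"))
            return created, trainer.output_dir == wanted
        finally:
            # Leave the temporary directory before it is cleaned up.
            os.chdir(previous_cwd)


if __name__ == "__main__":
    print(demo())
```

In this sketch the cwd is a writeable temporary directory, so the makedirs call succeeds and "tmp_trainer" shows up there even though a different output_dir was requested. With a cwd the current user cannot write to, the same makedirs call is where the PermissionError would come from.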
