[CI] [GHA] HuggingFace cache #28481
Open: akashchi wants to merge 24 commits into openvinotoolkit:master from akashchi:ci/gha/hf-cache-test
+60
−25
…nto ci/gha/hf-cache-test
akashchi
commented
Feb 5, 2025
Comment on lines 27 to 33
os.environ['HF_HUB_CACHE'] = hf_cache_dir

no_clean_cache_dir = False
hf_hub_cache_dir = tempfile.gettempdir()
hf_hub_cache_dir = hf_cache_dir
if os.environ.get('USE_SYSTEM_CACHE', 'True') == 'False':
    no_clean_cache_dir = True
os.environ['HUGGINGFACE_HUB_CACHE'] = hf_hub_cache_dir
I was not sure how it was supposed to work in the first place:
- There are two env variables for HF cache: HF_HUB_CACHE & HUGGINGFACE_HUB_CACHE, the latter is deprecated but maybe needed for backwards compatibility
- HF_HUB_CACHE is taken from the environment and if not present -> a temp directory is used instead
- HUGGINGFACE_HUB_CACHE was always set to a created temporary directory, w/o even looking for it in the env. What if we want to use a remote cache like in CI?
- The cleanup is controlled by another env variable USE_SYSTEM_CACHE but only for a deprecated HUGGINGFACE_HUB_CACHE

Via the changes in this PR, I set HF_HUB_CACHE as a single source of truth but I think it could and should be simplified further. Is HUGGINGFACE_HUB_CACHE even needed? I think it could be done like:
- Get only HF_HUB_CACHE from the env:
  - If present, just use the value
  - If not present -> set it to the temp directory
- Drop HUGGINGFACE_HUB_CACHE / set it to HF_HUB_CACHE
- Rename USE_SYSTEM_CACHE into something like CLEAN_HF_CACHE / KEEP_HF_CACHE / ...
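
The simplification proposed above could look roughly like the sketch below. It is an illustration only, not the code in this PR: the helper name resolve_hf_cache is hypothetical, KEEP_HF_CACHE is just one of the candidate names from the list, and its default value and 'True'/'False' string convention are assumptions borrowed from how USE_SYSTEM_CACHE is read today.

```python
import os
import tempfile


def resolve_hf_cache(env=None):
    """Resolve the HF cache directory with HF_HUB_CACHE as the single source of truth.

    `env` is any mutable mapping of environment variables; it defaults to
    os.environ. The function name is hypothetical and used for illustration.
    """
    env = os.environ if env is None else env

    # Use HF_HUB_CACHE if it is set (e.g. a shared cache mounted in CI);
    # otherwise fall back to the system temp directory.
    # An empty value is treated the same as "not set".
    cache_dir = env.get("HF_HUB_CACHE") or tempfile.gettempdir()
    env["HF_HUB_CACHE"] = cache_dir

    # Mirror the value into the deprecated variable for backwards
    # compatibility, instead of giving it a separate directory.
    env["HUGGINGFACE_HUB_CACHE"] = cache_dir

    # Assumption: KEEP_HF_CACHE replaces USE_SYSTEM_CACHE and defaults to
    # keeping the cache; cleanup happens only when it is explicitly 'False'.
    keep_cache = env.get("KEEP_HF_CACHE", "True") == "True"
    return cache_dir, keep_cache
```

With this shape, a CI job would only need to export HF_HUB_CACHE to point at the shared cache, and local runs without the variable would fall back to the temp directory.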
Labels
category: CI
OpenVINO public CI
category: JAX FE
OpenVINO JAX FrontEnd
category: PyTorch FE
OpenVINO PyTorch Frontend
category: TF FE
OpenVINO TensorFlow FrontEnd
github_actions
Pull requests that update GitHub Actions code
The new HuggingFace share was added to the self-hosted runners at the path mount/caches/huggingface. This PR adds this share to the workflows that use HF data.
Tickets: