Option to only use exact cache hit #177

Open
Nemo157 opened this issue Nov 9, 2023 · 3 comments

Comments

@Nemo157

Nemo157 commented Nov 9, 2023

On docs.rs we've recently run into disk space issues in CI. While debugging it, I noticed that we had a very large cache (3 GB compressed); looking at the list of files, there were many duplicate versions of crates in it. I believe the main cause was having multiple large sets of dependency updates merged within a week, so the previous versions were not being cleaned when restoring a partially-matched cache. After setting an explicit prefix-key to purge the cache, it dropped down to just 1.2 GB.
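
For readers unfamiliar with that kind of purge: it amounts to changing the key prefix so that no existing cache matches any more. A minimal sketch, assuming the Swatinem/rust-cache@v2 action and its prefix-key input; the value shown is an arbitrary example, not the one docs.rs actually used:

```yaml
# Sketch only: assumes Swatinem/rust-cache@v2 and its `prefix-key` input.
steps:
  - uses: actions/checkout@v4
  - uses: Swatinem/rust-cache@v2
    with:
      # Any value different from the previous prefix starts a fresh cache;
      # "v1-rust" is a hypothetical example value.
      prefix-key: "v1-rust"
```

The old, oversized caches are not deleted by this; they are simply no longer restored and should eventually be evicted by GitHub.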

We don't do frequent dependency updates, instead batching them up, so partial cache matches aren't generally that useful for us; not having the cache carry the extra weight for a week after an update is more important.

@Nemo157
Author

Nemo157 commented Nov 9, 2023

Another idea I had to solve our issue was an option to not restore the cache. We could then pass a conditional expression that enabled this just for our dependency updates, so other PRs that modify versions would still use a partial cache and just the mass update PRs would flush the cache out to a new state.

@max-sixty

We've recently been having similar issues with our Rust builds (here's our workflow): the cache isn't great about removing unused dependencies, so the caches grow over time and eventually CI breaks.

I now manually remove the caches and re-run, which works but is a bit inconvenient and requires me to watch for breaks.

> Another idea I had to solve our issue was an option to not restore the cache. We could then pass a conditional expression that enabled this just for our dependency updates, so other PRs that modify versions would still use a partial cache and just the mass update PRs would flush the cache out to a new state.

Given that it's probably difficult to guarantee removing unused dependencies, this would be great! Requires some GHA expressions so not without downsides, but would be great for us.

@max-sixty

This is starting to be quite a big issue for us: sometimes, after we update dependencies, all PRs fail.

Has anyone found a good solution?

One option for us would be to use a hash of Cargo.lock as a prefix key; that way we won't get any re-use of caches across dependency changes. This means that any dependency change will trigger a full recompilation, which is obviously more costly, but would also prevent the current accumulation.
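
A rough sketch of what that could look like, assuming the Swatinem/rust-cache@v2 action and its prefix-key input; the "v0-rust" prefix and the step layout are illustrative, and this hasn't been tested against a real workflow:

```yaml
# Sketch only: assumes Swatinem/rust-cache@v2 and its `prefix-key` input.
steps:
  - uses: actions/checkout@v4
  - uses: Swatinem/rust-cache@v2
    with:
      # Fold the lockfile hash into the prefix, so a run with a changed
      # Cargo.lock should not restore an older, partially matching cache
      # and carry its stale dependencies forward.
      prefix-key: "v0-rust-${{ hashFiles('**/Cargo.lock') }}"
```

The trade-off is the one described above: every lockfile change starts from an empty cache, so the first build after a dependency update is a full rebuild.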
