Test: generate with `torch.compile(model.forward)` as a fast test #34544

gante · 2024-10-31T18:25:48Z

What does this PR do?

Follow-up to #34464

This PR:

1. Converts `test_generate_compile_model_forward` to a fast test. This means we will check `generate` with `torch.compile(model.forward)` at each commit on ALL models that support `StaticCache` 💛
2. Fixes failing cases of `test_generate_compile_model_forward` whenever possible
3. Tags models with `_supports_static_cache = False  # Reason` when the model doesn't support `torch.compile(model.forward)`

✅ py.test tests/models/ -k test_generate_compile is all green, takes ~2 mins to run on all models on my machine
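
A minimal sketch of how a flag like `_supports_static_cache` could gate which models the compiled-forward test runs on. Only the attribute name comes from this PR; the two model classes and the helper `select_compile_testable` are hypothetical stand-ins, not the actual transformers test code.

```python
class FakeModelWithStaticCache:
    # Flag name taken from the PR; the class itself is a hypothetical stand-in.
    _supports_static_cache = True


class FakeModelWithoutStaticCache:
    # Hypothetical model tagged as incompatible, with the reason kept in a comment.
    _supports_static_cache = False  # e.g. forward has data-dependent control flow


def select_compile_testable(model_classes):
    """Return the classes that opt in to the compiled-forward generate test."""
    # Classes without the attribute are treated as not supporting it (an assumption).
    return [cls for cls in model_classes if getattr(cls, "_supports_static_cache", False)]


selected = select_compile_testable([FakeModelWithStaticCache, FakeModelWithoutStaticCache])
print([cls.__name__ for cls in selected])  # ['FakeModelWithStaticCache']
```

In a real test the unselected classes would presumably be skipped rather than silently dropped, so the skip reason stays visible in the test report.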

ydshieh

Love this!

Q: Is it really fast ...?

Remark: I feel get_max_cache_length is a better name than get_max_cache_shape but OK I know not great to change name all the time.

gante · 2024-10-31T18:41:25Z

> Q: Is it really fast ...?

@ydshieh yes :D

HuggingFaceDocBuilderDev · 2024-10-31T18:53:05Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

I don't mind, tho I don't think our priority should be this (full compile vs compile forward in generate!) + I don't see the test being run in the CI! 🤗

ArthurZucker · 2024-11-05T10:03:52Z

Could you just make sure it's run

ydshieh2 · 2024-11-05T11:16:43Z

We need to remove @require_torch_gpu too for def test_generate_compile

ydshieh · 2025-01-23T13:29:46Z

Before merging, feel free to ping me to check for flakiness (if there is any) :-) or anything else you think I can double-check.

ArthurZucker

Thanks can ignore my comments and merge 🤗

ArthurZucker · 2025-01-27T16:43:38Z

src/transformers/models/paligemma/modeling_paligemma.py

+        elif isinstance(past_key_values, HybridCache):
+            target_length = past_key_values.get_max_cache_shape()


Is it possible for HybridCache to inherit from StaticCache?

We might just need an extra class that says CompileCompatible; someone wanted an is_static attr!
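
To illustrate the question above, here is a hedged sketch contrasting per-class `isinstance` branches with a single capability marker. Every class below is a simplified stand-in defined only for this example (the real `StaticCache` and `HybridCache` live in transformers and have different constructors), `CompileCompatible` is the hypothetical name floated in the comment rather than an existing class, and `target_length` is an illustrative helper.

```python
class CompileCompatible:
    """Hypothetical marker: caches with a fixed, pre-allocated maximum length."""

    def __init__(self, max_cache_len):
        self.max_cache_len = max_cache_len

    def get_max_cache_shape(self):
        return self.max_cache_len


class StaticCache(CompileCompatible):  # stand-in, not the transformers class
    pass


class HybridCache(CompileCompatible):  # stand-in, not the transformers class
    pass


class DynamicCache:  # stand-in for a cache that grows with the sequence
    pass


def target_length(past_key_values, sequence_length):
    # One check covers every fixed-size cache instead of one elif per cache class.
    if isinstance(past_key_values, CompileCompatible):
        return past_key_values.get_max_cache_shape()
    # Growing caches fall back to the current sequence length (an assumption here).
    return sequence_length


print(target_length(HybridCache(max_cache_len=128), 10))  # 128
print(target_length(DynamicCache(), 10))  # 10
```

With a shared marker, adding another fixed-size cache type would not require touching each model's mask-preparation code.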

gante · 2025-01-27T18:04:16Z

(sorry, the PR is not ready yet, a few cases are still failing 👀 I didn't mean to request a review)

gante · 2025-01-28T12:49:58Z

Now it's working on all models, including encoder-decoder + cache 🤗

It's not too heavy on our CI: it should add ~2 mins if all models are run. And it should save us from many headaches! As we can see in the diff, we had compilation enabled for a bunch of models that don't support it.

fix tests

c9e3ed6

gante requested review from ydshieh and ArthurZucker October 31, 2024 18:25

ydshieh reviewed Oct 31, 2024

ydshieh approved these changes Oct 31, 2024

ArthurZucker reviewed Nov 5, 2024

Merge branch 'main' into generate_forward_compile_fix

04d5adf

gante and others added 4 commits January 22, 2025 11:42

Update tests/generation/test_utils.py

d789a57

make fixup

43f96af

tmp commit

3f165c8

rely on auto compilation for the tests

524b3cb

ArthurZucker approved these changes Jan 27, 2025

gante and others added 5 commits January 27, 2025 19:32

fix a few more cases (a few to go)

ab51b67

all working :D

7777cf1

Merge branch 'main' into generate_forward_compile_fix

16804ea

make fixup

0753963

add compile cache reset

4430684

gante and others added 2 commits January 28, 2025 13:50

allow compilation on cpu

bc4e8b3

Merge branch 'main' into generate_forward_compile_fix

f22e86b

gante merged commit ece8c42 into huggingface:main Jan 28, 2025
25 checks passed

gante deleted the generate_forward_compile_fix branch January 28, 2025 14:10
