[TorchFX] INT8 Weights Compression Support #2891

Merged on Sep 26, 2024 (92 commits)
Showing changes from 1 commit

Commits
297fdb4
weights compression init
anzr299 Aug 14, 2024
534e294
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Aug 16, 2024
06ca5a3
compression complete
anzr299 Aug 16, 2024
b4b2603
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Aug 19, 2024
c770d2c
Modify graph builder to include support for embedding op
anzr299 Aug 19, 2024
70b00f9
modify function to set new node meta for new module insertion to fx g…
anzr299 Aug 19, 2024
c7fa7f2
Add weights compression support for torch fx
anzr299 Aug 19, 2024
667b8a5
Add test for torch fx weights compression
anzr299 Aug 19, 2024
dca2374
reorder comments
anzr299 Aug 19, 2024
6f693c9
variable names fix
anzr299 Aug 19, 2024
159a615
Fix messages, use transformation for updating weight
anzr299 Aug 19, 2024
7a896d6
Minor mypy fix
anzr299 Aug 19, 2024
0de1d9b
fix set_weight
anzr299 Aug 19, 2024
f9e5d7c
Update torch_fx_backend.py
anzr299 Aug 20, 2024
443dce7
Add embedding metatype for torch fx as a subtype
anzr299 Aug 20, 2024
03d16f8
replace embedding metatype with torch fx subtype in torch fx graph bu…
anzr299 Aug 20, 2024
5226934
1. Adjust the torch fx weights compression backend to use fx embeddin…
anzr299 Aug 20, 2024
3cdb7b3
Update test for weight compression. Include test to see if
anzr299 Aug 20, 2024
28f7053
Fix FX metatype mapping
anzr299 Aug 20, 2024
cb0bf6b
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Aug 20, 2024
8b3c6e2
Add metatypes registry for torch fx specific embedding metatype and c…
anzr299 Aug 20, 2024
79ec939
Add copyright to new torch fx operator_metatypes file
anzr299 Aug 20, 2024
7accaf2
Add weights compression graph test
anzr299 Aug 26, 2024
5b11455
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Aug 26, 2024
1cb55c2
Merge branch 'develop' of https://github.com/anzr299/nncf into fx_com…
anzr299 Aug 26, 2024
71c50ff
pre-commit fix
anzr299 Aug 26, 2024
9f68831
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Aug 28, 2024
2cb0a41
Handle Lora correction in torch fx weights compression
anzr299 Aug 28, 2024
a9c3d57
Add graph test for compressed models in test_models
anzr299 Aug 28, 2024
0172ad1
pre commit fix
anzr299 Aug 28, 2024
f590200
1. Moved Embedding FX metatype from `experimental/torch/fx` to torch …
anzr299 Aug 29, 2024
0c7be62
shared weights support in torch fx graph builder and constant update …
anzr299 Aug 29, 2024
0a1157d
Update tests for more description
anzr299 Aug 29, 2024
0eff5cb
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Aug 30, 2024
c7b9093
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Aug 30, 2024
93ecc4e
add torch fx in supported backends
anzr299 Aug 30, 2024
b6ad458
Remove Compressed reference graphs
anzr299 Aug 30, 2024
e7097bd
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Aug 30, 2024
64b9ba7
add test for shared weights
anzr299 Sep 2, 2024
2665666
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Sep 2, 2024
fb74267
Merge branch 'develop' of https://github.com/anzr299/nncf into fx_com…
anzr299 Sep 2, 2024
287cb2c
pre-commit fix
anzr299 Sep 2, 2024
449f767
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 3, 2024
c79dfc2
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 4, 2024
a10cb68
Add test for shared node decompressor call
anzr299 Sep 4, 2024
1c144a5
update backend supported in docs
anzr299 Sep 4, 2024
c5291b7
pre-commit fix
anzr299 Sep 4, 2024
174fb32
remove todo
anzr299 Sep 4, 2024
45a5274
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 10, 2024
b46d00e
add get_dtype and get_shape methods to torch fx weights compression b…
anzr299 Sep 10, 2024
32f5098
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 12, 2024
1819241
get the updated constant name from graph
anzr299 Sep 16, 2024
8a6b6d5
updated constant name from graph
anzr299 Sep 16, 2024
502c6c3
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 16, 2024
3503674
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 17, 2024
71901c5
update shared constants transformation
anzr299 Sep 20, 2024
bd5ff1f
pre commit fix
anzr299 Sep 20, 2024
b6a29ab
update docs
anzr299 Sep 20, 2024
7dd9782
refactor get weight name and port ids
anzr299 Sep 20, 2024
bbfeff0
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 20, 2024
48848be
update docs from X to Torch FX
anzr299 Sep 20, 2024
20544fd
fix shared weights attribute
anzr299 Sep 20, 2024
60ef615
Merge branch 'fx_compress_weights' of https://github.com/anzr299/nncf…
anzr299 Sep 20, 2024
fb89a4d
Fix Suggestions
anzr299 Sep 20, 2024
002758b
pre commit fix
anzr299 Sep 20, 2024
fe4d390
update is_shared attribute
anzr299 Sep 20, 2024
2ca11f8
Add tests for constant update transformation
anzr299 Sep 20, 2024
2be2487
pre commit fix
anzr299 Sep 20, 2024
fc543c9
Add test for edge shape
anzr299 Sep 20, 2024
02861e9
make decompressor name more readable
anzr299 Sep 20, 2024
33afddb
fix model_devices and precision test
anzr299 Sep 20, 2024
15bfeb0
Update is_shared attribute using a one liner
anzr299 Sep 20, 2024
04ed994
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 23, 2024
7683b5d
add test for nncf node is_shared attribute before applying transforma…
anzr299 Sep 23, 2024
fa56e7e
Change code to include _capture_model function for torch FX graph cap…
anzr299 Sep 23, 2024
fd9498a
pre-commit fix
anzr299 Sep 23, 2024
782b509
Fix is_shared attribute test
anzr299 Sep 23, 2024
48d050b
pre- commit fix
anzr299 Sep 23, 2024
3477d7c
add reference for checking shared constant unification transformation
anzr299 Sep 23, 2024
cbc2106
Add synthetic model with embedding to test models and include create …
anzr299 Sep 23, 2024
229517c
add reference graphs
anzr299 Sep 23, 2024
fde56b7
Include assert in shared attribute test
anzr299 Sep 24, 2024
30ff3d2
Fix reference graphs structure
anzr299 Sep 24, 2024
f26a7a0
pre-commit fix
anzr299 Sep 24, 2024
1d0a866
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 24, 2024
49d3dec
Change FXEmbedding metatype to PTAtenEmbeddingMetatype
anzr299 Sep 24, 2024
2e7e639
Move shared constants unification transformation to `apply_quantizati…
anzr299 Sep 24, 2024
817c233
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 25, 2024
26a4ff4
Corrections, comments and refactoring
anzr299 Sep 26, 2024
065bacb
Add separate error message for dataset attribute
anzr299 Sep 26, 2024
3942d45
fix comments
anzr299 Sep 26, 2024
14096b7
Merge branch 'openvinotoolkit:develop' into fx_compress_weights
anzr299 Sep 26, 2024
1 change: 0 additions & 1 deletion nncf/experimental/torch/fx/nncf_graph_builder.py
@@ -138,7 +138,6 @@ def create_nncf_graph(model: torch.fx.GraphModule) -> PTNNCFGraph:
  for source_node in model.graph.nodes:
  node_type, node_metatype = GraphConverter._get_node_type_and_metatype(source_node, model)
  node_metatype = GraphConverter._map_fx_unique_metatypes(source_node, node_metatype)
- is_shared_node = False
  is_shared_node = source_node.op in ("get_attr",) and (
  const_targets_counter[source_node.target] > 1 or len(source_node.users) > 1
  )
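The `is_shared_node` rule in the hunk above can be tried out without PyTorch. The sketch below is only an illustration: `FakeNode` is a hypothetical stand-in for `torch.fx.Node`, and it assumes that `const_targets_counter` counts how many `get_attr` nodes point at each target (the hunk does not show how the counter is built).

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeNode:
    """Hypothetical stand-in for torch.fx.Node; only the fields the rule reads."""
    name: str
    op: str
    target: str
    users: List[str] = field(default_factory=list)


def is_shared(node: FakeNode, const_targets_counter: Counter) -> bool:
    # Same predicate as in the hunk: only get_attr (constant) nodes can be shared.
    return node.op in ("get_attr",) and (
        const_targets_counter[node.target] > 1 or len(node.users) > 1
    )


nodes = [
    # Two get_attr nodes reading the same target: both count as shared.
    FakeNode("wte_weight", "get_attr", "wte.weight", users=["embedding"]),
    FakeNode("wte_weight_1", "get_attr", "wte.weight", users=["linear"]),
    # One get_attr node with a single user: not shared.
    FakeNode("bias", "get_attr", "linear.bias", users=["linear"]),
    # One get_attr node with two users: shared.
    FakeNode("scale", "get_attr", "scale", users=["mul", "mul_1"]),
    # Non-constant node: never shared, regardless of its user count.
    FakeNode("linear", "call_function", "aten.linear", users=["output", "x"]),
]

# Assumption: the counter tallies get_attr targets across the whole graph.
const_targets_counter = Counter(n.target for n in nodes if n.op == "get_attr")
shared = {n.name: is_shared(n, const_targets_counter) for n in nodes}
```

Since the new assignment always overwrites the value, the removed `is_shared_node = False` line was redundant.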
6 changes: 4 additions & 2 deletions nncf/experimental/torch/fx/transformations.py
@@ -144,8 +144,10 @@ def bias_update_transformation(model: torch.fx.GraphModule):

  def shared_constants_unification_transformation(model: torch.fx.GraphModule):
  """
- checks fx graph for shared constants, disconnects and eliminates redundant
- shared constant while connecting singular shared constant.
+ checks fx graph for shared constants and eliminates redundant
+ shared constant while keeping only the first instance of the constant node.
+ This unification transformation is crucial since the current algorithms (min_max, solver, BC, etc.)
+ for torch fx do not utilize the is_shared attribute of nodes for shared constants.

  :param model: Target Torch FX GraphModule
  :return: Transformation which attaches shared constants to nodes and removes redundant constants.
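The docstring above describes keeping only the first instance of a shared constant and dropping the rest. The following is a minimal sketch of that idea on a toy graph. It is not the PR's implementation: `ToyNode` and `unify_shared_constants` are hypothetical names, and the toy graph stores producer names in a plain list instead of using `torch.fx`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyNode:
    """Hypothetical, simplified graph node (not torch.fx.Node)."""
    name: str
    op: str
    target: str
    inputs: List[str] = field(default_factory=list)  # names of producer nodes


def unify_shared_constants(nodes: List[ToyNode]) -> List[ToyNode]:
    """Keep the first get_attr node per target and rewire users of the duplicates."""
    first_by_target: Dict[str, str] = {}  # target -> name of the node that is kept
    replaced: Dict[str, str] = {}         # duplicate node name -> kept node name
    kept: List[ToyNode] = []
    for node in nodes:
        if node.op == "get_attr":
            if node.target in first_by_target:
                # Redundant constant: remember its replacement and drop it.
                replaced[node.name] = first_by_target[node.target]
                continue
            first_by_target[node.target] = node.name
        kept.append(node)
    # Point every consumer of a dropped constant at the surviving one.
    for node in kept:
        node.inputs = [replaced.get(name, name) for name in node.inputs]
    return kept


graph = [
    ToyNode("w", "get_attr", "wte.weight"),
    ToyNode("embedding", "call_function", "embedding", inputs=["w", "input_ids"]),
    ToyNode("w_1", "get_attr", "wte.weight"),
    ToyNode("linear", "call_function", "linear", inputs=["embedding", "w_1"]),
]
unified = unify_shared_constants(graph)
```

After the call, `w_1` is gone and `linear` reads from `w`. On a real `torch.fx.GraphModule` the same two steps would presumably map onto `Node.replace_all_uses_with` and `Graph.erase_node`, followed by recompiling the module.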
5 changes: 3 additions & 2 deletions nncf/quantization/quantize_model.py
@@ -501,9 +501,10 @@ def compress_weights(
  f"but given {mode.value} mode."
  )

- if any((awq, scale_estimation, gptq, lora_correction)):
+ if any((awq, scale_estimation, gptq, lora_correction, dataset)):
  raise AttributeError(
- "TorchFX backend does not support 'awq', 'scale_estimation', 'gptq' and 'lora_correction' options. "
+ "TorchFX backend does not support 'awq', 'scale_estimation', 'gptq', "
+ "'dataset' and 'lora_correction' options. "
  "Set them to None."
  )
  compression_weights_impl = fx_compression_weights_impl
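The guard in the hunk above rejects any truthy unsupported option with a single generic message. The sketch below shows one way the same check could also report which options were set. It is an assumption-laden illustration: `UNSUPPORTED_FX_OPTIONS`, `find_unsupported_options` and `check_fx_options` are hypothetical names and are not part of NNCF.

```python
from typing import Any, Dict, List

# Option names taken from the error message in the hunk above.
UNSUPPORTED_FX_OPTIONS = ("awq", "scale_estimation", "gptq", "lora_correction", "dataset")


def find_unsupported_options(options: Dict[str, Any]) -> List[str]:
    """Return the names of unsupported options that were set to a truthy value."""
    return [name for name in UNSUPPORTED_FX_OPTIONS if options.get(name)]


def check_fx_options(**options: Any) -> None:
    # Hypothetical helper; mirrors `if any((...)): raise AttributeError(...)`,
    # relying on truthiness in the same way as the original check.
    offending = find_unsupported_options(options)
    if offending:
        raise AttributeError(
            f"TorchFX backend does not support {offending} options. Set them to None."
        )


# Passes silently: every option is None or False.
check_fx_options(awq=None, gptq=False, dataset=None)
```

Like the original `any((...))` call, this relies on truthiness, so an option explicitly set to `False` or `None` is accepted.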
17 changes: 9 additions & 8 deletions tests/torch/fx/test_compress_weights.py
@@ -17,6 +17,7 @@

  from nncf import CompressWeightsMode
  from nncf.common.factory import NNCFGraphFactory
+ from nncf.data.dataset import Dataset
  from nncf.experimental.torch.fx.node_utils import get_tensor_constant_from_node
  from nncf.quantization import compress_weights
  from nncf.torch.dynamic_graph.patch_pytorch import disable_patching
@@ -72,7 +73,7 @@ def _capture_model(model, inputs):
  return capture_pre_autograd_graph(model, (inputs,))


- @pytest.mark.parametrize("mode", (CompressWeightsMode.INT8_SYM, CompressWeightsMode.INT8_ASYM))
+ @pytest.mark.parametrize("mode", SUPPORTED_MODES)
  def test_compress_weights(mode):
  model = ShortTransformer(5, 10)
  input_ids = torch.randint(0, 10, (5,))
@@ -89,7 +90,7 @@ def test_compress_weights(mode):
  assert n_target_modules == n_compressed_weights


- @pytest.mark.parametrize("mode", (CompressWeightsMode.INT8_SYM, CompressWeightsMode.INT8_ASYM))
+ @pytest.mark.parametrize("mode", SUPPORTED_MODES)
  def test_compress_weights_graph_edge(mode):
  model = ShortTransformer(5, 10)
  input_ids = torch.randint(0, 10, (5,))
@@ -103,7 +104,7 @@ def test_compress_weights_graph_edge(mode):
  assert decompressor_node_edge.tensor_shape == decompressor_constant_edge.tensor_shape


- @pytest.mark.parametrize("mode", (CompressWeightsMode.INT8_SYM, CompressWeightsMode.INT8_ASYM))
+ @pytest.mark.parametrize("mode", SUPPORTED_MODES)
  def test_compress_weights_shared_weights(mocker, mode):
  with disable_patching():
  model = ShortTransformer(5, 10, share_weights=True)
@@ -136,7 +137,7 @@ def test_compress_weights_shared_weights(mocker, mode):
  assert spy.call_count == 1


- @pytest.mark.parametrize("mode", (CompressWeightsMode.INT8_SYM, CompressWeightsMode.INT8_ASYM))
+ @pytest.mark.parametrize("mode", SUPPORTED_MODES)
  def test_compressed_model_inference(mode):
  torch.manual_seed(42)
  model = ShortTransformer(5, 10, share_weights=True)
@@ -152,7 +153,7 @@ def test_compressed_model_inference(mode):
  assert torch.all(torch.isclose(exported_model_output, compressed_model_outputs, atol=1)).item()


- @pytest.mark.parametrize("mode", (CompressWeightsMode.INT8_SYM, CompressWeightsMode.INT8_ASYM))
+ @pytest.mark.parametrize("mode", SUPPORTED_MODES)
  def test_compress_weights_model_size_conv(mode):

  dtype = torch.int8 if mode == CompressWeightsMode.INT8_SYM else torch.uint8
@@ -176,7 +177,7 @@ def test_compress_weights_model_size_conv(mode):
  assert compressed_model_size < model_size


- @pytest.mark.parametrize("mode", (CompressWeightsMode.INT8_SYM, CompressWeightsMode.INT8_ASYM))
+ @pytest.mark.parametrize("mode", SUPPORTED_MODES)
  def test_compress_weights_functional_model(mode):
  model = FunctionalModel()
  decompressor_type = "symmetric" if mode == CompressWeightsMode.INT8_SYM else "asymmetric"
@@ -206,13 +207,13 @@ def test_compress_weights_functional_model(mode):
  {"awq": True},
  {"scale_estimation": True},
  {"lora_correction": True},
Contributor commented:

subset_size and dataset are also not supported.
If #2978 is merged before this PR, there is also a backup_precision parameter.

anzr299 (Contributor Author) commented on Sep 26, 2024:

So for backup_precision, can I leave it for now, depending on whether that PR is merged?

Contributor Author commented:

Following @alexsu52's guidance, the subset_size parameter is ignored when the dataset is None in the WeightsCompression Algorithm. For this reason, I didn't add a check for subset_size, but I did include a check for dataset.

+ {"dataset": Dataset([1])},
  ),
  )
  def test_raise_error_with_unsupported_params_for_int8(mode, params):
  dummy_torch_model = EmptyModel()
  dummy_input = torch.Tensor()
  with disable_patching():
- exported_model = capture_pre_autograd_graph(dummy_torch_model, args=(dummy_input,))
+ exported_model = _capture_model(dummy_torch_model, dummy_input)
  with pytest.raises(AttributeError):
  compress_weights(exported_model, mode=mode, **params)
