Reduced the number of graph rebuilds #2011
Conversation
run pylint pre-commit tests
Force-pushed from c724f9b to 2d1403e
Force-pushed from 6cdfc5f to 83b9a78
def apply(
    self,
    model: TModel,
    graph: NNCFGraph,
How should the user create the graph to pass in? Should they use the NNCFGraphFactory to build it?
Yes, the introduction of a simple API for creating a model graph will come in a follow-up PR.
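For illustration, a minimal sketch of how a caller could build the graph today; it assumes the existing NNCFGraphFactory helper and a generic model object, both of which the simplified API mentioned above would wrap:

from nncf.common.factory import NNCFGraphFactory

# `model` stands for any backend model NNCF supports (assumed to exist here),
# e.g. an openvino.runtime.Model or a torch.nn.Module.
model = ...

# Build the NNCFGraph once, then reuse it across algorithm calls
# instead of rebuilding it from the model each time.
graph = NNCFGraphFactory.create(model)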
@@ -484,10 +486,11 @@ def output_filter_func(point):
            output_fp.extend(tensor_collector.get_statistics().mean_values)
        return np.array(output_fp)

-    def get_statistic_points(self, model: TModel) -> StatisticPointsContainer:
+    def get_statistic_points(self, model: TModel, graph: NNCFGraph) -> StatisticPointsContainer:
The next question is not related to these changes, but I think it is important. Is the model parameter a quantized model or an initial one? The following code snippet suggests that it is a model with quantizers:
model_copy = self._backend_entity.remove_fq_from_inputs(copy_model(model), graph)
In that case, I don't understand how the PostTrainingQuantization.get_statistic_points()
method works, because it takes an initial (not quantized) model. It makes sense to discuss this offline.
Good question! @KodiaqQ, could you answer this comment?
To be honest, the model here for BC is without quantizers, so remove_fq_from_inputs
could be removed. But let's keep it for the next PRs, because the BC algorithm is not the main part of DQ.
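To make the point concrete, here is a hypothetical, simplified sketch of the BC flow under discussion; copy_model and remove_fq_from_inputs come from the snippet quoted above, while the class name, the backend stub, and the final helper are invented for illustration and are not the actual NNCF implementation:

from copy import deepcopy


def copy_model(model):
    # Stand-in for the backend-specific model-copy helper.
    return deepcopy(model)


class BiasCorrectionSketch:
    """Hypothetical stand-in for the BC algorithm class."""

    def __init__(self, backend_entity):
        self._backend_entity = backend_entity

    def get_statistic_points(self, model, graph):
        # BC works on a copy so the caller's model stays untouched.
        # On an initial (non-quantized) model, stripping FakeQuantize nodes
        # from the inputs is effectively a no-op, which is why the call
        # could be removed for this flow.
        model_copy = self._backend_entity.remove_fq_from_inputs(copy_model(model), graph)
        return self._register_statistic_points(model_copy, graph)

    def _register_statistic_points(self, model, graph):
        # Hypothetical helper: walk the graph and register collection points.
        raise NotImplementedError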
@@ -189,13 +191,15 @@ def _create_statistics_aggregator(self, dataset: Dataset, backend: BackendType)
            return PTStatisticsAggregator(dataset)
        return None

-    def _apply(
+    def apply(
Could you please update the PostTrainingQuantization.get_statistic_points()
method as well?
Thanks for the comment. I updated PostTrainingQuantization.get_statistic_points()
taking PR #2013 into account, because the PostTrainingQuantization.get_statistic_points()
function had an incorrect implementation anyway.
Changes
Extended the signatures of the algorithm methods with an NNCFGraph parameter. This change yields a 2.34x quantization speed-up for the "hf-internal-testing/tiny-random-GPTNeoXForCausalLM" model from optimum. See the sketch below.
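As a hedged before/after sketch of the extended signatures (names like algorithm and model are illustrative, and the real methods also take additional arguments such as statistic points and a dataset):

from nncf.common.factory import NNCFGraphFactory

# Before this PR, each algorithm call rebuilt the graph internally:
#     quantized_model = algorithm.apply(model)
# After, the caller builds the graph once and passes it explicitly:
graph = NNCFGraphFactory.create(model)
statistic_points = algorithm.get_statistic_points(model, graph)
quantized_model = algorithm.apply(model, graph)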
Reason for changes
Reduced the number of graph rebuilds
Related tickets
ref: 113245
Tests
N/A