[CPU] Add Clamp for FakeConvertDecomposition #28651
base: master
```diff
@@ -175,6 +175,9 @@ std::vector<std::string> disabledTestPatterns() {
     R"(.*ConvertCPULayerTest.*outFmts=(nhwc|nChw8c|nChw16c).*)",
     // Issue: MFDNN-12917. The oneDNN emitter of conversion from fp32 to fp8 has rounding issue.
     R"(.*ConvertCPULayerTest.*(\[1.1.1080.1920\]|\(2.17.5.4\))_.*_inputPRC=f32_targetPRC=f8e4m3_.*)",
+    // Issue: 123320
```
Review comment: These bf16-related test cases for FakeConvert are skipped because of the code block already tracked by ticket 123320 (openvino/src/plugins/intel_cpu/tests/functional/shared_tests_instances/core_config.cpp, lines 21 to 26 at 7f56fcd).

Here is a case for the FakeConvert layer: data=3504, scale=0.546875, shift=0, destination_type=f8e5m2. The expected result is 3264, the actual result is 3744, but both are correct.

Expected: bf16(3504) -> fp32(3504) -> *scale(0.546875) -> fp32(1916.25) -> fp16(1916) -> fp16 constrained to the f8e5m2 range (1792) -> fp32(1792) -> /scale(0.546875) -> fp32(3276.8) -> bf16(3264)

Actual: bf16(3504) -> *scale(0.546875) -> bf16(1920) -> fp16(1920) -> fp16 constrained to the f8e5m2 range (2048) -> bf16(2048) -> /scale(0.546875) -> bf16(3744)
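The divergence in the rounding chain above comes from quantizing into progressively coarser mantissa grids. A minimal sketch (my own illustration, not OpenVINO code; `round_to_grid` is a hypothetical helper) of round-to-nearest-even with a given number of explicit mantissa bits reproduces the key steps: bf16 (7 mantissa bits) turns 1916.25 into 1920, and in the f8e5m2 grid (2 mantissa bits) 1916 rounds down to 1792 while 1920 sits exactly between 1792 and 2048 and ties to 2048.

```python
import math

def round_to_grid(x: float, mantissa_bits: int) -> float:
    """Round x to the nearest value with the given number of explicit
    mantissa bits, half-to-even. Illustration only: ignores exponent
    range limits, subnormals, and NaN handling."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = abs(x)
    # Grid step within the binade containing a.
    step = 2.0 ** (math.floor(math.log2(a)) - mantissa_bits)
    return sign * round(a / step) * step  # Python round() is half-to-even

# bf16 path: 1916.25 lands on the bf16 grid at 1920.
print(round_to_grid(1916.25, 7))  # 1920.0

# f8e5m2 neighbours of these values are 1792 (1.75 * 2**10) and 2048.
print(round_to_grid(1916.0, 2))   # fp32 path: 1916 -> 1792.0
print(round_to_grid(1920.0, 2))   # bf16 path: 1920 is a tie -> 2048.0
```

This shows why both results are "correct": the extra bf16 rounding before the f8e5m2 quantization moves the value across a tie point, so the two paths land on different f8e5m2 grid points.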
```diff
+    // Input precision bf16 is converted to fp32 by logic in core_config.cpp during ngraph reference test.
+    R"(.*FakeConvertLayerTest.*dataPrecision=bf16.*)",
     // Need to generate sequence exactly in the i64 data type. Enable in scope of i64 enabling.
     R"(.*RandomUniformLayerTestCPU.*OutPrc=i64.*)",
     // Issue: 123815 (Tests are sensitive to available thread count on testing machines)
```
Review comment: AFAIK, the behavior of the master branch when converting different f32 values to f8e4m3 is as follows: for overflowed values, applying Clamp in FakeConvertDecomposition yields a non-INF result, while omitting it yields INF. We cannot get different behavior for an "f16 overflowed value" versus an "f8e5m2 overflowed value", so I applied this revision so that an "f16 overflowed value" also produces a non-INF result. The other reason is to align with the behavior of f8e4m3.

@dmitry-gorokhov @mitruska Please feel free to comment. Thanks a lot!
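To illustrate the clamping point, here is a hand-rolled numeric sketch (not the actual decomposition; `round_to_f8e5m2` and `fake_convert` are hypothetical helpers, and 57344 = 1.75 * 2**15 is f8e5m2's largest finite magnitude): without a Clamp before the downconversion, a scaled value that overflows the f8e5m2 range rounds to INF, while clamping first saturates it to the largest finite value.

```python
import math

F8E5M2_MAX = 57344.0  # largest finite f8e5m2 magnitude: 1.75 * 2**15

def round_to_f8e5m2(x: float) -> float:
    """Round-to-nearest-even into the f8e5m2 grid (2 explicit mantissa
    bits); values rounding past the largest finite magnitude become inf.
    Illustration only: subnormals and NaN are ignored."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = abs(x)
    step = 2.0 ** (math.floor(math.log2(a)) - 2)
    q = round(a / step) * step  # Python round() is half-to-even
    return sign * (math.inf if q > F8E5M2_MAX else q)

def fake_convert(data: float, scale: float, clamp: bool) -> float:
    """Scale, optionally clamp to the destination range, quantize to
    f8e5m2, then unscale."""
    y = data * scale
    if clamp:
        y = max(-F8E5M2_MAX, min(F8E5M2_MAX, y))
    return round_to_f8e5m2(y) / scale

x = 120000.0  # overflows the f8e5m2 range with scale = 1.0
print(fake_convert(x, 1.0, clamp=False))  # inf
print(fake_convert(x, 1.0, clamp=True))   # 57344.0 (saturated, not inf)
```

With the Clamp in the decomposition, every intermediate overflow saturates the same way, which is what makes the "f16 overflowed value" behave consistently with the f8e4m3 path.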