Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added cl_half support for test_conversions #1669

Closed
wants to merge 28 commits into from

Conversation

shajder
Copy link
Contributor

@shajder shajder commented Mar 9, 2023

According to issue description (conversions section):

#142

Additional remarks:

  • In the process of adding cl_khr_fp16 support source code was unified and as result heavily refactored.

  • Some pieces of code (eg. InitCL fuction) stayed as it was previously for the sake of comparison purposes. For consistency I would suggest to move it to methods (eg. ConversionsTest::SetUp).

  • Please take a closer look at method DataInfoSpec<>::init under condition is_in_half(). There is new code to initialize array of cl_half values for conversions. It was source of the problem due to some special values which brought undefined results. ATM this code avoids those problematic values.

  • Capacity of test was decreased. I had some difficult time trying to figure out why this number is so extremely big. I will appreciate your feedback. I limited number of tests to 16 777 216 conversions (uint64_t lastCase = 1000000ULL;)

  • At my machine intel driver do not implement some of kernel functions which I think should be available:

#pragma OPENCL EXTENSION cl_khr_fp16 : enable
__kernel void test_convert_half_sat_uchar( __global uchar *src, __global half *dest )
{
   size_t i = get_global_id(0);
   dest[i] = convert_half_sat( src[i] );
}
Build not successful for device "Intel(R) UHD Graphics [0x9a60]", status: CL_BUILD_ERROR
Build log for device "Intel(R) UHD Graphics [0x9a60]" is: ------------
3632:5:14: error: implicit declaration of function 'convert_half_sat' is invalid in OpenCL
   dest[i] = convert_half_sat( src[i] );

thus I wasn't able to finalize all tests successfully. I am counting on some help here, thanks!

@svenvh
Copy link
Member

svenvh commented Mar 15, 2023

In the process of adding cl_khr_fp16 support source code was unified and as result heavily refactored.

Have you considered doing this unification/refactoring in a new pull request on its own? I would recommend doing so. We should hopefully be able to merge such refactorings relatively sooner. Then the changes for actually adding half support should be smaller, which benefits reviewers (a smaller patch is easier to review) and you (a smaller patch is easier to keep up to date).

@shajder
Copy link
Contributor Author

shajder commented Mar 16, 2023

Thanks for your feedback!

Have you considered doing this unification/refactoring in a new pull request on its own? I would recommend doing so.

yes, I think I have specific commit which is good candidate to separate work on unification and extension of test's functionality - so this is still an open door for us. On the other hand there were arguments to stick to single PR:

  • it turns out that the unification code takes ~75%+ of whole task. Please correct me if I am wrong but my conclusion was that reviewing of unification PR will take approximately the same time as slightly bigger one introducing cl_khr_fp16 extension support.

  • vast majority of the code responsible for the logic of the test stayed intact. What really changed is a wrapper layer around that logic + couple of necessary conditions. Thus I don't expect any major issues to investigate. BTW I had no problems with successful launch of whole unified test.

If you think those are not good enough to stick to single PR then let me know and I will start refactoring work on splitting it.

@svenvh
Copy link
Member

svenvh commented Mar 17, 2023

If you think those are not good enough to stick to single PR then let me know and I will start refactoring work on splitting it.

I can't speak for others so please don't take my opinion too strongly. Maybe I'm the minority here who prefer multiple smaller commits/PRs that each do one thing instead of a large monolithic PR that does multiple things.

But to illustrate a point: this PR already seems to conflict with the main branch; and the larger the PR, the higher the chances of such conflicts occurring. Hence I believe that doing the refactoring separately will simplify maintenance of this PR in the long term (as we will probably need to give vendors a fair amount of time to test the new fp16 coverage on their devices, whereas generic refactorings shouldn't need as much time).

@svenvh svenvh self-requested a review March 27, 2023 09:17
@bcalidas
Copy link

We are still reviewing this change.

@lakshmih
Copy link

We would like to double check results at our end, can we get some more time.

@lakshmih
Copy link

lakshmih commented May 5, 2023

We are having trouble testing and verifying this PR. Could this PR be split into two changes, one for the code refactoring and the other for enhancing fp16 coverage?

@shajder
Copy link
Contributor Author

shajder commented May 9, 2023

We are having trouble testing and verifying this PR. Could this PR be split into two changes, one for the code refactoring and the other for enhancing fp16 coverage?

@lakshmih here is PR with modernized conversions without cl_khr_fp16 support

Copy link
Contributor

@EwanC EwanC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rebased this PR, now the diff is a lot more manageable for review 😃

I'm seeing fails on the half tests when running this patch however. When I try the commit before I merged in main I see the same fails, so I don't think i've done the rebase wrong, but it's a possibility (we can always revert the merge commit to rollback).

Marking this as "Request Changes" since I'd like to investigate the fails i'm seeing further.

@bashbaug
Copy link
Contributor

bashbaug commented Sep 8, 2023

FYI, I'm seeing failures here too, looking into them now.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found and fixed a few problems and managed to get a little farther but this still isn't working for me. Here are a few issues I found.

@bashbaug
Copy link
Contributor

This PR needs a new owner, so for now I've removed "focused review" and added "help wanted".

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments seem to have been addressed - thanks!

I think it's a bit error-prone to need to check both the "type" and the "fp" bool to determine if the type is a "half", but I believe it's all correct now, so we can always tidy this up later.

The saturate issue mentioned above does need to be fixed, but I'm not seeing any failures with the fix. For reference, I'm running:

test_conversions -w -1

This finishes in reasonable time. If I think of it I'll kick off a more thorough test overnight.

@bashbaug
Copy link
Contributor

I spoke too soon, and I'm seeing a few failures in double to half conversions, for non-round-to-even rounding modes:

Example for: test_conversions half_rtn_double -w -1

Building convert_half_rtn( double ) test
Testing... .
Error for vector size 1 found at 0x00003f76:  *-0x1.8p+1 vs -0x1.804p+1
Input value: -0x1.8000000000001p+1  (convert_half_rtn( double ))
         *** convert_halfn_rtn( doublen ) FAILED ** 
conversions FAILED

I'm checking to see if this is an issue with our implementation or with the tests.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is at least one issue related to double-to-half conversions.

If you need help reproducing these issues on an Intel GPU without native fp64 support please see:

https://github.com/intel/compute-runtime/blob/master/opencl/doc/FAQ.md#feature-double-precision-emulation-fp64

test_conformance/conversions/conversions_data_info.h Outdated Show resolved Hide resolved
Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of my issues are addressed - thanks!

Copy link
Contributor

@EwanC EwanC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not reviewed the latest changes in depth, but when I tried the most recent version of the branch the fails I saw have gone. LGTM.

@lakshmih
Copy link

We are running this test and there are no failures so far. However it takes quite a bit of time to complete so can we hold off merging so long as Qualcomm agrees to see this through.

@bashbaug
Copy link
Contributor

I think we should merge this as-is and fix any bugs that appear.

@lakshmih can you run a "wimpy" test first to gain confidence that the changes are OK? Maybe even a scalar-only "wimpy" test? This finishes fairly promptly on our devices.

Example: test_conversions -w -1

@bashbaug
Copy link
Contributor

I just created an fp16-staging branch, as we discussed. If this PR is retargeted at this branch I will merge it, also as we discussed.

It's been a few days now though. @lakshmih has your testing completed? If so, would we be OK to merge this to the main branch, instead?

@shajder
Copy link
Contributor Author

shajder commented Dec 18, 2023

I just created an fp16-staging branch, as we discussed. If this PR is retargeted at this branch I will merge it, also as we discussed.

@bashbaug I've created new PR to be merged with fp16-staging branch

@bashbaug
Copy link
Contributor

Removing "focused review", since we are reviewing and merging changes into the staging branch now.

@shajder
Copy link
Contributor Author

shajder commented Jun 19, 2024

This PR was merged to main branch along with #1975, closing

@shajder shajder closed this Jun 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants