Sptrsv stream test fixes #2444

Merged
merged 1 commit into from
Dec 2, 2024

jgfouca
Contributor

@jgfouca jgfouca commented Nov 27, 2024

The SPTRSV_CUSPARSE algorithm is not supported for streams, so it should not have been in the list of tested algs inside test_sptrsv_streams. As a result, tri_solve_streams was a no-op for this algorithm. Somehow, this was not caught until the block algorithm was used.

Also, some minor cleanup of alg enum handling in the sptrsv handle: use a switch statement with a default case to catch unhandled enum vals. print_algorithm should just use the alg string to avoid a duplicated switch/if-else-if chain. StringToSPTRSVAlgorithm did not handle several of the enum vals and also returned strings inconsistent with the strings in return_algorithm_string. Grep revealed no one using this function, so I removed it.

I added a check in tri_solve_streams to throw an error if an unsupported alg is used.
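
A rough sketch of what such a guard can look like, assuming a simplified interface: `Algorithm`, `supports_streams`, and `solve_streams_checked` are hypothetical names, and the real tri_solve_streams signature is not reproduced here. The point is that an unsupported algorithm raises an error up front rather than letting the solve silently do nothing.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical, reduced enum for illustration only.
enum class Algorithm { LevelSchedRangePolicy, LevelSchedTeamPolicy, SPTRSV_CUSPARSE };

// Which algorithms the stream interface handles. Anything not listed,
// including SPTRSV_CUSPARSE, is treated as unsupported.
inline bool supports_streams(Algorithm alg) {
  switch (alg) {
    case Algorithm::LevelSchedRangePolicy:
    case Algorithm::LevelSchedTeamPolicy: return true;
    default: return false;
  }
}

// Stand-in for a stream solve: validates every per-stream algorithm before
// doing any work, then returns the number of streams it would process.
inline std::size_t solve_streams_checked(const std::vector<Algorithm>& per_stream_algs) {
  for (Algorithm alg : per_stream_algs) {
    if (!supports_streams(alg)) {
      throw std::runtime_error("sptrsv stream solve: algorithm not supported for streams");
    }
  }
  // The actual per-stream triangular solves would be launched here.
  return per_stream_algs.size();
}
```

A test that iterates over algorithms can use the same `supports_streams` predicate to build its list, so unsupported algorithms are never passed to the stream path in the first place.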

Fixes #2442

@jgfouca jgfouca added the AT: PRE-TEST INSPECTED (Mark this PR as approved for testing) and AT2-CI-APPROVAL (Approve CI to run at SNL) labels Nov 27, 2024
Signed-off-by: James Foucar <[email protected]>
@jgfouca jgfouca force-pushed the jgfouca/sptrsv_stream_fixes branch from 2f57db4 to 24ded58 Compare November 28, 2024 00:15
@ndellingwood
Contributor

The H100 job failed due to a CI issue (not code-related):

From the configure_kokkos stage:

Unable to determine the device handle for GPU0000:B8:00.0: Unknown Error
Error: Process completed with exit code 255.

Blake H100 nightly jobs are still running fine, and there was no issue with the runner setup. I'm not sure whether this is a container issue.

Contributor

@lucbv lucbv left a comment

Looks fine to me as well

@lucbv lucbv merged commit 78f4efd into kokkos:develop Dec 2, 2024
17 of 18 checks passed
@jgfouca jgfouca deleted the jgfouca/sptrsv_stream_fixes branch December 3, 2024 17:24
Development

Successfully merging this pull request may close these issues.

Nightly test failure, Cuda builds with Cu* TPLs enabled, Cuda.sparse_sptrsv_double_int_int_TestDevice