
Specialize relevant cuda::(std::) types for __half/bfloat16/fp8 #525

Open
9 tasks
Tracked by #101
jrhemstad opened this issue Oct 5, 2023 · 11 comments · May be fixed by #3361

Comments

@jrhemstad
Collaborator

jrhemstad commented Oct 5, 2023

The CUDA extended floating-point types __half, __nv_bfloat16, and the fp8 types (among others) are important to many CUDA C++ developers.

As a CUDA C++ developer, I'd like relevant CCCL utilities such as <type_traits>, atomic<T>, and complex<T> to work with these types.
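
For illustration, here is roughly the kind of usage this would enable. This is a hypothetical sketch, not the final API; the specializations shown (numeric_limits, complex, and atomic_ref for __half) do not exist yet and their exact semantics are still open:

```cpp
// Hypothetical usage sketch -- none of these specializations exist yet.
#include <cuda_fp16.h>

#include <cuda/atomic>
#include <cuda/std/complex>
#include <cuda/std/limits>

__global__ void example(cuda::std::complex<__half>* out, __half* counter)
{
  // numeric_limits would report the properties of __half ...
  __half eps = cuda::std::numeric_limits<__half>::epsilon();

  // ... complex<__half> would behave like complex<float> ...
  out[0] = cuda::std::complex<__half>(__half(1.0f), eps);

  // ... and atomics would accept extended floating-point payloads.
  cuda::atomic_ref<__half, cuda::thread_scope_device> ref(*counter);
  ref.fetch_add(__half(1.0f));
}
```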

Tasks

@srinivasyadav18
Contributor

Hi @jrhemstad,
This looks like an interesting issue to me, and I would like to contribute to it.

@jrhemstad
Collaborator Author

Hey @srinivasyadav18, thanks for your interest in helping make CCCL better!

@griwes was just starting to look into this issue. He'll have a better idea of the details of what will be required and which parts you could help out with. For example, specializing complex and <type_traits> will be separate tasks.

@srinivasyadav18
Contributor

@jrhemstad Thanks! I will coordinate with @griwes to see where I can help.
I have done some initial work enabling <type_traits> for __half and __nv_bfloat16; I will share a link to the branch soon.

@srinivasyadav18
Contributor

Hi @griwes, I made these initial changes, which enable __half and __nv_bfloat16. I am not entirely sure whether including <cuda_fp16.h> and <cuda_bf16.h> in __type_traits/is_floating_point.h is the right approach. Please let me know if I am missing anything here. Thanks! :)
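
For context, here is a minimal sketch of the kind of specialization involved (a standalone illustration with a made-up trait name; the actual internal trait and its location in the branch differ):

```cpp
// Sketch only: a trait in the style of the internal helper behind
// cuda::std::is_floating_point, extended to the CUDA types.
#include <cuda_fp16.h>
#include <cuda_bf16.h>
#include <cuda/std/type_traits>

template <class T> struct my_is_floating_point                : cuda::std::false_type {};
template <>        struct my_is_floating_point<float>         : cuda::std::true_type {};
template <>        struct my_is_floating_point<double>        : cuda::std::true_type {};
template <>        struct my_is_floating_point<long double>   : cuda::std::true_type {};

// The proposed addition: recognize the extended CUDA types as well.
template <>        struct my_is_floating_point<__half>        : cuda::std::true_type {};
template <>        struct my_is_floating_point<__nv_bfloat16> : cuda::std::true_type {};

static_assert(my_is_floating_point<__half>::value, "");
```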

@jrhemstad
Collaborator Author

I am exactly not sure if it is the right way to include <cuda_fp16.h> and <cuda_bf16.h> in __type_traits/is_floating_point.h

I'm guessing we're going to have to be more careful about how we include those headers because we support versions of the CTK that may not have those headers yet. So it'll require some careful ifdefs. Here's an example from CUB: https://github.com/NVIDIA/cub/blob/0fc3c3701632a4be906765b73be20a9ad0da603d/cub/util_type.cuh#L43C1-L48

@miscco @gevtushenko may be able to help figure out the right way to guard including those headers.
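
For example, something along these lines; this is only a sketch, and the macro names and version bounds are illustrative rather than the ones CCCL would actually use:

```cpp
// Sketch only: include the extended-FP headers only when the CTK provides them,
// and record that fact so the specializations can be guarded the same way.
#if defined(__CUDACC__) && (__CUDACC_VER_MAJOR__ >= 9) // illustrative bound for <cuda_fp16.h>
#  include <cuda_fp16.h>
#  define MY_HAS_NVFP16 1
#endif

#if defined(__CUDACC__) && (__CUDACC_VER_MAJOR__ >= 11) // illustrative bound for <cuda_bf16.h>
#  include <cuda_bf16.h>
#  define MY_HAS_NVBF16 1
#endif

#if defined(MY_HAS_NVFP16)
// ... specializations for __half go here ...
#endif

#if defined(MY_HAS_NVBF16)
// ... specializations for __nv_bfloat16 go here ...
#endif
```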

@jrhemstad changed the title from "Specialize relevant cuda::(std::) types for __half/bfloat16" to "Specialize relevant cuda::(std::) types for __half/bfloat16/fp8" on Dec 6, 2023
@jrhemstad
Collaborator Author

@gevtushenko says that CUB already has some of the relevant values for <limits> that can be used.

@ngc92

ngc92 commented Apr 27, 2024

std::numeric_limits<half> would be quite handy in several places (e.g., preventing overflows, unit tests over templated kernels that adjust their precision requirements based on epsilon). Even rolling your own is not really possible, because of the lack of constexpr constructors for half and bfloat16.

Is there a reason this needs to happen in cuda::std? AFAIK, you are allowed to specialize this directly in std::.
From cppreference:

Implementations may provide specializations of std::numeric_limits for implementation-specific types: e.g. GCC provides std::numeric_limits<__int128>. Non-standard libraries may add specializations for library-provided types, e.g. OpenEXR provides std::numeric_limits for a 16-bit floating-point type.
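
For what it's worth, even a hand-rolled specialization runs straight into that constexpr limitation. A rough host-side sketch, assuming <cuda_fp16.h> and showing only a couple of members:

```cpp
#include <cuda_fp16.h>
#include <limits>

namespace std {
template <>
class numeric_limits<__half>
{
public:
  static constexpr bool is_specialized = true;
  static constexpr int  digits         = 11; // 10 stored mantissa bits + implicit bit

  // These cannot be constexpr because __half has no constexpr constructor;
  // the best one can do today is build the value from its raw bit pattern at runtime.
  static __half min() noexcept { __half_raw r{0x0400u}; return __half(r); } // smallest normal
  static __half max() noexcept { __half_raw r{0x7bffu}; return __half(r); } // 65504
  // static constexpr __half epsilon() noexcept { ... }  // does not compile
};
} // namespace std
```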

@shangz-ai

Hello team,
Can we add overloads of cuda::std::frexp for the extended floating-point types? This is needed for PyTorch's frexp_cuda.
See pytorch/pytorch#133313
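
For reference, a minimal sketch of what such an overload could do, assuming promotion through float (this is not the actual CCCL implementation):

```cpp
#include <cuda_fp16.h>
#include <cuda/std/cmath>

// Sketch: promote to float, reuse the float frexp, convert the mantissa back.
__host__ __device__ inline __half frexp(__half x, int* exp)
{
  const float mantissa = cuda::std::frexp(__half2float(x), exp);
  return __float2half(mantissa);
}
```

Since every __half value is exactly representable as a float and the resulting mantissa has at most 11 significant bits, this round trip should be lossless.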

@bernhardmgruber
Contributor

@gevtushenko says that CUB already has some of the relevant values for <limits> that can be used.

Yes, this is cub::Traits from util_type.cuh. We should deprecate it and replace it with <cuda/std/type_traits> once that is ready.

@miscco
Collaborator

miscco commented Nov 13, 2024

Yes, see #2749, where I started adding more implementations for extended floating-point types.

@fbusato
Contributor

fbusato commented Nov 13, 2024

Adding extended floating-point types to cuda::std::numeric_limits is not straightforward. Many operations on them are not even constexpr (contrary to what the C++ standard requires of numeric_limits members).
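
A tiny illustration of the problem (sketch; the exact behavior depends on the CTK version):

```cpp
#include <cuda_fp16.h>

// numeric_limits members are required to be constexpr, but with current CTKs
// __half generally cannot be constructed in a constant expression:
constexpr __half one = __half(1.0f); // error: constructor is not constexpr
```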

Labels: None yet
Projects: Status: Todo
8 participants