
Specialize relevant cuda::(std::) types for __half/bfloat16/fp8 #525

Open
9 tasks
Tracked by #101
jrhemstad opened this issue Oct 5, 2023 · 11 comments · May be fixed by #3361

Comments

@jrhemstad
Collaborator

jrhemstad commented Oct 5, 2023

The CUDA extended floating-point types __half, __nv_bfloat16, and the fp8 types (among others) are important to many CUDA C++ developers.

As a CUDA C++ developer, I'd like relevant CCCL utilities such as <type_traits>, atomic<T>, and complex<T> to work with these types.
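
For illustration, here is roughly the kind of usage this would enable. This is a hypothetical sketch, not the final API; the specializations shown (numeric_limits, complex, and atomic_ref for __half) do not exist yet and their exact semantics are still open:

```cpp
// Hypothetical usage sketch -- none of these specializations exist yet.
#include <cuda_fp16.h>

#include <cuda/atomic>
#include <cuda/std/complex>
#include <cuda/std/limits>

__global__ void example(cuda::std::complex<__half>* out, __half* counter)
{
  // numeric_limits would report the properties of __half ...
  __half eps = cuda::std::numeric_limits<__half>::epsilon();

  // ... complex<__half> would behave like complex<float> ...
  out[0] = cuda::std::complex<__half>(__half(1.0f), eps);

  // ... and atomics would accept extended floating-point payloads.
  cuda::atomic_ref<__half, cuda::thread_scope_device> ref(*counter);
  ref.fetch_add(__half(1.0f));
}
```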

Tasks

@srinivasyadav18
Contributor

Hi @jrhemstad,
This looks like an interesting issue to me, and I would like to contribute to it.

@jrhemstad
Collaborator Author

Hey @srinivasyadav18, thanks for your interest in helping make CCCL better!

@griwes was just starting to look into this issue. He'll have a better idea of the details of what will be required and which parts you could help out with. For example, specializing complex and <type_traits> will be separate tasks.

@srinivasyadav18
Contributor

@jrhemstad Thanks! I will coordinate with @griwes to see where I can help.
I have done some initial work enabling <type_traits> for __half and __nv_bfloat16; I will share a link to the branch soon.

@srinivasyadav18
Contributor

Hi @griwes, I made these initial changes, which enable __half and __nv_bfloat16. I am not entirely sure whether including <cuda_fp16.h> and <cuda_bf16.h> in __type_traits/is_floating_point.h is the right approach. Please let me know if I am missing anything here. Thanks! :)
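
For context, here is a minimal sketch of the kind of specialization involved (a standalone illustration with a made-up trait name; the actual internal trait and its location in the branch differ):

```cpp
// Sketch only: a trait in the style of the internal helper behind
// cuda::std::is_floating_point, extended to the CUDA types.
#include <cuda_fp16.h>
#include <cuda_bf16.h>
#include <cuda/std/type_traits>

template <class T> struct my_is_floating_point                : cuda::std::false_type {};
template <>        struct my_is_floating_point<float>         : cuda::std::true_type {};
template <>        struct my_is_floating_point<double>        : cuda::std::true_type {};
template <>        struct my_is_floating_point<long double>   : cuda::std::true_type {};

// The proposed addition: recognize the extended CUDA types as well.
template <>        struct my_is_floating_point<__half>        : cuda::std::true_type {};
template <>        struct my_is_floating_point<__nv_bfloat16> : cuda::std::true_type {};

static_assert(my_is_floating_point<__half>::value, "");
```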

@jrhemstad
Collaborator Author

I am exactly not sure if it is the right way to include <cuda_fp16.h> and <cuda_bf16.h> in __type_traits/is_floating_point.h

I'm guessing we're going to have to be more careful about how we include those headers because we support versions of the CTK that may not have those headers yet. So it'll require some careful ifdefs. Here's an example from CUB: https://github.com/NVIDIA/cub/blob/0fc3c3701632a4be906765b73be20a9ad0da603d/cub/util_type.cuh#L43C1-L48

@miscco @gevtushenko may be able to help figure out the right way to guard including those headers.
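
For example, something along these lines; this is only a sketch, and the macro names and version bounds are illustrative rather than the ones CCCL would actually use:

```cpp
// Sketch only: include the extended-FP headers only when the CTK provides them,
// and record that fact so the specializations can be guarded the same way.
#if defined(__CUDACC__) && (__CUDACC_VER_MAJOR__ >= 9) // illustrative bound for <cuda_fp16.h>
#  include <cuda_fp16.h>
#  define MY_HAS_NVFP16 1
#endif

#if defined(__CUDACC__) && (__CUDACC_VER_MAJOR__ >= 11) // illustrative bound for <cuda_bf16.h>
#  include <cuda_bf16.h>
#  define MY_HAS_NVBF16 1
#endif

#if defined(MY_HAS_NVFP16)
// ... specializations for __half go here ...
#endif

#if defined(MY_HAS_NVBF16)
// ... specializations for __nv_bfloat16 go here ...
#endif
```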

@jrhemstad changed the title from "Specialize relevant cuda::(std::) types for __half/bfloat16" to "Specialize relevant cuda::(std::) types for __half/bfloat16/fp8" on Dec 6, 2023
@jrhemstad
Collaborator Author

@gevtushenko says that CUB already has some of the relevant values for <limits> that can be used.

@ngc92

ngc92 commented Apr 27, 2024

std::numeric_limits<half> would be quite handy in several places (e.g., preventing overflows, unit tests over templated kernels that adjust their precision requirements based on epsilon). Even rolling your own is not really possible, because of the lack of constexpr constructors for half and bfloat16.

Is there a reason this needs to happen in cuda::std? AFAIK, you are allowed to specialize this directly in std::.
From cppreference:

Implementations may provide specializations of std::numeric_limits for implementation-specific types: e.g. GCC provides std::numeric_limits<__int128>. Non-standard libraries may add specializations for library-provided types, e.g. OpenEXR provides std::numeric_limits for a 16-bit floating-point type.
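
For what it's worth, even a hand-rolled specialization runs straight into that constexpr limitation. A rough host-side sketch, assuming <cuda_fp16.h> and showing only a couple of members:

```cpp
#include <cuda_fp16.h>
#include <limits>

namespace std {
template <>
class numeric_limits<__half>
{
public:
  static constexpr bool is_specialized = true;
  static constexpr int  digits         = 11; // 10 stored mantissa bits + implicit bit

  // These cannot be constexpr because __half has no constexpr constructor;
  // the best one can do today is build the value from its raw bit pattern at runtime.
  static __half min() noexcept { __half_raw r{0x0400u}; return __half(r); } // smallest normal
  static __half max() noexcept { __half_raw r{0x7bffu}; return __half(r); } // 65504
  // static constexpr __half epsilon() noexcept { ... }  // does not compile
};
} // namespace std
```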

@shangz-ai

Hello team,
Can we add overloads of cuda::std::frexp for the extended floating-point types? This is needed for PyTorch's frexp_cuda.
See pytorch/pytorch#133313
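
For reference, a minimal sketch of what such an overload could do, assuming promotion through float (this is not the actual CCCL implementation):

```cpp
#include <cuda_fp16.h>
#include <cuda/std/cmath>

// Sketch: promote to float, reuse the float frexp, convert the mantissa back.
__host__ __device__ inline __half frexp(__half x, int* exp)
{
  const float mantissa = cuda::std::frexp(__half2float(x), exp);
  return __float2half(mantissa);
}
```

Since every __half value is exactly representable as a float and the resulting mantissa has at most 11 significant bits, this round trip should be lossless.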

@bernhardmgruber
Contributor

@gevtushenko says that CUB already has some of the relevant values for <limits> that can be used.

Yes, this is cub::Traits from util_type.cuh. We should deprecate it and replace it with <cuda/std/type_traits> once that is ready.

@miscco
Collaborator

miscco commented Nov 13, 2024

Yes, see #2749, where I started adding more implementations for extended floating-point types.

@fbusato
Contributor

fbusato commented Nov 13, 2024

Adding extended floating-point types to cuda::std::numeric_limits is not straightforward. Many operations on them are not even constexpr (contrary to what the C++ standard requires of numeric_limits members).
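
A tiny illustration of the problem (sketch; the exact behavior depends on the CTK version):

```cpp
#include <cuda_fp16.h>

// numeric_limits members are required to be constexpr, but with current CTKs
// __half generally cannot be constructed in a constant expression:
constexpr __half one = __half(1.0f); // error: constructor is not constexpr
```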

Labels: None yet
Projects: Status: Todo
8 participants