0.4.1: support for half precision

@lukstafi lukstafi released this 12 Sep 10:10

In this release:

  • We now pass $CUDA_PATH/include as an include path to the NVRTC compiler; without it, includes such as #include <cuda_fp16.h> fail to resolve. Users may already be passing this option themselves, but since we already track the CUDA installation via conf-cuda, it is better to prepend the option automatically.
  • We work around ctypes not supporting the Float16 type.
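
For illustration, the include-path handling from the first item could be sketched roughly as follows. This is a minimal sketch, not the code from this release: the function names prepend_cuda_include and nvrtc_options are hypothetical, and the sketch assumes that the options are a plain list of strings and that CUDA_PATH uses forward slashes. The --include-path=<dir> spelling is the long form of NVRTC's -I option.

```ocaml
(* Hypothetical helper, not the library's actual implementation:
   prepend the CUDA include directory to a list of NVRTC options. *)
let prepend_cuda_include ~cuda_path options =
  match cuda_path with
  | Some path when path <> "" ->
      let include_opt = "--include-path=" ^ path ^ "/include" in
      (* Avoid a duplicate if the user already passed the same option. *)
      if List.mem include_opt options then options
      else include_opt :: options
  | _ ->
      (* CUDA_PATH is unset or empty: leave the options untouched. *)
      options

(* Hypothetical wrapper that reads CUDA_PATH from the environment. *)
let nvrtc_options options =
  prepend_cuda_include ~cuda_path:(Sys.getenv_opt "CUDA_PATH") options
```

With such a helper, user-supplied options stay in their original order behind the include path, and a user who already passes the same option does not get it twice.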