0.4.1: support for half precision
In this release:
- We pass the `$CUDA_PATH/include` path to the nvrtc compiler; otherwise e.g. `#include <cuda_fp16.h>` will not work. The user could already be doing this, but since we monitor the installation via conf-cuda, it's better to prepend the option automatically.
- We work around `ctypes` not supporting the `Float16` type.
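
As a rough illustration of the first item, here is a minimal sketch of prepending the include-path option to a list of nvrtc options. The helper name `with_cuda_include` and its `~cuda_path` argument are hypothetical and are not the library's actual API; the sketch only assumes that nvrtc accepts `--include-path=<dir>` and that the headers live under `$CUDA_PATH/include`.

```ocaml
(* Hypothetical helper, not the library's actual code: prepend an
   include-path option pointing at the CUDA headers.
   [cuda_path] stands for the value of $CUDA_PATH, if it is set. *)
let with_cuda_include ~cuda_path options =
  match cuda_path with
  | Some path when path <> "" ->
      (* Prepending means options supplied by the user come after ours. *)
      ("--include-path=" ^ Filename.concat path "include") :: options
  | _ ->
      (* No usable CUDA_PATH: leave the user's options untouched. *)
      options

let () =
  let options =
    with_cuda_include
      ~cuda_path:(Sys.getenv_opt "CUDA_PATH")
      [ "--use_fast_math" ]
  in
  List.iter print_endline options
```

If the user already passes the same include path, the option simply appears twice, which should be harmless for header lookup.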