diff --git a/docs/api_reference.rst b/docs/api_reference.rst index 95826691..c684bcae 100644 --- a/docs/api_reference.rst +++ b/docs/api_reference.rst @@ -7,7 +7,7 @@ API Reference ============= -This section documents the public user interface of ``Kokkos-fft``. +This section documents the public user interface of ``kokkos-fft``. APIs are defined in ``KokkosFFT`` namespace and implementation details are defined in ``KokkosFFT::Impl`` namespace. Thus, it is highly discouraged for users to access functions in ``KokkosFFT::Impl`` namespace. Except for ``KokkosFFT::Plan``, there are corresponding functions in ``numpy.fft`` as shown below. @@ -26,7 +26,7 @@ FFT Plan :header-rows: 1 * - Description - - ``KokkosFFT`` + - ``kokkos-fft`` - ``numpy.fft`` * - A class that manages a FFT plan of backend FFT library - :doc:`api/plan` @@ -51,7 +51,7 @@ Standard FFTs :header-rows: 1 * - Description - - ``KokkosFFT`` + - ``kokkos-fft`` - ``numpy.fft`` * - One dimensional FFT in forward direction - :doc:`api/standard/fft` @@ -91,7 +91,7 @@ Real FFTs :header-rows: 1 * - Description - - ``KokkosFFT`` + - ``kokkos-fft`` - ``numpy.fft`` * - One dimensional FFT for real input - :doc:`api/real/rfft` @@ -128,7 +128,7 @@ Hermitian FFTs :header-rows: 1 * - Description - - ``KokkosFFT`` + - ``kokkos-fft`` - ``numpy.fft`` * - One dimensional FFT of a signal that has Hermitian symmetry - :doc:`api/hermitian/hfft` @@ -154,7 +154,7 @@ Helper routines :header-rows: 1 * - Description - - ``KokkosFFT`` + - ``kokkos-fft`` - ``numpy.fft`` * - Return the DFT sample frequencies - :doc:`api/helper/fftfreq` diff --git a/docs/examples.rst b/docs/examples.rst index 12a032f4..ae834ca2 100644 --- a/docs/examples.rst +++ b/docs/examples.rst @@ -9,7 +9,7 @@ Examples There are some `examples `_ in the -Kokkos-fft repository. Most of the examples include Kokkos and numpy implementations. +kokkos-fft repository. Most of the examples include Kokkos and numpy implementations. 
For example, `01_1DFFT `_ includes, @@ -19,7 +19,7 @@ For example, `01_1DFFT | └──01_1DFFT/ |--CMakeLists.txt - |--01_1DFFT.cpp (Kokkos-fft version) + |--01_1DFFT.cpp (kokkos-fft version) └──numpy_1DFFT.py (numpy version) Please find the examples from following links. @@ -33,4 +33,5 @@ Please find the examples from following links. samples/04_batchedFFT.rst samples/05_1DFFT_HOST_DEVICE.rst samples/06_1DFFT_reuse_plans.rst - samples/07_unmanaged_views.rst \ No newline at end of file + samples/07_unmanaged_views.rst + samples/08_inplace_FFT.rst \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index 8a3b456a..718033b4 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -2,10 +2,10 @@ .. .. SPDX-License-Identifier: MIT OR Apache-2.0 WITH LLVM-exception -Kokkos-fft documentation +kokkos-fft documentation ======================================= -Kokkos-fft implements local interfaces between `Kokkos `_ +kokkos-fft implements local interfaces between `Kokkos `_ and de facto standard FFT libraries, including `fftw `_, `cufft `_, @@ -15,9 +15,9 @@ We are inclined to implement the `numpy.fft `_. +kokkos-fft is open source and available on `GitHub `_. -Here is an example for 1D real to complex transform with ``rfft`` in Kokkos-fft. +Here is an example for 1D real to complex transform with ``rfft`` in kokkos-fft. .. code-block:: C++ diff --git a/docs/intro/building.rst b/docs/intro/building.rst index 1563a475..552e9c20 100644 --- a/docs/intro/building.rst +++ b/docs/intro/building.rst @@ -4,18 +4,18 @@ .. _building: -Building Kokkos-fft +Building kokkos-fft =================== -This section describes how to build Kokkos-fft with some advanced options. -In order to build Kokkos-fft, we use ``CMake`` with following compilers. +This section describes how to build kokkos-fft with some advanced options. +In order to build kokkos-fft, we use ``CMake`` with following compilers. Kokkos and backend FFT libraries are also necessary. 
-Available CMake options for Kokkos-fft are listed. +Available CMake options for kokkos-fft are listed. Compiler versions ----------------- -Kokkos-fft relies on quite basic functionalities of Kokkos, and thus it is supposed to work with compilers used for `Kokkos `_. +kokkos-fft relies on quite basic functionalities of Kokkos, and thus it is supposed to work with compilers used for `Kokkos `_. However, we have not tested all the listed compilers there and thus recommend the following compilers which we use frequently for testing. * ``gcc 8.3.0+`` - CPUs @@ -23,10 +23,10 @@ However, we have not tested all the listed compilers there and thus recommend th * ``nvcc 11.0.0+`` - NVIDIA GPUs * ``rocm 5.3.0+`` - AMD GPUs -Install Kokkos-fft as a library +Install kokkos-fft as a library ------------------------------- -Let's assume Kokkos is installed under ```` with ``OpenMP`` backend. We build and install Kokkos-fft under ````. +Let's assume Kokkos is installed under ```` with ``OpenMP`` backend. We build and install kokkos-fft under ````. .. code-block:: bash @@ -39,7 +39,7 @@ Let's assume Kokkos is installed under ```` with ``OpenMP`` back cmake --build build_KokkosFFT -j 8 cmake --install build_KokkosFFT -Here is an example to use Kokkos-fft in the following CMake project. +Here is an example to use kokkos-fft in the following CMake project. .. code-block:: bash @@ -74,9 +74,9 @@ The code can be built as CMake options ------------- -We rely on CMake to build Kokkos-fft, more specifically ``CMake 3.22+``. Here is the list of CMake options. +We rely on CMake to build kokkos-fft, more specifically ``CMake 3.22+``. Here is the list of CMake options. For FFTs on Kokkos device only, we do not need to add extra compile options but for Kokkos ones. -In order to use Kokkos-fft from both host and device, it is necessary to add ``KokkosFFT_ENABLE_HOST_AND_DEVICE=ON``. 
+In order to use kokkos-fft from both host and device, it is necessary to add ``KokkosFFT_ENABLE_HOST_AND_DEVICE=ON``. This option may be useful, for example FFT is used for initialization at host. However, to enable this option, we need a pre-installed ``fftw`` for FFT on host, so it is disabled in default (see :doc:`minimum working example<../samples/05_1DFFT_HOST_DEVICE>`). @@ -95,13 +95,13 @@ However, to enable this option, we need a pre-installed ``fftw`` for FFT on host - Build internal Kokkos instead of relying on external one. - OFF * - ``KokkosFFT_ENABLE_EXAMPLES`` - - Build Kokkos-fft examples + - Build kokkos-fft examples - OFF * - ``KokkosFFT_ENABLE_TESTS`` - - Build Kokkos-fft tests + - Build kokkos-fft tests - OFF * - ``KokkosFFT_ENABLE_BENCHMARK`` - - Build benchmarks for Kokkos-fft + - Build benchmarks for kokkos-fft - OFF * - ``KokkosFFT_ENABLE_ROCFFT`` - Use `rocfft `_ for HIP backend @@ -110,7 +110,7 @@ However, to enable this option, we need a pre-installed ``fftw`` for FFT on host Kokkos backends --------------- -Kokkos-fft requires ``Kokkos 4.2+``. For the moment, we support following backends for CPUs and GPUs. +kokkos-fft requires ``Kokkos 4.4+``. For the moment, we support following backends for CPUs and GPUs. A FFT library dedicated to Kokkos Device backend (e.g. cufft for CUDA backend) is automatically used. If CMake fails to find a backend FFT library, see :doc:`How to find fft libraries?<../finding_libraries>`. We may support experimental backends like ``OPENMPTARGET`` in the future. diff --git a/docs/intro/quick_start.rst b/docs/intro/quick_start.rst index 2b642e37..0c079293 100644 --- a/docs/intro/quick_start.rst +++ b/docs/intro/quick_start.rst @@ -7,20 +7,20 @@ Quickstart guide ================ -This section will quickly illustrate how to use Kokkos-fft. +This section will quickly illustrate how to use kokkos-fft. First of all, you need to clone this repo. .. 
code-block:: bash git clone --recursive https://github.com/kokkos/kokkos-fft.git -To configure Kokkos-fft, we can just use CMake options for Kokkos, which automatically enables the FFT interface on Kokkos device. +To configure kokkos-fft, we can just use CMake options for Kokkos, which automatically enables the FFT interface on Kokkos device. If CMake fails to find a backend FFT library, see :doc:`How to find fft libraries?<../finding_libraries>`. Requirements ------------ -Kokkos-fft requires ``Kokkos 4.4+`` and dedicated compilers for CPUs or GPUs. +kokkos-fft requires ``Kokkos 4.4+`` and dedicated compilers for CPUs or GPUs. It employs ``CMake 3.22+`` for building. Here are list of compilers we frequently use for testing. @@ -33,11 +33,11 @@ Here are list of compilers we frequently use for testing. Building -------- -For the moment, there are two ways to use Kokkos-fft: including as a subdirectory in CMake project or installing as a library. -For simplicity, however, we demonstrate an example to use Kokkos-fft as a subdirectory in a CMake project. For installation, see :ref:`Building Kokkos-fft`. -Since Kokkos-fft is a header-only library, it is enough to simply add as a subdirectory. It is assumed that kokkos and Kokkos-fft are placed under ``/tpls``. +For the moment, there are two ways to use kokkos-fft: including as a subdirectory in CMake project or installing as a library. +For simplicity, however, we demonstrate an example to use kokkos-fft as a subdirectory in a CMake project. For installation, see :ref:`Building kokkos-fft`. +Since kokkos-fft is a header-only library, it is enough to simply add as a subdirectory. It is assumed that kokkos and kokkos-fft are placed under ``/tpls``. -Here is an example to use Kokkos-fft in the following CMake project. +Here is an example to use kokkos-fft in the following CMake project. .. code-block:: bash @@ -80,7 +80,7 @@ Trying ------ For those who are familiar with `numpy.fft `_, -you may use Kokkos-fft quite easily. 
Here is an example for 1D real to complex transform with ``rfft`` in Kokkos-fft and python. +you may use kokkos-fft quite easily. Here is an example for 1D real to complex transform with ``rfft`` in kokkos-fft and python. .. code-block:: C++ @@ -90,7 +90,7 @@ you may use Kokkos-fft quite easily. Here is an example for 1D real to complex t #include using execution_space = Kokkos::DefaultExecutionSpace; template using View1D = Kokkos::View; - constexpr int n = 4; + const int n = 4; View1D x("x", n); View1D > x_hat("x_hat", n/2+1); @@ -112,4 +112,4 @@ There are two major differences: ``execution_space`` argument and output value ( Instead of numpy.array, we rely on `Kokkos Views `_. The accessibilities of Views from ``execution_space`` are statically checked (compilation errors if not accessible). It is easiest to rely only on the ``Kokkos::DefaultExecutionSpace`` for both View allocation and KokkosFFT APIs. -See :ref:`Using Kokkos-fft` for detail. +See :ref:`Using kokkos-fft` for detail. diff --git a/docs/intro/using.rst b/docs/intro/using.rst index d26d5095..31fa6a02 100644 --- a/docs/intro/using.rst +++ b/docs/intro/using.rst @@ -4,17 +4,17 @@ .. _using: -Using Kokkos-fft +Using kokkos-fft ================ -This section describes how to use Kokkos-fft in practice. +This section describes how to use kokkos-fft in practice. We also explain some tips to use it efficiently. Brief introduction ------------------ -Most of the numpy.fft APIs (``numpy.fft.``) are available in Kokkos-fft (``KokkosFFT::``) on the Kokkos device. -In fact, these are the only APIs available in Kokkos-fft (see :doc:`API reference<../api_reference>` for detail). Kokkos-fft support 1D to 3D FFT over chosen axes. +Most of the numpy.fft APIs (``numpy.fft.``) are available in kokkos-fft (``KokkosFFT::``) on the Kokkos device. +In fact, these are the only APIs available in kokkos-fft (see :doc:`API reference<../api_reference>` for detail). kokkos-fft support 1D to 3D FFT over chosen axes. 
Inside FFT APIs, we first create a FFT plan for a backend FFT library based on the Views and chosen axes. Then, we execute the FFT using the created plan on the given Views. Then, we may perform normalization based on the users' choice. Finally, we destroy the plan. Depending on the View Layout and chosen axes, we may need transpose operations to make data contiguous. @@ -40,7 +40,7 @@ The following listing shows good and bad examples of Real FFTs. .. code-block:: C++ template using View2D = Kokkos::View; - constexpr int n0 = 4, n1 = 8; + const int n0 = 4, n1 = 8; View2D x("x", n0, n1); View2D > x_hat_good("x_hat_good", n0, n1/2+1); @@ -98,7 +98,7 @@ Memory consmpution ------------------ In order to support FFT over arbitral axes, -Kokkos-fft performs transpose operations internally and apply FFT on contiguous data. +kokkos-fft performs transpose operations internally and apply FFT on contiguous data. For size ``n`` input, this requires internal buffers of size ``2n`` in addition to the buffers used by FFT library. Performance overhead from transpose may be not critical but memory consumptions are problematic. If memory consumption matters, it is recommended to make data contiguous so that transpose is not performed. @@ -107,7 +107,7 @@ The following listing shows examples with and without transpose operation. .. code-block:: C++ template using View2D = Kokkos::View; - constexpr int n0 = 4, n1 = 8; + const int n0 = 4, n1 = 8; View2D x("x", n0, n1); View2D > x_hat_good("x_hat_good", n0/2+1, n1); @@ -122,10 +122,34 @@ The following listing shows examples with and without transpose operation. Reuse FFT plan -------------- -Apart from the basic APIs, Kokkos-fft offers the capability to create a FFT plan wrapping the FFT plans of backend libraries. +Apart from the basic APIs, kokkos-fft offers the capability to create a FFT plan wrapping the FFT plans of backend libraries. 
We can reuse the FFT plan created once to perform FFTs multiple times on different data given that they have the same properties. In some backend, FFT plan creation leads to some overhead, wherein we need this functionality. (see :doc:`minimum working example<../samples/06_1DFFT_reuse_plans>`) +The following listing shows an example to reuse the FFT plan. + +.. code-block:: C++ + + template using View2D = Kokkos::View; + const int n0 = 4, n1 = 8, n2 = 5, n3 = 10; + + View2D > x("x", n0, n1), x_hat("x_hat", n0, n1); + View2D > y("y", n0, n1), y_hat("y_hat", n0, n1); + View2D > z("z", n2, n3), z_hat("z_hat", n2, n3); + + // Create a plan for 1D FFT + int axis = -1; + KokkosFFT::Plan fft_plan(execution_space(), x, x_hat, + KokkosFFT::Direction::forward, axis); + + // Perform FFTs using fft_plan + fft_plan.execute(x, x_hat); + + // [OK] Reuse the plan for different data + fft_plan.execute(y, y_hat); + + // [NG, Run time error] Inconsistent extents + fft_plan.execute(z, z_hat); .. note:: @@ -148,7 +172,7 @@ The following listing shows examples of axes parameters with negative or positiv template using View2D = Kokkos::View; template using View3D = Kokkos::View; - constexpr int n0 = 4, n1 = 8, n2 = 5; + const int n0 = 4, n1 = 8, n2 = 5; View2D x2("x2", n0, n1); View3D x3("x3", n0, n1, n2); @@ -176,3 +200,66 @@ The following listing shows examples of axes parameters with negative or positiv If you rely on negative axes, you can specify last axes no matter what the rank of Views is. However, the corresponding positive axes to last axes are different depending on the rank of Views. Thus, it is recommended to use negative axes for simplicity. + +Inplace transform +----------------- + +Inplace transform is supported in kokkos-fft in case transpose or reshape is not needed. +For standard FFTs, we can just use the same input and output Views. For real FFTs, we need to use a single complex View and make +an unmanaged View which is an alias to the complex View. 
In addition, we need to pay attention to the extents of a real View, +which should define the shape of the transform, not the reinterpreted shape of the complex View. (see :doc:`minimum working example<../samples/08_inplace_FFT>`) +The following listing shows examples of inplace transforms. + +.. code-block:: C++ + + template using View2D = Kokkos::View; + const int n0 = 4, n1 = 8; + View2D> xc2c("xc2c", n0, n1); + + execution_space exec; + + // For standard inplace FFTs, we just reuse the same views + KokkosFFT::fft2(exec, xc2c, xc2c); + KokkosFFT::ifft2(exec, xc2c, xc2c); + + // Real to complex transform + // Define a 2D complex view to handle data + View2D> xr2c_hat("xr2c", n0, n1 / 2 + 1); + + // Create unmanaged views on the same data with the FFT shape, + // that is (n0, n1) -> (n0, n1/2+1) R2C transform + // The shape is incorrect from the view point of casting to real + // For casting, the shape should be (n0, (n1/2+1) * 2) + View2D xr2c(reinterpret_cast(xr2c_hat.data()), n0, n1); + + // Perform the real to complex transform + // [Important] You must use xr2c to define the FFT shape correctly + KokkosFFT::rfft2(exec, xr2c, xr2c_hat); + + // Complex to real transform + // Define a 2D complex view to handle data + View2D> xc2r("xc2r", n0, n1 / 2 + 1); + + // Create an unmanaged view on the same data with the FFT shape + View2D xc2r_hat(reinterpret_cast(xc2r.data()), n0, n1); + + // Create a plan + using axes_type = KokkosFFT::axis_type<2>; + axes_type axes = {-2, -1}; + KokkosFFT::Plan irfft2_plan(execution_space(), xc2r, xc2r_hat, + KokkosFFT::Direction::backward, axes); + + // Perform the complex to real transform + // [Important] You must use xc2r_hat to define the FFT shape correctly + irfft2_plan.execute(xc2r, xc2r_hat); + + View2D xc2r_hat_out("xc2r_hat_out", n0, n1); + + // [NG, Runtime error] Inplace plan can only be reused for inplace transform + irfft2_plan.execute(xc2r, xc2r_hat_out); + +.. note:: + + You can reuse a plan for inplace transform.
However, you cannot reuse a plan + for inplace transform for out-of-place transform and vice versa. + \ No newline at end of file diff --git a/docs/samples/01_1DFFT.rst b/docs/samples/01_1DFFT.rst index 4804b8f1..147d6341 100644 --- a/docs/samples/01_1DFFT.rst +++ b/docs/samples/01_1DFFT.rst @@ -7,8 +7,8 @@ One dimensional FFT =================== -KokkosFFT ---------- +kokkos-fft +---------- .. literalinclude:: ../../examples/01_1DFFT/01_1DFFT.cpp :language: C++ diff --git a/docs/samples/02_2DFFT.rst b/docs/samples/02_2DFFT.rst index 2502f160..3d1b2c25 100644 --- a/docs/samples/02_2DFFT.rst +++ b/docs/samples/02_2DFFT.rst @@ -7,8 +7,8 @@ Two dimensional FFT =================== -KokkosFFT ---------- +kokkos-fft +---------- .. literalinclude:: ../../examples/02_2DFFT/02_2DFFT.cpp :language: C++ diff --git a/docs/samples/03_NDFFT.rst b/docs/samples/03_NDFFT.rst index 54249aeb..ecc5e2ec 100644 --- a/docs/samples/03_NDFFT.rst +++ b/docs/samples/03_NDFFT.rst @@ -7,8 +7,8 @@ N-dimensional FFT ================= -KokkosFFT ---------- +kokkos-fft +---------- .. literalinclude:: ../../examples/03_NDFFT/03_NDFFT.cpp :language: C++ diff --git a/docs/samples/04_batchedFFT.rst b/docs/samples/04_batchedFFT.rst index c010e6ac..34d7b060 100644 --- a/docs/samples/04_batchedFFT.rst +++ b/docs/samples/04_batchedFFT.rst @@ -7,8 +7,8 @@ One-dimensional batched FFT =========================== -KokkosFFT ---------- +kokkos-fft +---------- .. literalinclude:: ../../examples/04_batchedFFT/04_batchedFFT.cpp :language: C++ diff --git a/docs/samples/05_1DFFT_HOST_DEVICE.rst b/docs/samples/05_1DFFT_HOST_DEVICE.rst index 5204948c..9332fdd7 100644 --- a/docs/samples/05_1DFFT_HOST_DEVICE.rst +++ b/docs/samples/05_1DFFT_HOST_DEVICE.rst @@ -7,8 +7,8 @@ FFT on host and device ====================== -KokkosFFT ---------- +kokkos-fft +---------- .. 
literalinclude:: ../../examples/05_1DFFT_HOST_DEVICE/05_1DFFT_HOST_DEVICE.cpp :language: C++ diff --git a/docs/samples/06_1DFFT_reuse_plans.rst b/docs/samples/06_1DFFT_reuse_plans.rst index 86b719a3..a01bc38d 100644 --- a/docs/samples/06_1DFFT_reuse_plans.rst +++ b/docs/samples/06_1DFFT_reuse_plans.rst @@ -7,8 +7,8 @@ Reuse FFT plan ============== -KokkosFFT ---------- +kokkos-fft +---------- .. literalinclude:: ../../examples/06_1DFFT_reuse_plans/06_1DFFT_reuse_plans.cpp :language: C++ diff --git a/docs/samples/07_unmanaged_views.rst b/docs/samples/07_unmanaged_views.rst index d9cc81a9..e0e0f4e7 100644 --- a/docs/samples/07_unmanaged_views.rst +++ b/docs/samples/07_unmanaged_views.rst @@ -7,8 +7,8 @@ Using Unmanaged Views ===================== -KokkosFFT ---------- +kokkos-fft +---------- .. literalinclude:: ../../examples/07_unmanaged_views/07_unmanaged_views.cpp :language: C++ diff --git a/docs/samples/08_inplace_FFT.rst b/docs/samples/08_inplace_FFT.rst new file mode 100644 index 00000000..57ff6345 --- /dev/null +++ b/docs/samples/08_inplace_FFT.rst @@ -0,0 +1,14 @@ +.. SPDX-FileCopyrightText: (C) The kokkos-fft development team, see COPYRIGHT.md file +.. +.. SPDX-License-Identifier: MIT OR Apache-2.0 WITH LLVM-exception + +.. _08_inplace_FFT: + +Inplace transforms +================== + +kokkos-fft +---------- + +.. literalinclude:: ../../examples/08_inplace_FFT/08_inplace_FFT.cpp + :language: C++