SPOOKY - GPU pseudo-spectral code

2023 to present | 3 min read

In this pet project, I taught myself CUDA programming by porting an existing pseudo-spectral code (Snoopy, which I used extensively during my PhD) to GPUs. I called the code Spooky because it is spectral and also spookily fast.

             ____________
           --            --
         /                  \\
        /                    \\
       /     __               \\
      |     /  \       __      ||
      |    |    |     /  \     ||
            \__/      \__/
     |             ^            ||
     |                          ||
     |                          ||
    |                            ||
    |                            ||
    |                            ||
     \__         ______       __//
        \       //     \_____//
         \_____//

What started as a simple comparison between the CUDA Fourier-transform routines on GPUs and the MPI/OpenMP-accelerated routines on the CPU nodes I had available at the time later became a fully fledged scientific code in C++, suitable for simulations of incompressible hydrodynamic and magnetohydrodynamic turbulence in simple periodic boxes. The design of the code is modular, and the physics includes vertical stratification in the Boussinesq approximation and shearing flows (suitable for simulations of Couette flows or astrophysical discs).

Spooky is currently written to run on a single GPU, since the idea behind this project was (a) to learn CUDA programming through something hands-on, and (b) to speed up parameter studies of small problems that fit in the memory of one GPU (for example, with resolutions up to $512^3$). You can currently find Spooky here.
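As a rough back-of-the-envelope check of what "fits in the memory of one GPU" means, the sketch below estimates the footprint of a single scalar field. It assumes double precision and the padded in-place real-to-complex layout commonly used with FFT libraries; neither is confirmed on this page, and the function names and the count of 20 arrays are purely illustrative.

```python
def field_bytes(n, bytes_per_real=8):
    """Bytes for one n^3 real field in a padded real-to-complex layout.

    Assumption: the field is stored as n * n * (n // 2 + 1) complex numbers,
    i.e. two reals each, in double precision by default.
    """
    return n * n * (n // 2 + 1) * 2 * bytes_per_real


def total_gib(n, n_arrays, bytes_per_real=8):
    """Total footprint in GiB for n_arrays such fields (fields plus scratch)."""
    return n_arrays * field_bytes(n, bytes_per_real) / 2**30


if __name__ == "__main__":
    # Hypothetical count of 20 arrays (fields, right-hand sides, scratch space).
    for n in (128, 256, 512):
        print(n, round(total_gib(n, 1), 3), round(total_gib(n, 20), 1))
```

Under these assumptions a single field at $512^3$ takes about 1 GiB, so the number of fields and scratch arrays a run needs largely determines whether it fits on one card.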

Vertical velocity at saturation of a 3D MTI simulation with resolution $512^3$ run with Spooky.

Spooky currently supports the following physics modules:

  • heat equation
  • incompressible (magneto-)hydrodynamics with Laplacian viscosity/resistivity
  • background density stratification through the Boussinesq approximation
  • rotation and shear through the shearing-box model
  • anisotropic heat conduction

The numerical routines available in Spooky include:

  • fast Fourier transforms with cuFFT and custom CUDA kernels
  • low-storage 3rd-order Runge-Kutta timestepping (as well as forward Euler for testing)
  • supertimestepping with Runge-Kutta-Legendre methods (1st and 2nd order) to speed up the integration of diffusive parabolic terms
  • simple I/O and setup of initial conditions through the HDF5 library
  • built-in timers for easier benchmarking
  • user-friendly compilation through CMake (automatic installation of required libraries) and an extensive test suite to validate different parts of the code
  • NEW! Containerization of all dependencies through Apptainer
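As an illustration of what "low-storage" means for the Runge-Kutta integrator, the sketch below advances a state using only the state itself and one scratch register. It uses Williamson's (1980) classic three-stage coefficients, which are an assumption made for illustration and may differ from the scheme Spooky actually implements; the function names are hypothetical and the right-hand side is taken to be autonomous.

```python
import math

# Williamson (1980) coefficients for a three-stage, two-register scheme.
# Assumption: chosen for illustration, not taken from Spooky's source.
A = (0.0, -5.0 / 9.0, -153.0 / 128.0)
B = (1.0 / 3.0, 15.0 / 16.0, 8.0 / 15.0)


def rk3_low_storage_step(rhs, state, dt):
    """Advance state by dt, keeping only the state and one scratch array."""
    state = list(state)
    scratch = [0.0] * len(state)
    for a, b in zip(A, B):
        # In a real code the tendency would be accumulated into scratch
        # in place; the temporary list here is only for readability.
        tendency = rhs(state)
        for i in range(len(state)):
            scratch[i] = a * scratch[i] + dt * tendency[i]
            state[i] += b * scratch[i]
    return state


def integrate(rhs, state, dt, n_steps):
    """Take n_steps consecutive low-storage RK3 steps of size dt."""
    for _ in range(n_steps):
        state = rk3_low_storage_step(rhs, state, dt)
    return state


if __name__ == "__main__":
    decay = lambda u: [-value for value in u]  # du/dt = -u
    for dt, n_steps in ((0.1, 10), (0.05, 20)):
        error = abs(integrate(decay, [1.0], dt, n_steps)[0] - math.exp(-1.0))
        print(dt, error)
```

Halving the timestep should reduce the error by roughly a factor of eight, as expected for a 3rd-order method.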

The code currently includes various tests (Taylor-Green vortex, advected MHD vortex, Alfvén wave propagation, shearing wave, MRI and MTI eigenmodes, …), and benchmarks to verify the scaling of the supertimestepping.

Scaling of the accuracy (L1-norm) for different supertimestepping algorithms with the grid size in a 1D heat diffusion problem.
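For readers unfamiliar with supertimestepping, here is a minimal sketch of the first-order Runge-Kutta-Legendre scheme (RKL1) applied to a 1D periodic heat diffusion problem. With $s$ stages, the stable step is $(s^2+s)/2$ times the explicit limit $\Delta x^2 / (2\kappa)$. The finite-difference Laplacian and all function names are assumptions made to keep the example short; Spooky itself is spectral, and the second-order variant is not shown.

```python
import math


def laplacian_periodic(u, dx):
    """Second-order centred finite-difference Laplacian on a periodic grid."""
    n = len(u)
    return [(u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / dx**2 for i in range(n)]


def rkl1_step(u, dt, kappa, dx, stages):
    """One RKL1 super-step of length dt for du/dt = kappa * d2u/dx2."""
    weight = 2.0 / (stages**2 + stages)
    y_prev = list(u)  # Y_0
    lap = laplacian_periodic(y_prev, dx)
    y = [a + weight * dt * kappa * b for a, b in zip(y_prev, lap)]  # Y_1
    for j in range(2, stages + 1):
        mu = (2.0 * j - 1.0) / j
        nu = (1.0 - j) / j
        lap = laplacian_periodic(y, dx)
        # Y_j = mu * Y_{j-1} + nu * Y_{j-2} + mu * weight * dt * L(Y_{j-1})
        y_next = [
            mu * a + nu * b + mu * weight * dt * kappa * c
            for a, b, c in zip(y, y_prev, lap)
        ]
        y_prev, y = y, y_next
    return y


def max_stable_dt(kappa, dx, stages):
    """Largest stable RKL1 super-step for the 1D Laplacian above."""
    dt_explicit = dx**2 / (2.0 * kappa)
    return dt_explicit * (stages**2 + stages) / 2.0


def l1_error(u, reference):
    """Mean absolute difference between two grid functions."""
    return sum(abs(a - b) for a, b in zip(u, reference)) / len(u)


if __name__ == "__main__":
    n, kappa, stages = 32, 1.0, 5
    dx = 2.0 * math.pi / n
    x = [i * dx for i in range(n)]
    dt = 0.9 * max_stable_dt(kappa, dx, stages)
    u = [math.sin(xi) for xi in x]
    for _ in range(4):
        u = rkl1_step(u, dt, kappa, dx, stages)
    exact = [math.exp(-kappa * 4 * dt) * math.sin(xi) for xi in x]
    print(l1_error(u, exact))
```

With five stages, each super-step here is 13.5 times longer than the explicit limit while costing only five evaluations of the diffusion operator, which is where the speed-up comes from.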

Prerequisites

The current implementation of SPOOKY requires:

  1. a CUDA compiler (tested with cuda-11.8 and cuda-12.0)
  2. the CUDA toolkit
  3. CMake (version 3.24 or later)
  4. Python 3 with numpy, matplotlib, argparse (necessary for some tests)
  5. libconfig and HDF5 libraries (can be installed automatically if not present)

Installation

Refer to the README.md in the repo for further details.

$ git clone git@github.com:LorenzoLMP/spooky-git.git
$ cd spooky-git

Compiling with cmake

Create a build directory, if not already present, for an out-of-source build (recommended):

$ mkdir build
$ cd build

A typical build command looks like this:

$ cmake -DBUILD_TESTS=ON -DCMAKE_CUDA_COMPILER=/path/to/cuda/bin/nvcc -DHDF5_ROOT=/path/to/hdf5/ -DLIBCONFIG_ROOT=/path/to/libconfig/ -DCMAKE_CUDA_ARCHITECTURES="XX" ..
  1. The CUDA architectures must be chosen to match the available hardware, for example 75 for an NVIDIA Quadro RTX 8000 or 80 for an A100.
  2. Depending on the version of your default g++ compiler, it might be necessary to add the option -DCMAKE_CUDA_FLAGS="-ccbin /path/to/g++" with the path to a compatible version of g++.
  3. If you don’t want to build the tests, pass -DBUILD_TESTS=OFF or omit the option.
  4. If you don’t have libconfig or HDF5 installed, omit the option -DLIBCONFIG_ROOT or -DHDF5_ROOT and CMake will attempt to automatically download and build the appropriate version of the libraries.

If the configuration step was successful, compile with:

$ make clean && make -j 8

The SPOOKY executable can then be run as:

$ ./src/spooky --input-dir /path/to/input/dir

Running tests

If you want to run the tests (this requires a build configured with -DBUILD_TESTS=ON), do instead:

$ ctest -V -R "spooky" -E "sts"

which runs every test whose name matches "spooky" (-R), skips those whose name matches "sts" (-E), and shows the verbose output (-V).