Space Shuttle 10 billion voxel CFD on 8x 64GB GPUs

FluidX3D source code:

Now I get why the Space Shuttle was sometimes also called a "flying brick". This is a 10 billion voxel lattice Boltzmann CFD simulation on 4x AMD Instinct MI250 (8x MI200 GCDs with 64GB VRAM each). Simulating 108k time steps on the 1608×4824×1280 grid took 6 hours, plus 36 minutes for rendering 2x 30s 4K60 videos. Shown are the Q-criterion isosurfaces, extracted with marching-cubes. The Reynolds number is 1 million, with the Smagorinsky-Lilly subgrid model. The grid resolution here is about 60x larger than that of the largest Space Shuttle CFD simulation NASA has ever done.

How is it possible to squeeze 10 billion grid points into only 512GB? I'm using two techniques here, which together form the holy grail of lattice Boltzmann, cutting memory demand down to only 55 Bytes/node for D3Q19 LBM, or 1/3 of conventional codes (sketches of both techniques follow below):

1. In-place streaming with Esoteric-Pull. This almost cuts memory demand in half and slightly increases performance due to implicit bounce-back boundaries. Paper:

2. Decoupled arithmetic precision (FP32) and memory precision (FP16): all arithmetic is done in FP32, but the LBM density distribution functions in memory are compressed to FP16. This almost cuts memory demand in half again and almost doubles performance, without impacting overall accuracy for most setups. Paper:

Graphics are done directly in FluidX3D with OpenCL, with the raw simulation data already residing in ultra-fast video memory. No volumetric data (1 frame of the velocity field is 14GB!) ever has to be copied to the CPU or hard drive; only rendered 1080p frames (8MB) are. Once on the CPU side, a copy of each frame is made in memory and a thread is detached to handle the slow .png compression, while the simulation is already continuing. At any time, about 16 frames are being compressed in parallel on 16 CPU cores while the simulation keeps running on the GPUs. Paper:
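For reference, here is the bookkeeping behind the 55 Bytes/node figure. The per-field breakdown is my reading of the published FluidX3D memory layout (one FP16 copy of the DDFs thanks to Esoteric-Pull, FP32 density and velocity, one flag byte), so treat the exact split as an assumption:

```cpp
#include <cstdio>

int main() {
    // Per-node memory for D3Q19 LBM, assuming the layout described above:
    // a single FP16 copy of the 19 density distribution functions (DDFs),
    // FP32 density and velocity, and one flag byte.
    const int ddfs_fp16 = 19 * 2;   // 38 bytes: 19 DDFs as 16-bit half, one copy
    const int rho_fp32  = 4;        //  4 bytes: density
    const int u_fp32    = 3 * 4;    // 12 bytes: velocity (ux, uy, uz)
    const int flags     = 1;        //  1 byte : cell-type flags
    printf("optimized: %d bytes/node\n", ddfs_fp16 + rho_fp32 + u_fp32 + flags); // 55

    // A conventional code stores two full FP32 DDF copies for streaming:
    const int conventional = 2 * 19 * 4 + rho_fp32 + u_fp32 + flags;             // 169
    printf("conventional: %d bytes/node (%.1fx more)\n",
        conventional, (double)conventional / 55.0);                              // ~3.1x
    return 0;
}
```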
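To see why a single DDF array suffices for streaming, here is a deliberately tiny 1D toy (D1Q3, BGK, periodic). A hedge up front: this implements the classic one-array AA-pattern, not Esoteric-Pull itself; Esoteric-Pull arranges the in-place reads/writes differently and additionally gets bounce-back boundaries for free (see the paper). All names and parameters are illustrative:

```cpp
#include <cstdio>
#include <vector>

// Minimal 1D one-array LBM toy (D1Q3, BGK, periodic) showing why in-place
// streaming needs only a single DDF copy: over a pair of time steps, each
// cell reads and writes exactly the same memory locations.
static const int N = 64;                 // lattice size
static const int c[3]   = { 0, 1, -1 };  // velocity directions
static const int opp[3] = { 0, 2,  1 };  // opposite-direction index
static const float w[3] = { 2.0f/3.0f, 1.0f/6.0f, 1.0f/6.0f };
static const float tau  = 0.8f;          // BGK relaxation time

static void collide(float f[3]) {        // BGK collision, all in FP32
    const float rho = f[0] + f[1] + f[2];
    const float u   = (f[1] - f[2]) / rho;
    for (int i = 0; i < 3; i++) {
        const float cu  = (float)c[i] * u;
        const float feq = w[i] * rho * (1.0f + 3.0f*cu + 4.5f*cu*cu - 1.5f*u*u);
        f[i] -= (f[i] - feq) / tau;
    }
}

int main() {
    std::vector<float> f(3 * N);
    for (int x = 0; x < N; x++)          // init at equilibrium: rho=1, u=0,
        for (int i = 0; i < 3; i++)      // with a small density bump mid-domain
            f[3*x + i] = w[i] * (x == N/2 ? 1.1f : 1.0f);

    for (int t = 0; t < 1000; t++) {
        if (t % 2 == 0) {                // even step: purely cell-local;
            for (int x = 0; x < N; x++) {// store post-collision DDFs swapped
                float g[3] = { f[3*x+0], f[3*x+1], f[3*x+2] };
                collide(g);
                for (int i = 0; i < 3; i++) f[3*x + opp[i]] = g[i];
            }
        } else {                         // odd step: pull from the swapped
            for (int x = 0; x < N; x++) {// neighbor slots, write back to the
                float g[3];              // very same locations (race-free)
                for (int i = 0; i < 3; i++)
                    g[i] = f[3*((x - c[i] + N) % N) + opp[i]];
                collide(g);
                for (int i = 0; i < 3; i++)
                    f[3*((x + c[i]) % N) + i] = g[i];
            }
        }
    }
    float mass = 0.0f;                   // sanity check: mass is conserved
    for (float fi : f) mass += fi;
    printf("total mass after 1000 steps: %.6f (should stay ~%.6f)\n", mass, N + 0.1f);
    return 0;
}
```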
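The precision decoupling maps directly onto OpenCL's built-in half load/store conversions. Below is a sketch in OpenCL C (FluidX3D's kernel language); the kernel name, memory layout, and arguments are my own illustration rather than FluidX3D's actual code, and the paper additionally introduces a custom FP16C number format with its own conversion routines:

```c
// Illustrative OpenCL C kernel: DDFs live in global memory as 16-bit half,
// every load widens to FP32 and every store narrows back, so all collision
// arithmetic runs at full FP32 precision. vload_half/vstore_half need no
// cl_khr_fp16 extension, because half is used purely as a storage format.
kernel void collide_fp16_storage(global half* ddf, const ulong n_nodes,
                                 const float tau_inv) {
    const ulong n = get_global_id(0);            // one work-item per lattice node
    float f[19];                                 // FP32 working copy of the DDFs
    for (uint i = 0u; i < 19u; i++)
        f[i] = vload_half(i*n_nodes + n, ddf);   // half -> float on load (SoA
                                                 // layout keeps loads coalesced)

    // ... compute density, velocity, equilibrium; relax with tau_inv in FP32 ...

    for (uint i = 0u; i < 19u; i++)
        vstore_half_rte(f[i], i*n_nodes + n, ddf); // float -> half, round-to-nearest
}
```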
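The detached-thread trick from the last paragraph can be sketched in a few lines of C++. Note that write_png() and export_frame() are placeholder names, not FluidX3D's actual API, and a sleep stands in for a real encoder such as lodepng:

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Placeholder for a real PNG encoder (e.g. lodepng); the sleep mimics the
// slow compression described above.
static void write_png(std::string path, std::vector<std::uint8_t> frame) {
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    std::printf("compressed %zu bytes -> %s\n", frame.size(), path.c_str());
}

// After the GPU finishes a 1080p frame (~8 MB), copy it once in memory and
// detach a worker thread for compression, so the simulation loop continues
// immediately. With frames arriving faster than they compress, many encoder
// threads naturally run in parallel across the CPU cores.
static void export_frame(const std::vector<std::uint8_t>& pixels, std::string path) {
    std::vector<std::uint8_t> copy = pixels;  // in-memory copy of the frame
    std::thread(write_png, std::move(path), std::move(copy)).detach();
}

int main() {
    std::vector<std::uint8_t> framebuffer(1920 * 1080 * 4); // pretend-rendered frame
    for (int t = 0; t < 4; t++) {
        // ... run simulation steps + GPU rendering here ...
        export_frame(framebuffer, "frame_" + std::to_string(t) + ".png");
    }
    std::this_thread::sleep_for(std::chrono::seconds(1)); // let detached workers finish
    return 0;
}
```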
Timestamps:
0:00 bottom view
0:30 top view

Thanks to the people at Jülich Supercomputing Centre for letting me test their hardware!

#CFD #GPU #FluidX3D #OpenCL