Pressure Solver

The pressure solver is the most communication-sensitive part of the multi-GPU branch. The current implementation keeps the pressure equation and zero-mode handling intact while moving the dominant work to GPU kernels, cuFFT, and a pressure-specific transpose-Thomas path.

Primary Files

| File | Role |
|---|---|
| press_stag_array.f90 | Pressure RHS, pressure halos, cuFFT orchestration |
| tridag_array.f90 | Tridiagonal solve and pressure transpose-Thomas helper |
| mpi_transpose_mod.f90 | MPI transpose support |

Implemented GPU Changes

| Area | Change |
|---|---|
| Forward/inverse FFT | cuFFT batched plans |
| RHS assembly | GPU kernels with combined RHS halo path |
| Tridiagonal solve | GPU Thomas path with cached coefficients |
| Multi-GPU pressure | nproc==2 specialized pressure transpose-Thomas helper |
| Output path | Direct Thomas output avoids a separate pack-out stage where possible |
| Timing | Clean production timing separated from detailed diagnostic timing |
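
The "cached coefficients" idea in the tridiagonal solve can be illustrated with a small sketch. It assumes that the cached quantities are the parts of the Thomas forward sweep that depend only on the matrix, so they can be computed once and reused for every right-hand side. The function names `factor_tridiagonal` and `solve_with_cache` are hypothetical, and the code is plain Python for readability rather than the Fortran/GPU implementation in tridag_array.f90.

```python
def factor_tridiagonal(a, b, c):
    """Precompute the RHS-independent part of the Thomas algorithm.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] never enters the solution).
    Returns (inv_denom, c_prime), reusable for every right-hand side.
    """
    n = len(b)
    inv_denom = [0.0] * n
    c_prime = [0.0] * n
    inv_denom[0] = 1.0 / b[0]
    c_prime[0] = c[0] * inv_denom[0]
    for k in range(1, n):
        inv_denom[k] = 1.0 / (b[k] - a[k] * c_prime[k - 1])
        c_prime[k] = c[k] * inv_denom[k]
    return inv_denom, c_prime


def solve_with_cache(a, cache, d):
    """Solve A x = d using the cached factors; only d changes per call."""
    inv_denom, c_prime = cache
    n = len(d)
    # Forward sweep: divisions were replaced by cached reciprocals.
    y = [0.0] * n
    y[0] = d[0] * inv_denom[0]
    for k in range(1, n):
        y[k] = (d[k] - a[k] * y[k - 1]) * inv_denom[k]
    # Back substitution.
    x = y[:]
    for k in range(n - 2, -1, -1):
        x[k] = y[k] - c_prime[k] * x[k + 1]
    return x


# Hypothetical 4x4 system with diagonal 2 and off-diagonals -1.
a = [0.0, -1.0, -1.0, -1.0]
b = [2.0, 2.0, 2.0, 2.0]
c = [-1.0, -1.0, -1.0, 0.0]
cache = factor_tridiagonal(a, b, c)          # done once
for rhs in ([0.0, 0.0, 0.0, 5.0], [1.0, 0.0, 0.0, 1.0]):
    print(solve_with_cache(a, cache, rhs))   # reused per right-hand side
```

Splitting the solve this way leaves the arithmetic of the Thomas algorithm unchanged; it only moves the matrix-dependent work out of the per-step path.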

Retained Controls

| Switch | Purpose |
|---|---|
| LESGO_PRESS_RHS_HALO_COMBINED | Fallback-safe control for combined RHS halo |
| LESGO_PRESS_TRANSPOSE_GENERIC | Force old generic transpose helper |
| LESGO_PRESS_DIRECT_THOMAS_OUT | Control direct Thomas output path |
| LESGO_PRESS_STAGE_TIMING | Enable pressure stage timing |
| LESGO_PRESS_TRANSPOSE_TIMING | Enable transpose helper timing |
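
This page does not say how the switches are set, what values they accept, or what their defaults are. The sketch below is therefore only an assumption: it treats them as environment variables with boolean-like values, which a wrapper or test script could inspect before launching a run. The helper names `switch_enabled` and `describe_switches` and the accepted value set are hypothetical.

```python
import os

# Switch names are taken from the table above.
PRESSURE_SWITCHES = (
    "LESGO_PRESS_RHS_HALO_COMBINED",
    "LESGO_PRESS_TRANSPOSE_GENERIC",
    "LESGO_PRESS_DIRECT_THOMAS_OUT",
    "LESGO_PRESS_STAGE_TIMING",
    "LESGO_PRESS_TRANSPOSE_TIMING",
)

# Assumed spelling of "on"; the real code may accept different values.
_TRUE_VALUES = {"1", "true", "on", "yes"}


def switch_enabled(name, default=False, environ=None):
    """Return True if the named switch is set to a truthy value.

    Unset or empty variables fall back to `default`, which is an
    assumption here, not the documented default of any switch.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in _TRUE_VALUES


def describe_switches(environ=None):
    """Map every known pressure switch to its interpreted state."""
    return {name: switch_enabled(name, environ=environ)
            for name in PRESSURE_SWITCHES}


for name, state in describe_switches().items():
    print(f"{name} = {state}")
```

Logging the interpreted state of every switch at startup makes it easier to confirm which pressure path a timing run actually exercised.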

Do not replace the Thomas algorithm with parallel cyclic reduction (PCR), cyclic reduction (CR), or SPIKE-style solvers unless that is a deliberate new algorithmic project. The current production goal is to preserve the original Thomas solve and optimize data ownership and layout around it.
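
One way to picture "optimizing ownership around Thomas" is the sketch below. It assumes the field starts out split across ranks along the solve direction z, is transposed so that each rank owns complete z-columns for a subset of horizontal modes, is solved with an unmodified Thomas sweep, and is then transposed back. Ranks are simulated with Python lists instead of MPI, the per-mode matrix in `solve_mode` is made up, zero-mode handling is not modelled, and both dimensions are assumed divisible by the number of ranks.

```python
def thomas(a, b, c, d):
    """Plain Thomas solve (a: sub-, b: main, c: super-diagonal)."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for k in range(1, n):
        denom = b[k] - a[k] * cp[k - 1]
        cp[k] = c[k] / denom
        dp[k] = (d[k] - a[k] * dp[k - 1]) / denom
    x = dp[:]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x


def solve_mode(mode, rhs):
    """Hypothetical per-mode system: diagonal 2 + (mode + 1)**2, off-diagonals -1."""
    n = len(rhs)
    return thomas([-1.0] * n, [2.0 + (mode + 1) ** 2] * n, [-1.0] * n, rhs)


def z_slabs(field, nproc):
    """field[mode][z] -> one slab per rank: every mode, a contiguous z-range."""
    chunk = len(field[0]) // nproc
    return [[row[r * chunk:(r + 1) * chunk] for row in field]
            for r in range(nproc)]


def to_mode_ownership(slabs):
    """Simulated all-to-all: each rank receives all z for its subset of modes."""
    nproc, nmodes = len(slabs), len(slabs[0])
    chunk = nmodes // nproc
    return [[sum((slabs[src][m] for src in range(nproc)), [])
             for m in range(r * chunk, (r + 1) * chunk)]
            for r in range(nproc)]


def to_z_ownership(pencils):
    """Inverse transpose: back to the original z-slab layout."""
    rows = [row for rank_rows in pencils for row in rank_rows]
    return z_slabs(rows, len(pencils))


def distributed_solve(slabs):
    """Transpose, run the unchanged Thomas sweep locally, transpose back."""
    pencils = to_mode_ownership(slabs)
    chunk = len(pencils[0])
    solved = [[solve_mode(r * chunk + i, row) for i, row in enumerate(rank_rows)]
              for r, rank_rows in enumerate(pencils)]
    return to_z_ownership(solved)


# Two simulated ranks, 4 horizontal modes, 6 vertical levels.
field = [[float(m + z) for z in range(6)] for m in range(4)]
result = distributed_solve(z_slabs(field, 2))
reference = z_slabs([solve_mode(m, row) for m, row in enumerate(field)], 2)
print(result == reference)
```

The solver itself never changes in this picture; only the data movement before and after it does, which is the part a pressure-specific transpose helper would target.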