Feature/openmp #47

Merged
merged 119 commits into from
Sep 3, 2024

Conversation

fluidnumerics-joe (Member):

This PR brings in GPU support via OpenMP, removing the required dependency on HIP/HIPFort. HIPFort has proven challenging to set up with a variety of Fortran compilers, which increases the risk of users entering "dependency hell".

OpenMP offloading allows users to build CPU-only versions quite easily. If GPU offloading (or multi-core CPU offloading!) is desired, users will need a compiler that supports OpenMP 4.5 target offloading. These include

  • ROCm amdflang
  • LLVM flang
  • Nvidia-HPC nvfortran
  • Intel OneAPI (ifx)
  • Cray Fortran (CCE)

A complete list of supporting compilers can be found at https://www.openmp.org/resources/openmp-compilers-tools/#compilers

This implementation also cuts out a significant amount of code compared to the HIPFort version (with or without managed memory). This makes it more tractable, IMO, to focus on adding additional pre-canned models, alongside CLI and options configuration, within this repository.

Another side benefit: users who want to build their own models can simply concretize the flux, Riemann solver, and source terms as type-bound procedures of extensions of the various model base types (written strictly in Fortran), without having to worry about host/device memory management. With OpenMP it is fairly straightforward to offload those new models to GPUs if desired.

To hook in BLAS on GPUs, we can use the `use_device_ptr` clause of the `omp target data` directive to integrate with other packages like hipBLAS or MAGMA. This would likely come in a future PR that focuses on performance.

Main outcomes

  • Reduce dependencies
  • Simplify install process for users
  • Simplify GPU memory and kernel management
  • Reduce LOC

  • The managed memory interfaces are quite a few lines of code; I'd rather have that sit in an external dependency. We need a more robust build setup in CMake to detect the correct includes and libs for HIPFort, but for now we are ok on noether.
  • There is a pattern we can exploit for derivative, gridinterp, and boundaryinterp in 1-D.
  • We're now using the hipblas implementation of the tensor divergence.
  • Next, we need to sort out the boundary terms for the DG divergence of a tensor so that we can do DG gradients.
  • "Fixes" here means fixes to blatantly wrong OpenMP directives (e.g. missing `omp end target` or misspelled variables in `map` clauses).
Multicore and GPU offloading currently fail:

  • GPU offloading fails at build time with amdflang for gfx90a.
  • Multicore offloading fails at runtime in the first target region of every test with a segmentation fault when using amdflang.
  • Multicore offloading fails at build time with GNU compilers (13.2.0).

We can now build executables for Nvidia GPUs. Still need to do tests :)
fluidnumerics-joe (Member Author):

Currently waiting on spack/spack#44737 to merge for feq-parse/2.2.2

  • Change derivative and gridinterp methods to pure functions; this is meant to improve performance.
  • Set gradient and gridinterp operations to pure functions.
  • Set gradient, divergence, and gridinterp operations to pure functions.
  • Set gradient and gridinterp operations to pure functions.
  • Add the curl-invariant form of the contravariant basis vectors for 3-D.
  • `NVFORTRAN-S-0155 - PURE subprograms may not contain OpenMP directives`: the `loop` construct lets us avoid worrying about the different directives required for multicore and GPU architectures. See https://docs.nvidia.com/hpc-sdk/compilers/hpc-compilers-user-guide/index.html#openmp-loop
  • Build tests are only done here; run tests will be done via superci on Fluid Numerics armory systems.
  • The valgrind check is currently failing within libevent (installed by Spack), while all other tests are fine.
@garrettbyrd garrettbyrd merged commit ea1e960 into main Sep 3, 2024
8 of 11 checks passed