Feature/openmp #47
Merged
The managed memory interfaces are quite a few lines of code; I'd rather have that sit in an external dependency. We need a more robust build setup in CMake to detect the correct includes and libs for hipfort, but for now we are OK on noether.
There is a pattern we can exploit for derivative, gridinterp, and boundaryinterp in 1-D
We're now using the hipblas implementation of the tensor divergence. Next, we need to sort out the boundary terms for dg-divergence of a tensor so that we can do dg gradients.
Fixes here means fixes to blatantly wrong OpenMP directives (e.g. missing `omp end target` or misspelled variables in map clauses).
Multicore and GPU offloading currently fail:
GPU offloading fails at build time with amdflang for gfx90a.
Multicore offloading fails at runtime with a segmentation fault in the first target region of every test when using amdflang.
Multicore offloading fails at build time with GNU compilers (13.2.0).
We can now build executables for nvidia gpus. Still need to do tests :)
Currently waiting on spack/spack#44737 to merge for |
Change derivative and gridinterp methods to pure functions. This is meant to improve performance.
Set gradient and gridinterp operations to pure functions
Set gradient, divergence, and gridinterp operations to pure functions
Set gradient and gridinterp operations to pure functions
Add curl invariant form of the contravariant basis vectors for 3-D
NVFORTRAN-S-0155 - PURE subprograms may not contain OpenMP directives
The `loop` construct means we don't have to worry about the different directives required for multicore and GPU architectures. See https://docs.nvidia.com/hpc-sdk/compilers/hpc-compilers-user-guide/index.html#openmp-loop
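As a minimal sketch of what a `loop`-based offload region can look like (not code from this PR): the `axpy_sketch` module, the `axpy` routine, and the array sizes below are hypothetical names chosen for illustration. The routine is deliberately not declared `pure`, since nvfortran rejects OpenMP directives inside PURE subprograms (NVFORTRAN-S-0155 above).

```fortran
module axpy_sketch
  implicit none
contains

  ! Hypothetical example routine: y <- a*x + y.
  ! Not PURE, because nvfortran does not allow OpenMP directives in PURE subprograms.
  subroutine axpy(n, a, x, y)
    integer, intent(in) :: n
    real, intent(in) :: a
    real, intent(in) :: x(n)
    real, intent(inout) :: y(n)
    integer :: i

    ! One combined directive; the compiler decides how to map the loop
    ! onto the target (GPU or multicore CPU).
    !$omp target teams loop map(to: x) map(tofrom: y)
    do i = 1, n
      y(i) = a*x(i) + y(i)
    end do
    !$omp end target teams loop
  end subroutine axpy

end module axpy_sketch

program axpy_demo
  use axpy_sketch, only: axpy
  implicit none
  integer, parameter :: n = 1000
  real :: x(n), y(n)

  x = 1.0
  y = 2.0
  call axpy(n, 3.0, x, y)

  ! Every entry should now be 3*1 + 2 = 5.
  if (any(abs(y - 5.0) > 1.0e-6)) error stop "axpy sketch gave an unexpected result"
  print *, "axpy sketch OK"
end program axpy_demo
```

Built without OpenMP enabled, the directives are treated as comments and the loop runs serially. With nvfortran, the same source should be buildable for either target by switching between `-mp=gpu` and `-mp=multicore`; other compilers use their own offload flags.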
Build test is only done here. Run tests will be done via superci on Fluid Numerics armory systems
valgrind check is currently failing within libevent, installed by spack, while all other tests are fine.
This PR brings in GPU support via OpenMP. This removes the required dependency on HIP/HIPFort. HIPFort has proven challenging to set up with a variety of Fortran compilers, which increases the risk of users entering "dependency hell".
With OpenMP offloading, users can build CPU-only versions quite easily. If GPU offloading (or multi-core CPU offloading!) is desired, users will need a compiler that supports OpenMP 4.5 offloading. These include
amdflang
flang
nvfortran
ifx
xlf
A complete list of compilers can be found at https://www.openmp.org/resources/openmp-compilers-tools/#compilers
This implementation also cuts out a significant amount of code in comparison to the HIPFort version (with or without managed memory). This makes it more tractable, IMO, to focus on adding in additional pre-canned models alongside CLI and options configuration within this repository.
Another side benefit: users who want to build their own models can simply concretize the flux, Riemann solver, and source terms as type-bound procedures of extensions of the various model base types (written strictly in Fortran), without having to worry about host/device memory management. With OpenMP, it is fairly straightforward to offload those new models to GPUs if desired.
To hook in BLAS on GPUs, we can use OpenMP's `use_device_ptr` clause (on a `target data` construct) to integrate with other packages like hipblas or magma. This would likely come in a future PR that focuses on performance.
Main outcomes