Panzer
Version of the Day
Singleton class for accessing Kokkos hierarchical parallelism parameters.
#include <Panzer_HierarchicParallelism.hpp>
Public Member Functions
void | overrideSizes (const int &team_size, const int &vector_size, const int &fad_vector_size)
Allows the user to override the default sizes.
template<typename Scalar>
int | vectorSize () const
Returns the vector size. Specialized for AD scalar types.
void | setUseSharedMemory (const bool &use_shared_memory, const bool &fad_use_shared_memory)
Tells Kokkos kernels whether they should use shared memory. This is very problem dependent.
template<typename Scalar>
bool | useSharedMemory () const
template<typename ScalarT, typename... TeamPolicyProperties>
Kokkos::TeamPolicy<TeamPolicyProperties...> | teamPolicy (const int &league_size)
Returns a TeamPolicy for hierarchic parallelism.
Static Public Member Functions
static HP & | inst ()
Return singleton instance of this class.
Private Member Functions
HP ()
Private ctor.
Private Attributes
bool | use_auto_team_size_
If true, the team size is set with Kokkos::AUTO().
int | team_size_
User specified team size.
int | vector_size_
Default vector size for non-AD types.
int | fad_vector_size_
FAD vector size.
bool | use_shared_memory_
Use shared memory Kokkos kernels for non-FAD types.
bool | fad_use_shared_memory_
Use shared memory Kokkos kernels for FAD types.
Singleton class for accessing Kokkos hierarchical parallelism parameters.
Definition at line 52 of file Panzer_HierarchicParallelism.hpp.
panzer::HP::HP () | private
Private ctor.
Definition at line 47 of file Panzer_HierarchicParallelism.cpp.
HP & panzer::HP::inst () | static
Return singleton instance of this class.
Definition at line 62 of file Panzer_HierarchicParallelism.cpp.
void panzer::HP::overrideSizes (const int &team_size, const int &vector_size, const int &fad_vector_size)
Allows the user to override the default sizes.
Definition at line 68 of file Panzer_HierarchicParallelism.cpp.
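The singleton access pattern and the size override can be illustrated with a minimal, self-contained sketch. It does not use Panzer or Kokkos. The class name ParallelismSettings, the accessors teamSize() and useAutoTeamSize(), the default values, and the choice to switch off automatic team sizing inside overrideSizes() are all assumptions made for illustration, not the actual panzer::HP implementation.

```cpp
// Hypothetical stand-in for a settings singleton; not the real panzer::HP.
class ParallelismSettings {
public:
  // Function-local static: constructed once, on first use.
  static ParallelismSettings& inst() {
    static ParallelismSettings instance;
    return instance;
  }

  // A singleton should not be copied.
  ParallelismSettings(const ParallelismSettings&) = delete;
  ParallelismSettings& operator=(const ParallelismSettings&) = delete;

  // Replace the default sizes with user-provided values.
  // Assumption: a user-specified team size disables automatic team sizing.
  void overrideSizes(const int& team_size, const int& vector_size,
                     const int& fad_vector_size) {
    use_auto_team_size_ = false;
    team_size_ = team_size;
    vector_size_ = vector_size;
    fad_vector_size_ = fad_vector_size;
  }

  // Hypothetical read-only accessors, added so the state can be inspected.
  bool useAutoTeamSize() const { return use_auto_team_size_; }
  int teamSize() const { return team_size_; }
  int fadVectorSize() const { return fad_vector_size_; }

private:
  // Private constructor: instances can only be obtained through inst().
  ParallelismSettings() = default;

  bool use_auto_team_size_ = true;  // illustrative default
  int team_size_ = 1;               // illustrative default
  int vector_size_ = 1;             // illustrative default
  int fad_vector_size_ = 1;         // illustrative default
};
```

With this sketch, a call such as ParallelismSettings::inst().overrideSizes(4, 1, 32) changes the values seen by every later call to inst(), because all callers share the same instance.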
template<typename Scalar> int panzer::HP::vectorSize () const | inline
Returns the vector size. Specialized for AD scalar types.
NOTE: For hierarchic parallelism, if the same code is used for both Residual and Jacobian (as it is in most evaluators), the loop over the vector level is missing for Residual. For Jacobian, the loop is implemented internally in the AD types, where on CUDA the warp parallelizes over the derivative dimension. To prevent incorrect code, the vector size must be forced to 1 for non-AD scalar types. The eventual workaround is to use a SIMD data type with a similar hidden vector loop for Residual. In the meantime, this function sets the correct vector size of one for non-AD scalar types.
Definition at line 84 of file Panzer_HierarchicParallelism.hpp.
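The rule described in the note (vector size 1 for non-AD scalars, the FAD vector size for AD scalars) can be sketched without any Trilinos dependency. MockFad, IsMockAD, and VectorSizes are hypothetical names introduced here, and the default of 32 is only an illustrative value. This is a sketch of the selection rule under those assumptions, not the code in Panzer_HierarchicParallelism.hpp.

```cpp
#include <type_traits>

// Hypothetical stand-in for an AD scalar type (value plus derivative array).
struct MockFad {
  double val;
  double dx[4];
};

// Hypothetical trait marking which scalar types count as AD types.
template <typename T>
struct IsMockAD : std::false_type {};
template <>
struct IsMockAD<MockFad> : std::true_type {};

struct VectorSizes {
  int vector_size = 1;       // stored non-AD size; ignored by vectorSize() below
  int fad_vector_size = 32;  // illustrative value only

  // Non-AD scalars have no hidden vector-level loop, so they always get 1.
  // AD scalars get the FAD vector size.
  template <typename Scalar>
  int vectorSize() const {
    return IsMockAD<Scalar>::value ? fad_vector_size : 1;
  }
};
```

In this sketch the non-AD branch returns 1 even if vector_size is set to a larger value, which mirrors the safety rule stated in the note.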
void panzer::HP::setUseSharedMemory (const bool &use_shared_memory, const bool &fad_use_shared_memory)
Tells Kokkos kernels whether they should use shared memory. This is very problem dependent.
If a Panzer hierarchic kernel can use shared memory to speed up the calculation, it carries a second implementation that takes advantage of shared memory. Shared memory on the GPU is very limited. On some of the example problems, shared memory runs out if the basis is greater than order 2 on a hex mesh. Usage also depends strongly on the size of the derivative array: a large derivative array uses up memory much more quickly. By default, shared memory is always enabled for non-FAD types and disabled for FAD types, but this function can override the defaults for specific problems. For example, the adapters-stk/examples/MixedPoisson problem can use shared memory for FAD types for basis order 2 or less. It calls this function based on the basis order to improve performance.
Definition at line 78 of file Panzer_HierarchicParallelism.cpp.
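The basis-order rule from the description can be sketched as follows. SharedMemorySettings, MockFad, and configureForBasisOrder are hypothetical names. The defaults (enabled for non-FAD, disabled for FAD) and the threshold of order 2 come from the text above, but the code itself is an illustration under those assumptions and does not reproduce the Panzer implementation or the example driver.

```cpp
#include <type_traits>

// Hypothetical stand-in for an AD (FAD) scalar type.
struct MockFad {};

struct SharedMemorySettings {
  bool use_shared_memory = true;       // non-FAD default described in the text
  bool fad_use_shared_memory = false;  // FAD default described in the text

  void setUseSharedMemory(const bool& non_fad, const bool& fad) {
    use_shared_memory = non_fad;
    fad_use_shared_memory = fad;
  }

  // Select the flag that matches the scalar type of the kernel.
  template <typename Scalar>
  bool useSharedMemory() const {
    return std::is_same<Scalar, MockFad>::value ? fad_use_shared_memory
                                                : use_shared_memory;
  }
};

// Hypothetical helper: enable FAD shared memory only for basis order <= 2,
// following the MixedPoisson remark in the description.
inline void configureForBasisOrder(SharedMemorySettings& settings,
                                   int basis_order) {
  settings.setUseSharedMemory(true, basis_order <= 2);
}
```

A kernel would then branch on useSharedMemory<Scalar>() to choose between its shared-memory implementation and its plain implementation.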
template<typename Scalar> bool panzer::HP::useSharedMemory () const | inline
Definition at line 110 of file Panzer_HierarchicParallelism.hpp.
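template<typename ScalarT, typename... TeamPolicyProperties> Kokkos::TeamPolicy<TeamPolicyProperties...> panzer::HP::teamPolicy (const int &league_size) | inline
Returns a TeamPolicy for hierarchic parallelism.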
Definition at line 117 of file Panzer_HierarchicParallelism.hpp.
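How the stored parameters might be combined into a policy can be sketched with mock types. MockTeamPolicy is a hypothetical stand-in for Kokkos::TeamPolicy that only records its arguments, and kAutoTeamSize is a hypothetical sentinel standing in for Kokkos::AUTO(). The logic shown (automatic team size unless the user specified one) is an assumption based on the attribute descriptions on this page, not a copy of the real function.

```cpp
// Hypothetical sentinel standing in for Kokkos::AUTO().
constexpr int kAutoTeamSize = -1;

// Hypothetical stand-in for Kokkos::TeamPolicy; it only records its arguments.
struct MockTeamPolicy {
  int league_size;  // number of teams
  int team_size;    // threads per team, or kAutoTeamSize
  int vector_size;  // vector lanes per thread
};

struct PolicySettings {
  bool use_auto_team_size = true;  // illustrative default
  int team_size = 1;               // used only when auto sizing is off
  int vector_size = 1;             // would come from vectorSize<ScalarT>()

  // Build a policy from the league size supplied by the caller
  // and the team and vector sizes stored in the settings.
  MockTeamPolicy teamPolicy(const int& league_size) const {
    const int team = use_auto_team_size ? kAutoTeamSize : team_size;
    return MockTeamPolicy{league_size, team, vector_size};
  }
};
```

In real code the league size is typically the number of independent work items, for example the number of cells in a workset, which is why it is the only argument the caller has to provide.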
bool panzer::HP::use_auto_team_size_ | private
If true, the team size is set with Kokkos::AUTO().
Definition at line 53 of file Panzer_HierarchicParallelism.hpp.
int panzer::HP::team_size_ | private
User specified team size.
Definition at line 54 of file Panzer_HierarchicParallelism.hpp.
int panzer::HP::vector_size_ | private
Default vector size for non-AD types.
Definition at line 55 of file Panzer_HierarchicParallelism.hpp.
int panzer::HP::fad_vector_size_ | private
FAD vector size.
Definition at line 56 of file Panzer_HierarchicParallelism.hpp.
bool panzer::HP::use_shared_memory_ | private
Use shared memory Kokkos kernels for non-FAD types.
Definition at line 57 of file Panzer_HierarchicParallelism.hpp.
bool panzer::HP::fad_use_shared_memory_ | private
Use shared memory Kokkos kernels for FAD types.
Definition at line 58 of file Panzer_HierarchicParallelism.hpp.