Panzer  Version of the Day
panzer::HP Class Reference

Singleton class for accessing Kokkos hierarchical parallelism parameters.

#include <Panzer_HierarchicParallelism.hpp>

Public Member Functions

void overrideSizes (const int &team_size, const int &vector_size, const int &fad_vector_size, const bool force_override_safety=false)
 
void resetSizes ()
 Reset the sizes to default.
 
template<typename Scalar >
int vectorSize () const
 Returns the vector size. Specialized for AD scalar types.
 
void setUseSharedMemory (const bool &use_shared_memory, const bool &fad_use_shared_memory)
 Tell Kokkos kernels whether they should use shared memory. This is very problem dependent.
 
template<typename Scalar >
bool useSharedMemory () const
 
template<typename ScalarT , typename... TeamPolicyProperties>
Kokkos::TeamPolicy< TeamPolicyProperties...> teamPolicy (const int &league_size)
 Returns a TeamPolicy for hierarchic parallelism.
 
template<typename ScalarT , typename... TeamPolicyProperties, typename ExecSpace >
Kokkos::TeamPolicy< ExecSpace, TeamPolicyProperties...> teamPolicy (ExecSpace exec_space, const int &league_size)
 Returns a TeamPolicy for hierarchic parallelism using an exec_space instance (for CUDA streams).
 

Static Public Member Functions

static HP & inst ()
 Return singleton instance of this class.
 

Private Member Functions

 HP ()
 Private ctor.
 

Private Attributes

bool use_auto_team_size_
 If true, the team size is set with Kokkos::AUTO()
 
int team_size_
 User specified team size.
 
int vector_size_
 Default vector size for non-AD types.
 
int fad_vector_size_
 FAD vector size.
 
bool use_shared_memory_
 Use shared memory Kokkos kernels for non-FAD types.
 
bool fad_use_shared_memory_
 Use shared memory Kokkos kernels for FAD types.
 

Detailed Description

Singleton class for accessing Kokkos hierarchical parallelism parameters.

Definition at line 52 of file Panzer_HierarchicParallelism.hpp.

Constructor & Destructor Documentation

panzer::HP::HP ( )
private

Private constructor. Use inst() to access the singleton instance.

Definition at line 47 of file Panzer_HierarchicParallelism.cpp.

Member Function Documentation

HP & panzer::HP::inst ( )
static

Return singleton instance of this class.

Definition at line 65 of file Panzer_HierarchicParallelism.cpp.
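
As an illustration of the singleton access pattern, the following is a minimal sketch assuming a function-local static instance. The class name `Settings` and its `team_size` member are hypothetical and are not part of Panzer; the way panzer::HP stores its instance may differ.

```cpp
#include <cassert>

// Hypothetical class illustrating a singleton with a private constructor.
class Settings {
public:
  // Returns the single shared instance, constructed on first use.
  static Settings& inst()
  {
    static Settings instance;
    return instance;
  }

  // Copying is disabled so that only one instance can exist.
  Settings(const Settings&) = delete;
  Settings& operator=(const Settings&) = delete;

  int team_size = 0;  // illustrative state shared by all callers

private:
  Settings() = default;  // private: callers must go through inst()
};
```

Every call to `Settings::inst()` returns a reference to the same object, so a value set through one call is visible through the next.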

void panzer::HP::overrideSizes ( const int &  team_size,
const int &  vector_size,
const int &  fad_vector_size,
const bool  force_override_safety = false 
)

Allows the user to override the Kokkos default team and vector sizes for kernel dispatch. By default, the requested values are capped by hardware limits and rounded down to the nearest power of two.

Setting the final argument, force_override_safety, to true applies the requested values exactly as given, skipping both the power-of-two rounding and the hardware-limit cap.

Parameters
team_size              Team size requested for hierarchic kernel
vector_size            Vector size requested for hierarchic kernel for non-FAD scalar types
fad_vector_size        Vector size requested for hierarchic kernel for FAD scalar types
force_override_safety  Ignore the power of two and other checks

Definition at line 81 of file Panzer_HierarchicParallelism.cpp.
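
The capping and rounding described above can be sketched as follows. This is only an illustration, not Panzer's implementation: `roundDownToPowerOfTwo`, `adjustSize`, and the `hardware_max` argument are hypothetical names introduced here, and applying the cap before the rounding is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: largest power of two that is <= n (returns 1 for n < 1).
inline int roundDownToPowerOfTwo(int n)
{
  int p = 1;
  while (p <= n / 2)
    p *= 2;
  return p;
}

// Hypothetical helper mirroring the documented behaviour of overrideSizes():
// cap the request at a hardware limit, then round down to a power of two.
// With force_override_safety == true the request is returned unchanged.
inline int adjustSize(int requested, int hardware_max,
                      bool force_override_safety = false)
{
  if (force_override_safety)
    return requested;
  return roundDownToPowerOfTwo(std::min(requested, hardware_max));
}
```

Under these assumptions a request of 48 with a limit of 1024 becomes 32, while the same request with `force_override_safety` set to true stays at 48.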

void panzer::HP::resetSizes ( )
inline

Reset the sizes to default.

Definition at line 84 of file Panzer_HierarchicParallelism.hpp.

template<typename Scalar >
int panzer::HP::vectorSize ( ) const
inline

Returns the vector size. Specialized for AD scalar types.

NOTE: For hierarchic parallelism, when the same code serves both Residual and Jacobian evaluation (as it does in most evaluators), the Residual path has no loop over the vector level. For Jacobian, that loop is implemented internally in the AD types, where on CUDA the warp parallelizes over the derivative dimension. To prevent incorrect code, the vector size must be forced to 1 for non-AD scalar types. An eventual workaround is to use a SIMD data type with a similar hidden vector loop for Residual. In the meantime, this function returns the correct vector size of one for non-AD types.

Definition at line 99 of file Panzer_HierarchicParallelism.hpp.
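
The dispatch on scalar type described in the note can be sketched with a type trait. This is a minimal sketch under stated assumptions: `MockFad`, `IsMockAD`, and `VectorSizes` are hypothetical stand-ins introduced for illustration, and the value 32 is arbitrary. A real code base would use the trait supplied by its AD library.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a forward-mode AD scalar type.
struct MockFad { double value; double derivatives[8]; };

// Hypothetical trait: true only for the mock AD type above.
template<typename Scalar> struct IsMockAD : std::false_type {};
template<> struct IsMockAD<MockFad> : std::true_type {};

// Hypothetical holder for the two vector sizes.
struct VectorSizes {
  int vector_size = 1;       // non-AD types: forced to 1, as the note explains
  int fad_vector_size = 32;  // AD types: illustrative value only

  // Selects the vector size from the scalar type at compile time.
  template<typename Scalar>
  int vectorSize() const
  {
    return IsMockAD<Scalar>::value ? fad_vector_size : vector_size;
  }
};
```

With these stand-ins, `vectorSize<double>()` yields the non-AD size and `vectorSize<MockFad>()` yields the AD size, so one templated kernel can request the appropriate value for each evaluation type.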

void panzer::HP::setUseSharedMemory ( const bool &  use_shared_memory,
const bool &  fad_use_shared_memory 
)

Tell Kokkos kernels whether they should use shared memory. This is very problem dependent.

If a Panzer hierarchic kernel can use shared memory to speed up the calculation, it carries a second implementation that takes advantage of shared memory. Shared memory on the GPU is very limited. On some of the example problems, shared memory runs out if the basis order is greater than 2 on a hex mesh. Usage also depends strongly on the size of the derivative array: a large derivative array consumes memory much more quickly. By default, shared memory is always enabled for non-FAD types and disabled for FAD types, but this function can override those defaults for specific problems. For example, the adapters-stk/examples/MixedPoisson problem can use shared memory for FAD types when the basis order is 2 or less, and it calls this function based on the basis order to improve performance.

Definition at line 105 of file Panzer_HierarchicParallelism.cpp.
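
A caller might choose the two flags from the basis order along the lines below. This is a sketch, not code from Panzer or its examples: `SharedMemoryFlags` and `chooseSharedMemory` are hypothetical names, and the threshold of 2 is taken from the example problem mentioned above rather than from any general rule.

```cpp
#include <cassert>

// Hypothetical stand-in for the two flags that setUseSharedMemory() controls.
struct SharedMemoryFlags {
  bool use_shared_memory = true;       // documented default for non-FAD types
  bool fad_use_shared_memory = false;  // documented default for FAD types
};

// Hypothetical policy: keep shared memory on for non-FAD types and enable it
// for FAD types only when the basis order is 2 or less.
inline SharedMemoryFlags chooseSharedMemory(int basis_order)
{
  SharedMemoryFlags flags;
  flags.fad_use_shared_memory = (basis_order <= 2);
  return flags;
}
```

An application would then pass the result on, for example as `panzer::HP::inst().setUseSharedMemory(flags.use_shared_memory, flags.fad_use_shared_memory)`.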

template<typename Scalar >
bool panzer::HP::useSharedMemory ( ) const
inline

Definition at line 125 of file Panzer_HierarchicParallelism.hpp.

template<typename ScalarT , typename... TeamPolicyProperties>
Kokkos::TeamPolicy<TeamPolicyProperties...> panzer::HP::teamPolicy ( const int &  league_size)
inline

Returns a TeamPolicy for hierarchic parallelism.

Definition at line 132 of file Panzer_HierarchicParallelism.hpp.

template<typename ScalarT , typename... TeamPolicyProperties, typename ExecSpace >
Kokkos::TeamPolicy<ExecSpace, TeamPolicyProperties...> panzer::HP::teamPolicy ( ExecSpace  exec_space,
const int &  league_size 
)
inline

Returns a TeamPolicy for hierarchic parallelism using an exec_space instance (for CUDA streams).

Definition at line 145 of file Panzer_HierarchicParallelism.hpp.
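
How the stored parameters could combine into a policy is sketched below without any Kokkos dependency. Everything here is a hypothetical stand-in: `MockTeamPolicy` only records its arguments, `kAutoTeamSize` plays the role of Kokkos::AUTO(), and the `scalar_is_fad` flag replaces the `ScalarT` template parameter. The actual member functions may be implemented differently.

```cpp
#include <cassert>

// Hypothetical sentinel standing in for Kokkos::AUTO().
constexpr int kAutoTeamSize = -1;

// Hypothetical stand-in for Kokkos::TeamPolicy: it only records its arguments.
struct MockTeamPolicy {
  int league_size;
  int team_size;    // kAutoTeamSize means "let the runtime choose"
  int vector_size;
};

// Hypothetical holder for the parameters listed under Private Attributes.
struct MockHP {
  bool use_auto_team_size = true;
  int team_size = 1;
  int vector_size = 1;
  int fad_vector_size = 1;

  // Builds a policy from the league size and the stored team/vector sizes.
  MockTeamPolicy teamPolicy(int league_size, bool scalar_is_fad) const
  {
    const int vs = scalar_is_fad ? fad_vector_size : vector_size;
    const int ts = use_auto_team_size ? kAutoTeamSize : team_size;
    return MockTeamPolicy{league_size, ts, vs};
  }
};
```

With the real class, a call consistent with the signature above would look like `panzer::HP::inst().teamPolicy<ScalarT>(num_cells)`, where `num_cells` is a hypothetical league size chosen by the caller.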

Member Data Documentation

bool panzer::HP::use_auto_team_size_
private

If true, the team size is set with Kokkos::AUTO()

Definition at line 53 of file Panzer_HierarchicParallelism.hpp.

int panzer::HP::team_size_
private

User specified team size.

Definition at line 54 of file Panzer_HierarchicParallelism.hpp.

int panzer::HP::vector_size_
private

Default vector size for non-AD types.

Definition at line 55 of file Panzer_HierarchicParallelism.hpp.

int panzer::HP::fad_vector_size_
private

FAD vector size.

Definition at line 56 of file Panzer_HierarchicParallelism.hpp.

bool panzer::HP::use_shared_memory_
private

Use shared memory Kokkos kernels for non-FAD types.

Definition at line 57 of file Panzer_HierarchicParallelism.hpp.

bool panzer::HP::fad_use_shared_memory_
private

Use shared memory Kokkos kernels for FAD types.

Definition at line 58 of file Panzer_HierarchicParallelism.hpp.


The documentation for this class was generated from the following files: