Zoltan2
A Dragonfly (e.g. Cori, Trinity, & Theta) Machine Class for task mapping.
#include <Zoltan2_MachineDragonflyRCA.hpp>
Public Member Functions | |
MachineDragonflyRCA (const Teuchos::Comm< int > &comm) | |
Constructor: Dragonfly (e.g. Cori & Trinity) network machine description. | |
virtual bool | getMachineExtentWrapArounds (bool *wrap_around) const |
MachineDragonflyRCA (const Teuchos::Comm< int > &comm, const Teuchos::ParameterList &pl_) | |
Constructor: Dragonfly (e.g. Cori & Trinity) network machine description. | |
virtual | ~MachineDragonflyRCA () |
bool | hasMachineCoordinates () const |
int | getMachineDim () const |
bool | getTransformedMachineExtent (int *nxyz) const |
bool | getActualMachineExtent (int *nxyz) const |
bool | getMachineExtent (int *nxyz) const |
part_t | getNumUniqueGroups () const override |
Returns the number of unique Dragonfly network groups in the provided allocation. | |
bool | getGroupCount (part_t *grp_count) const override |
void | printAllocation () |
bool | getMyTransformedMachineCoordinate (pcoord_t *xyz) |
bool | getMyActualMachineCoordinate (pcoord_t *xyz) |
bool | getMyMachineCoordinate (pcoord_t *xyz) |
bool | getMachineCoordinate (const int rank, pcoord_t *xyz) const |
bool | getMachineCoordinate (const char *nodename, pcoord_t *xyz) |
bool | getAllMachineCoordinatesView (pcoord_t **&allCoords) const |
virtual bool | getHopCount (int rank1, int rank2, pcoord_t &hops) const override |
Sets hops to the number of hops between rank1 and rank2; returns true if coordinates are available. | |
Public Member Functions inherited from Zoltan2::Machine< pcoord_t, part_t > | |
Machine (const Teuchos::Comm< int > &comm) | |
Constructor MachineRepresentation Class. | |
virtual | ~Machine () |
bool | hasMachineCoordinates () const |
Indicates whether the machine has coordinates. | |
int | getMachineDim () const |
Returns the dimension (number of coordinates per node) of the machine. | |
bool | getMachineExtent (int *nxyz) const |
Sets the number of unique coordinates in each machine dimension; returns true if coordinates are available. | |
bool | getMachineExtentWrapArounds (bool *wrap_around) const |
Sets whether the machine has a wrap-around torus link in each dimension; returns true if the information is available. | |
bool | getMyMachineCoordinate (pcoord_t *xyz) const |
Sets the machine coordinate xyz of the current process; returns true if the current process's coordinates are available. | |
bool | getMachineCoordinate (const int rank, pcoord_t *xyz) const |
Sets the machine coordinate xyz of the process with the given rank; returns true if coordinates are available by rank. | |
bool | getMachineCoordinate (const char *nodename, pcoord_t *xyz) const |
Sets the machine coordinate xyz of the node with the given nodename; returns true if coordinates are available by nodename. | |
bool | getAllMachineCoordinatesView (pcoord_t **allCoords) const |
Sets the coordinates of all ranks: allCoords[i][j], i = 0, ..., getMachineDim() - 1, j = 0, ..., getNumRanks() - 1, is the i-th dimensional coordinate for rank j. Returns true if coordinates are available for all ranks. | |
int | getNumRanks () const |
Returns the number of ranks. | |
virtual bool | getGroupCount (part_t *grp_count) const |
Returns the number of ranks in each group (RCA X-dim, i.e. the first dimension). | |
Additional Inherited Members | |
Protected Attributes inherited from Zoltan2::Machine< pcoord_t, part_t > | |
int | numRanks |
int | myRank |
A Dragonfly (e.g. Cori, Trinity, & Theta) Machine Class for task mapping.
Requires the RCA library to run and the CMake option -D ZOLTAN2_MACHINE_DRAGONFLY:BOOL=ON.
Nodes in Cori, for example, are divided into groups (RCA x_dim) of 384 nodes (16 switch columns * 6 switch rows * 4 nodes/switch), and all groups are connected all-to-all. Within a group, clusters of 4 nodes are arranged into 6 rows (RCA y_dim) and 16 columns (RCA z_dim). All nodes within a row are connected all-to-all, and likewise for all nodes within a column. Therefore (3, 2, 1) -> (3, 4, 11) takes 2 hops, and (5, 1, 1) -> (5, 1, 5) takes 1 hop.
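The hop counts in the two examples above can be reproduced with a small sketch. The helper intraGroupHops is hypothetical and is not part of Zoltan2; it assumes both nodes are in the same group and returns no value otherwise, because the description above only covers routing inside a group.

```cpp
#include <array>
#include <optional>

// RCA coordinate: {group (x), row (y), column (z)}.
using RcaCoord = std::array<int, 3>;

// Hypothetical helper (not the Zoltan2 API): hop count between two nodes
// that share a group. Returns std::nullopt when the groups differ, since
// inter-group hop counts are not described in the text above.
std::optional<int> intraGroupHops(const RcaCoord &a, const RcaCoord &b) {
  if (a[0] != b[0]) {
    return std::nullopt;
  }
  int hops = 0;
  if (a[1] != b[1]) ++hops;  // different row index: one hop
  if (a[2] != b[2]) ++hops;  // different column index: one hop
  return hops;
}
```

Under these assumptions, intraGroupHops({3, 2, 1}, {3, 4, 11}) yields 2 and intraGroupHops({5, 1, 1}, {5, 1, 5}) yields 1, matching the examples.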
We represent this "nearness" by transforming the RCA coordinates into a high-dimensional coordinate system of size (1 + N_y + N_z). The first element represents the group, the next N_y elements represent the row, and the last N_z elements represent the column.
(9, 2, 1) in the high-dim space:
(9,| 0, 0, 1, 0, 0, 0,| 0, 1, 0, 0, ..., 0)
where the first element is the group (1 element), the next N_y elements are the row, and the last N_z elements are the column.
To assist with MultiJagged coordinate partitioning, we stretch the dimensions. If the RCA coords are (3, 2, 14), we first transform X by
X_new = 3 * X * N_Y * N_Z;
With N_Y = 6 and N_Z = 16, the transformed coords are (864, 2, 14), and in high-dim space:
(3, 2, 14) -> (864, 2, 14) ->
(864,| 0, 0, 1, 0, 0, 0,| 0, ..., 0, 1, 0)
Coordinates within a group are now distance sqrt(2) apart if they are 1 hop apart, and distance 2 apart if they are 2 hops apart.
NOTE: Does not account for dragonfly's dynamic routing
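As an illustration of the transformation described above, the following sketch maps an RCA coordinate to the (1 + N_y + N_z)-dimensional vector and computes squared Euclidean distances. The names toHighDim and squaredDistance are hypothetical and are not the Zoltan2 implementation; the sketch assumes 0 <= y < n_y and 0 <= z < n_z.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the Zoltan2 implementation): maps an RCA
// coordinate (x, y, z) to a vector of length 1 + n_y + n_z.
// Assumes 0 <= y < n_y and 0 <= z < n_z.
std::vector<long> toHighDim(int x, int y, int z, int n_y, int n_z) {
  std::vector<long> v(1 + n_y + n_z, 0);
  v[0] = 3L * x * n_y * n_z;  // stretched group coordinate, X_new
  v[1 + y] = 1;               // one-hot encoding of the row
  v[1 + n_y + z] = 1;         // one-hot encoding of the column
  return v;
}

// Squared Euclidean distance between two vectors of equal length.
long squaredDistance(const std::vector<long> &a, const std::vector<long> &b) {
  long sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const long d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}
```

With n_y = 6 and n_z = 16, toHighDim(3, 2, 14, 6, 16) has 864 as its first element. The squared distance between the images of (5, 1, 1) and (5, 1, 5) is 2 (distance sqrt(2), 1 hop), and between the images of (3, 2, 1) and (3, 4, 11) it is 4 (distance 2, 2 hops).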
Definition at line 82 of file Zoltan2_MachineDragonflyRCA.hpp.
MachineDragonflyRCA (const Teuchos::Comm< int > &comm) | inline |
Constructor: Dragonfly (e.g. Cori & Trinity) network machine description.
Does not perform the coordinate transformation.
comm | Communication object. |
Definition at line 93 of file Zoltan2_MachineDragonflyRCA.hpp.
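MachineDragonflyRCA (const Teuchos::Comm< int > &comm, const Teuchos::ParameterList &pl_) | inline |
Constructor: Dragonfly (e.g. Cori & Trinity) network machine description.
Performs the coordinate transformation if the parameter list sets "Machine Optimization Level" to a value greater than 0.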
comm | Communication object. |
pl | Parameter List |
Definition at line 190 of file Zoltan2_MachineDragonflyRCA.hpp.
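virtual ~MachineDragonflyRCA () | inline virtual |
Definition at line 361 of file Zoltan2_MachineDragonflyRCA.hpp.
virtual bool getMachineExtentWrapArounds (bool *wrap_around) const | inline virtual |
Definition at line 176 of file Zoltan2_MachineDragonflyRCA.hpp.
bool hasMachineCoordinates () const | inline |
Definition at line 386 of file Zoltan2_MachineDragonflyRCA.hpp.
int getMachineDim () const | inline |
Definition at line 389 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getTransformedMachineExtent (int *nxyz) const | inline |
Definition at line 397 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getActualMachineExtent (int *nxyz) const | inline |
Definition at line 409 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getMachineExtent (int *nxyz) const | inline |
Definition at line 425 of file Zoltan2_MachineDragonflyRCA.hpp.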
part_t getNumUniqueGroups () const override | inline override virtual |
Returns the number of unique Dragonfly network groups in the provided allocation.
This equals the length of the group_count member data if it is available; otherwise the whole allocation is considered to be one group.
Reimplemented from Zoltan2::Machine< pcoord_t, part_t >.
Definition at line 435 of file Zoltan2_MachineDragonflyRCA.hpp.
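The relationship between per-group rank counts and the number of unique groups can be sketched as follows. The helper countRanksPerGroup is hypothetical; how Zoltan2 actually fills group_count is not shown on this page, and the sketch assumes each rank's group id (its RCA x coordinate) is already known. The fallback to a single group when group_count is unavailable is not modeled.

```cpp
#include <map>
#include <vector>

// Hypothetical helper (not the Zoltan2 implementation): given the group id
// (RCA x coordinate) of every rank, returns the number of ranks in each
// unique group, ordered by ascending group id. The length of the result is
// the number of unique groups in the allocation.
std::vector<int> countRanksPerGroup(const std::vector<int> &group_of_rank) {
  std::map<int, int> counts;  // group id -> number of ranks in that group
  for (int g : group_of_rank) {
    ++counts[g];
  }
  std::vector<int> group_count;
  group_count.reserve(counts.size());
  for (const auto &kv : counts) {
    group_count.push_back(kv.second);
  }
  return group_count;
}
```

For example, ranks placed in groups {3, 3, 5, 9, 9, 9} give counts {2, 1, 3}, so the allocation spans three unique groups.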
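bool getGroupCount (part_t *grp_count) const override | inline override |
Definition at line 440 of file Zoltan2_MachineDragonflyRCA.hpp.
void printAllocation () | inline |
Definition at line 454 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getMyTransformedMachineCoordinate (pcoord_t *xyz) | inline |
Definition at line 492 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getMyActualMachineCoordinate (pcoord_t *xyz) | inline |
Definition at line 505 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getMyMachineCoordinate (pcoord_t *xyz) | inline |
Definition at line 537 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getMachineCoordinate (const int rank, pcoord_t *xyz) const | inline |
Definition at line 547 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getMachineCoordinate (const char *nodename, pcoord_t *xyz) | inline |
Definition at line 563 of file Zoltan2_MachineDragonflyRCA.hpp.
bool getAllMachineCoordinatesView (pcoord_t **&allCoords) const | inline |
Definition at line 568 of file Zoltan2_MachineDragonflyRCA.hpp.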
virtual bool getHopCount (int rank1, int rank2, pcoord_t &hops) const override | inline override virtual |
Sets hops to the number of hops between rank1 and rank2; returns true if coordinates are available.
Reimplemented from Zoltan2::Machine< pcoord_t, part_t >.
Definition at line 581 of file Zoltan2_MachineDragonflyRCA.hpp.