Use Epetra sparse matrix and dense vector objects to implement a simple iteration (the power method)
Lesson topics
This lesson demonstrates the following:
-
How to construct a Epetra_CrsMatrix (a distributed sparse matrix)
-
How to modify the entries of a previously constructed CrsMatrix
-
How to use CrsMatrix and Vector to implement a simple iterative eigensolver (the power method)
Relation to other lessons
Before starting this lesson, it helps to have finished Epetra Lesson 01: Initialization (how to initialize Epetra) and Epetra_Vector (how to create distributions and distributed vectors). After completing this lesson, you might want to learn about more efficient ways to add or modify entries in a Epetra sparse matrix.
Creating and filling a sparse matrix
This is the first lesson in the usual sequence which covers adding entries to ("filling") a Epetra sparse matrix, and modifying the values of those entries after creating the matrix. Creating and filling a Epetra sparse matrix involves the following steps:
-
Create the CrsMatrix (by calling one of its constructors)
-
Call methods to add entries to the sparse matrix
-
Call the matrix's
FillComplete()
method
We will explain each of these steps in turn.
Creating the CrsMatrix
Epetra's sparse matrices are distributed over one or more parallel processes, just like vectors or other distributed objects. Also just like vectors, you have to tell the sparse matrix its distribution on construction. Unlike vectors, though, sparse matrices have two dimensions over which to be distributed: rows and columns.
Many users are perfectly happy ignoring the column distribution and just distributing the matrix in "one-dimensional" fashion over rows. In that case, you need only supply the "row Map" to the constructor. This implies that for any row which a process owns, that process may insert entries in any column in that row.
Some users want to use the full flexibility of distributing both the rows and columns of the matrix over processes. This "two-dimensional" distribution, if chosen optimally, can significantly reduce the amount of communication needed for distributed-memory parallel sparse matrix-vector multiply. Trilinos packages like Zoltan can help you compute this distribution. In that case, you may give both the "row
Map" and the "column Map" to the constructor. This implies that for any row which a process owns, that process may insert entries in any column in that row which that process owns in its column Map.
Finally, other users already know the structure of the sparse matrix, and just want to fill in values. These users should first create the graph (as an Epetra_CrsGraph), call FillComplete()
on the graph, and then give the graph to the constructor of CrsMatrix. The graph may have either a "1-D" or "2-D" distribution, as mentioned above.
Adding entries to the sparse matrix
Methods of CrsMatrix that start with "Insert" actually change the structure of the sparse matrix. Methods that start with "Replace" or "SumInto" only modify existing values.
Calling \c FillComplete()
Calling FillComplete()
signals that you are done changing the structure (if allowed) or values of the sparse matrix. This is an expensive operation, because it both rearranges local data, and communicates in order to build reusable communication patterns for sparse matrix-vector multiply. You should try to amortize the cost of this operation whenever possible over many sparse matrix-vector multiplies.
FillComplete()
takes two arguments:
-
the domain Map (the distribution of the input vector x in a sparse matrix-vector multiply y = A*x)
-
the range Map (the distribution of the output vector y in a sparse matrix-vector multiply y = A*x)
Both the domain and range Maps must be one-to-one: that is, each global index in the Map must be uniquely owned by one and only one process. You will need to supply these two arguments to FillComplete()
under any of the following conditions:
-
When the row Map is not the same as the range Map (it can't be if the row Map is not one to one)
-
When the domain and range Maps are not equal (e.g., if the matrix is not square)
-
When the domain or range Map as not the same as the row Map
If the domain and range Maps equal the row Map and the row Map is one-to-one, then you may call FillComplete()
with no arguments.
Once you have called FillComplete()
, you may modify the values in the sparse matrix, but you may not modify its graph structure.
Code example
The following code example shows how to fill and compute with a Epetra sparse matrix, using the procedure discussed in the text above.
#include <Epetra_config.h>
#ifdef HAVE_MPI
# include <mpi.h>
# include <Epetra_MpiComm.h>
#else
# include <Epetra_SerialComm.h>
#endif // HAVE_MPI
#include <Epetra_CrsMatrix.h>
#include <Epetra_Map.h>
#include <Epetra_Vector.h>
#include <Epetra_Version.h>
#include <sstream>
#include <stdexcept>
double
const int niters,
const double tolerance);
int
main (int argc, char *argv[])
{
using std::cout;
using std::endl;
#ifdef HAVE_MPI
MPI_Init (&argc, &argv);
#else
#endif // HAVE_MPI
const int myRank = comm.
MyPID ();
const int numProcs = comm.
NumProc ();
if (myRank == 0) {
cout << Epetra_Version () << endl << endl
<< "Total number of processes: " << numProcs << endl;
}
#ifdef EPETRA_NO_32BIT_GLOBAL_INDICES
typedef long long global_ordinal_type;
#else
typedef int global_ordinal_type;
#endif // EPETRA_NO_32BIT_GLOBAL_INDICES
const global_ordinal_type numGlobalElements = 50;
const global_ordinal_type indexBase = 0;
Epetra_Map map (numGlobalElements, indexBase, comm);
const int numMyElements = map.NumMyElements ();
global_ordinal_type* myGlobalElements = NULL;
#ifdef EPETRA_NO_32BIT_GLOBAL_INDICES
myGlobalElements = map.MyGlobalElements64 ();
#else
myGlobalElements = map.MyGlobalElements ();
#endif // EPETRA_NO_32BIT_GLOBAL_INDICES
if (numMyElements > 0 && myGlobalElements == NULL) {
throw std::logic_error ("Failed to get the list of global indices");
}
if (myRank == 0) {
cout << endl << "Creating the sparse matrix" << endl;
}
int lclerr = 0;
double tempVals[3];
global_ordinal_type tempGblInds[3];
for (int i = 0; i < numMyElements; ++i) {
if (myGlobalElements[i] == 0) {
tempVals[0] = 2.0;
tempVals[1] = -1.0;
tempGblInds[0] = myGlobalElements[i];
tempGblInds[1] = myGlobalElements[i] + 1;
if (lclerr == 0) {
lclerr = A.InsertGlobalValues (myGlobalElements[i], 2, tempVals, tempGblInds);
}
if (lclerr != 0) {
break;
}
}
else if (myGlobalElements[i] == numGlobalElements - 1) {
tempVals[0] = -1.0;
tempVals[1] = 2.0;
tempGblInds[0] = myGlobalElements[i] - 1;
tempGblInds[1] = myGlobalElements[i];
if (lclerr == 0) {
lclerr = A.InsertGlobalValues (myGlobalElements[i], 2, tempVals, tempGblInds);
}
if (lclerr != 0) {
break;
}
}
else {
tempVals[0] = -1.0;
tempVals[1] = 2.0;
tempVals[2] = -1.0;
tempGblInds[0] = myGlobalElements[i] - 1;
tempGblInds[1] = myGlobalElements[i];
tempGblInds[2] = myGlobalElements[i] + 1;
if (lclerr == 0) {
lclerr = A.InsertGlobalValues (myGlobalElements[i], 3, tempVals, tempGblInds);
}
if (lclerr != 0) {
break;
}
}
}
int gblerr = 0;
(void) comm.
MaxAll (&lclerr, &gblerr, 1);
if (gblerr != 0) {
throw std::runtime_error ("Some process failed to insert an entry.");
}
gblerr = A.FillComplete ();
if (gblerr != 0) {
std::ostringstream os;
os << "A.FillComplete() failed with error code " << gblerr << ".";
throw std::runtime_error (os.str ());
}
const int niters = 500;
const double tolerance = 1.0e-2;
double lambda = powerMethod (A, niters, tolerance);
if (myRank == 0) {
cout << endl << "Estimated max eigenvalue: " << lambda << endl;
}
if (myRank == 0) {
cout << endl << "Increasing magnitude of A(0,0), solving again" << endl;
}
if (A.RowMap ().MyGID (0)) {
const global_ordinal_type gidOfFirstRow = 0;
const int lidOfFirstRow = A.RowMap ().LID (gidOfFirstRow);
int numEntriesInRow = A.NumMyEntries (lidOfFirstRow);
double* rowvals = new double [numEntriesInRow];
global_ordinal_type* rowinds = new global_ordinal_type [numEntriesInRow];
if (lclerr == 0) {
lclerr = A.ExtractGlobalRowCopy (gidOfFirstRow,
numEntriesInRow, numEntriesInRow,
rowvals, rowinds);
}
if (lclerr == 0) {
for (int i = 0; i < numEntriesInRow; ++i) {
if (rowinds[i] == gidOfFirstRow) {
rowvals[i] *= 10.0;
}
}
if (lclerr == 0) {
lclerr = A.ReplaceGlobalValues (gidOfFirstRow, numEntriesInRow,
rowvals, rowinds);
}
}
if (rowvals != NULL) {
delete [] rowvals;
}
if (rowinds != NULL) {
delete [] rowinds;
}
}
gblerr = 0;
(void) comm.
MaxAll (&lclerr, &gblerr, 1);
if (gblerr != 0) {
throw std::runtime_error ("One of the owning process(es) of global "
"row 0 failed to replace an entry.");
}
lambda = powerMethod (A, niters, tolerance);
if (myRank == 0) {
cout << endl << "Estimated max eigenvalue: " << lambda << endl;
}
if (myRank == 0) {
cout << "End Result: TEST PASSED" << endl;
}
#ifdef HAVE_MPI
(void) MPI_Finalize ();
#endif // HAVE_MPI
return 0;
}
double
const int niters,
const double tolerance)
{
using std::cout;
using std::endl;
const int myRank = comm.
MyPID ();
int lclerr = 0;
lclerr = z.Random ();
double lambda = 0.0;
double normz = 0.0;
double residual = 0.0;
const double zero = 0.0;
const double one = 1.0;
const int reportFrequency = 10;
for (int iter = 0; iter < niters; ++iter) {
lclerr = (lclerr == 0) ? z.Norm2 (&normz) : lclerr;
lclerr = (lclerr == 0) ? q.Scale (one / normz, z) : lclerr;
lclerr = (lclerr == 0) ? A.
Apply (q, z) : lclerr;
lclerr = (lclerr == 0) ? q.Dot (z, &lambda) : lclerr;
if (iter % reportFrequency == 0 || iter + 1 == niters) {
lclerr = (lclerr == 0) ? resid.Update (one, z, -lambda, q, zero) : lclerr;
lclerr = (lclerr == 0) ? resid.Norm2 (&residual) : lclerr;
if (myRank == 0) {
cout << "Iteration " << iter << ":" << endl
<< "- lambda = " << lambda << endl
<< "- ||A*q - lambda*q||_2 = " << residual << endl;
}
}
if (residual < tolerance) {
if (myRank == 0) {
cout << "Converged after " << iter << " iterations" << endl;
}
break;
} else if (iter + 1 == niters) {
if (myRank == 0) {
cout << "Failed to converge after " << niters << " iterations" << endl;
}
break;
}
}
int gblerr = 0;
(void) comm.
MaxAll (&lclerr, &gblerr, 1);
if (gblerr != 0) {
throw std::runtime_error ("The power method failed in some way.");
}
return lambda;
}