doc/html/Amesos__Merikos_8h_source.html

 This file is out of date.  It has not been refactored to use Amesos_Status.


 /*

 Task list:

     Amesos_Merikos.h

     Amesos_Merikos.cpp

       Partition the matrix - store L as L^T?

       Build tree

       Initial data redistribution

       Change row and column ownership (pass them up to the parent)

     Amesos_Component_Solver.h

     Amesos_BTF.h

     Amesos_BTF.cpp


   Communications issues/challenges:

   **  Redistributing the original matrix to the arrowhead form that we need, options:

       1)  Two matrices:  L^T and U

       2)  One matrix:  U | L^T

       3)  Intermediate "fat" matrix - the way that I did ScaLAPACK
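
       A minimal sketch of option 1 (two matrices, L^T and U), in plain C++
       rather than Epetra.  The names Entries, SplitMatrix and SplitIntoUAndLT
       are hypothetical, and putting the diagonal in U is an assumption.  Once
       L is stored transposed, row k of U and column k of L share the key k,
       so they can be given to the same owner.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-in for a sparse matrix: (row, column) -> value.
using Entries = std::map<std::pair<int, int>, double>;

struct SplitMatrix {
  Entries U;   // entries with column >= row, stored at (row, column)
  Entries LT;  // entries with column <  row, stored transposed at (column, row)
};

// Split A into U and L^T.  Assumption: the diagonal is kept in U.
SplitMatrix SplitIntoUAndLT(const Entries& A) {
  SplitMatrix out;
  for (const auto& e : A) {
    const int row = e.first.first;
    const int col = e.first.second;
    if (col >= row) {
      out.U[{row, col}] = e.second;
    } else {
      out.LT[{col, row}] = e.second;  // transpose the strictly lower part
    }
  }
  return out;
}
```

       Option 2 (one matrix, U | L^T) would append the LT entries to the same
       rows with shifted column indices instead of keeping a second matrix.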


   **  Adding the Schur complements SC_0 and SC_1 to the original

           trailing matrix A_2, owned by the parent

       1)  Precompute the final size and shape of A_2 + SC_0 + SC_1

       2)  Perform A_2 + SC_0 + SC_1 in an empty matrix of size n by n

           and extract the non-empty rows.

       CHALLENGES:

         A)  Only process 0/1 knows the size and map for SC_0/SC_1

         B)  It would be nice to allow SC_0 and SC_1 to be sent as soon as

             they are available

         C)  It would be nice to have just one copy of the matrix on the

             parent.  Hence, it would be nice to know the shape of

             A_2 + SC_0 + SC_1 in advance.

         D)  An import would do the trick provided that we know both maps

             in advance.  But, neither map is available to us in advance.

             The original map (which has SC_0 on process 0 and SC_1 on

             process 1) is not known.
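
       A minimal sketch of option 1 above: compute the shape of
       A_2 + SC_0 + SC_1 as the union of the three sparsity patterns, then
       accumulate into a matrix of exactly that shape.  This is plain
       single-process C++, not Epetra; SparseRows, Pattern, UnionPattern and
       SumWithShape are hypothetical names.  It sidesteps challenge A by
       assuming all three operands are already visible in one place.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a sparse matrix: global row -> (global column -> value).
using SparseRows = std::map<int, std::map<int, double>>;
// Sparsity pattern: global row -> set of global columns.
using Pattern = std::map<int, std::set<int>>;

// Precompute the shape of the sum as the union of the operand patterns.
Pattern UnionPattern(const std::vector<const SparseRows*>& terms) {
  Pattern shape;
  for (const SparseRows* m : terms)
    for (const auto& row : *m)
      for (const auto& entry : row.second)
        shape[row.first].insert(entry.first);
  return shape;
}

// Allocate zeros in exactly that shape, then add every term into it.
SparseRows SumWithShape(const Pattern& shape,
                        const std::vector<const SparseRows*>& terms) {
  SparseRows sum;
  for (const auto& row : shape)
    for (int col : row.second)
      sum[row.first][col] = 0.0;
  for (const SparseRows* m : terms)
    for (const auto& row : *m)
      for (const auto& entry : row.second)
        sum.at(row.first).at(entry.first) += entry.second;  // never grows the shape
  return sum;
}
```

       Because the second pass only uses at(), it cannot allocate new entries,
       which is the "one copy of the matrix on the parent" property from C.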

       QUESTION:

         Should the maps be in some global address space or should they be

         in a local address space?

         I'd like to keep them in the global address space as long as possible,

         but we can't do the import of the A_2 + SC_0 + SC_1 in a global

         address space because that would require a map that changes at each


   **  Redistributing the right hand side vector, b

       If we create a map that reflects the post-pivoting reality, assigning

       each row of U and each column of L to the process that owns the diagonal

       entry, we can redistribute the right hand side vector, b, to the

       processes where the values of b will first be used, in a single, efficient,

       import operation.
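
       A minimal sketch of that redistribution, simulated on one process in
       plain C++.  It assumes the post-pivoting owner of each diagonal entry
       is already known (diagOwner); RedistributeRhs is a hypothetical name,
       and the real code would presumably build a map and do an import rather
       than fill per-process containers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// diagOwner[i] is the process that owns diagonal entry (i, i) after pivoting
// (assumed to be known here).  Returns, for each process, the entries of b it
// receives, keyed by global row index.
std::vector<std::map<int, double>> RedistributeRhs(
    const std::vector<int>& diagOwner,
    const std::vector<double>& b,
    int numProcs) {
  std::vector<std::map<int, double>> local(numProcs);
  for (std::size_t i = 0; i < b.size(); ++i) {
    // Row i of b goes to the process where it will first be used.
    local[diagOwner[i]][static_cast<int>(i)] = b[i];
  }
  return local;
}
```

       Each value of b is moved exactly once, which is what makes a single
       import sufficient.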


 Observations:

 1)  Although this algorithm is recursive, a non-recursive implementation

     might be cleaner.  If it is done recursively, it should be done in place,

     i.e. any data movement of the matrix itself should have been done in

     advance.

 2)  There are two choices for the basic paradigm for parallelism.  Consider

     a two-level bisection of the matrix, yielding seven tasks or diagonal

     blocks:  T0, T1, T01, T2, T3, T23 and T0123.  In both paradigms,

     T0, T1, T2 and T3 would each be handled by a single process.  Up until

     now, we have been assuming

     that T01 would be handled by processes 0 and/or 1 while T23 would be

     handled by processes 2 and/or 3.  The other option is to arrange the

     tasks/diagonal blocks as follows:  T0, T1, T2, T3, T01, T23, T0123 and

     treat the last three blocks:  T01, T23 and T0123 as a single block to be

     handled by all four processes.  This second paradigm includes an

     additional synchronization, but may allow a better partitioning of

     the remaining matrix because the resulting Schur complement is now

     known.   This improved partitioning will also improve the refactorization

     (i.e. pivotless factorization).  The second paradigm may also allow for

     better load balancing.  For example, after using recursive minimum

     degree bi-section (or any other scheme) to partition the matrix, one could

     run a peephole optimization pass that would look for individual blocks

     that could be moved from the largest partition to a smaller one.  Finally,

     if it is clear that a given partition is going to be the slowest, it might

     make sense to shift some rows/columns off of it into the splitter just

     for load balancing.
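
     A minimal sketch of how the seven tasks arise from a two-level bisection
     over four processes.  Task and BuildTasks are hypothetical names; the
     naming scheme (concatenated process numbers) is only unambiguous for
     fewer than ten processes.  A post-order walk of the bisection tree gives
     the first ordering quoted above; the second paradigm would instead merge
     every task that spans more than one process into a single block.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Task {
  std::string name;  // e.g. "T01"
  int firstProc;     // first process in the group attached to this block
  int lastProc;      // last process in that group (inclusive)
};

// Post-order traversal of the bisection tree over processes [first, last]:
// both children are emitted before the separator block that joins them.
void BuildTasks(int first, int last, std::vector<Task>& tasks) {
  if (first < last) {
    const int mid = first + (last - first) / 2;
    BuildTasks(first, mid, tasks);
    BuildTasks(mid + 1, last, tasks);
  }
  std::string name = "T";
  for (int p = first; p <= last; ++p) name += std::to_string(p);
  tasks.push_back({name, first, last});
}
```

     For p processes this produces 2p - 1 tasks: p leaves plus p - 1
     separator blocks.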

 3)  It seems possible that Merikos would be a cleaner code if rows

     which are shared by multiple processes are split such that each row

     resides entirely within a given process.


 4)  Support for pivotless refactorization is important.

 5)  There is no mention of the required row and column permutations.

 6)  Amesos_Merikos only needs to support the Amesos_Component interface if

     it will call itself recursively on the subblocks.

 7)  Perhaps Amesos_Component.h should be an added interface.  Instead

     of replacing Amesos_BaseSolver.h, maybe it should add functionality

     to it.

 */

 // @HEADER

 // ***********************************************************************

 //

 //                Amesos: Direct Sparse Solver Package

 //                 Copyright (2004) Sandia Corporation

 //

 // Under terms of Contract DE-AC04-94AL85000, there is a non-exclusive

 // license for use of this work by or on behalf of the U.S. Government.

 //

 // This library is free software; you can redistribute it and/or modify

 // it under the terms of the GNU Lesser General Public License as

 // published by the Free Software Foundation; either version 2.1 of the

 // License, or (at your option) any later version.

 //

 // This library is distributed in the hope that it will be useful, but

 // WITHOUT ANY WARRANTY; without even the implied warranty of

 // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU

 // Lesser General Public License for more details.

 //

 // You should have received a copy of the GNU Lesser General Public

 // License along with this library; if not, write to the Free Software

 // Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301

 // USA

 // Questions? Contact Michael A. Heroux (maherou@sandia.gov)

 //

 // ***********************************************************************

 // @HEADER


 #ifndef _AMESOS_MERIKOS_H_

 #define _AMESOS_MERIKOS_H_


 #include "Amesos_ConfigDefs.h"

 #include "Amesos_BaseSolver.h"

 #include "Epetra_LinearProblem.h"

 #include "Epetra_Time.h"

 #ifdef EPETRA_MPI

 #include "Epetra_MpiComm.h"

 #else

 #include "Epetra_Comm.h"

 #endif

 #include "Epetra_CrsGraph.h"

 #include <vector>


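 class Amesos_Merikos: public Amesos_BaseSolver {


 public:


   // Amesos_Merikos constructor.
   Amesos_Merikos(const Epetra_LinearProblem& LinearProblem );


   // Amesos_Merikos destructor.
   ~Amesos_Merikos(void);


   int RedistributeA() ;

   int ConvertToScalapack() ;

   int PerformNumericFactorization() ;

   // Performs SymbolicFactorization on the matrix A.
   int SymbolicFactorization() ;


   // Performs NumericFactorization on the matrix A.
   int NumericFactorization() ;


   // Solves L X = B.
   int LSolve();


   // Solves U X = B.
   int USolve();


   // Solves A X = B.
   int Solve();


   // Get a pointer to the Problem.
   const Epetra_LinearProblem *GetProblem() const { return(Problem_); }


   // Returns true if MERIKOS can handle this matrix shape.
   bool MatrixShapeOK() const ;


   // Controls whether to solve A X = B or A^T X = B.
   int SetUseTranspose(bool UseTranspose) {UseTranspose_ = UseTranspose; return(0);}


   // Returns the current UseTranspose setting.
   bool UseTranspose() const {return(UseTranspose_);}


   // Returns a reference to the Epetra_Comm communicator associated with this matrix.
   const Epetra_Comm & Comm() const {return(GetProblem()->GetOperator()->Comm());}


   // Updates internal variables.
   int SetParameters( Teuchos::ParameterList &ParameterList )  ;


   // Returns the number of symbolic factorizations performed by this object.
   int NumSymbolicFact() const { return( NumSymbolicFact_ ); }


   // Returns the number of numeric factorizations performed by this object.
   int NumNumericFact() const { return( NumNumericFact_ ); }


   // Returns the number of solves performed by this object.
   int NumSolve() const { return( NumSolve_ ); }


   // Print timing information.
   void PrintTiming();


   // Print information about the factorization and solution phases.
   void PrintStatus();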

  protected:


   bool UseTranspose_;

   const Epetra_LinearProblem * Problem_;


   Epetra_CrsMatrix *L;

   Epetra_CrsMatrix *U;


   bool PrintTiming_;

   bool PrintStatus_;

   bool ComputeVectorNorms_;

   bool ComputeTrueResidual_;


   int verbose_;

   int debug_;


   // Internal timing data, copied from MUMPS

   double ConTime_;                        // time to convert to MERIKOS format

   double SymTime_;                        // time for symbolic factorization

   double NumTime_;                        // time for numeric factorization

   double SolTime_;                        // time for solution

   double VecTime_;                        // time to redistribute vectors

   double MatTime_;                        // time to redistribute matrix


   int NumSymbolicFact_;

   int NumNumericFact_;

   int NumSolve_;


   Epetra_Time * Time_;


 //

 //  These allow us to use the ScaLAPACK-based Merikos code

 //

   Epetra_Map *ScaLAPACK1DMap_ ;          //  Points to a 1D Map which matches a ScaLAPACK 1D

                                          //  blocked (not block cyclic) distribution

   Epetra_CrsMatrix *ScaLAPACK1DMatrix_ ; //  Points to a matrix in a ScaLAPACK 1D

                                          //  blocked (not block cyclic) distribution

   Epetra_Map *VectorMap_ ;               //  Points to a Map for vectors X and B

   std::vector<double> DenseA_;                //  The data in a ScaLAPACK 1D blocked format

   std::vector<int> Ipiv_ ;                    //  ScaLAPACK pivot information

   int NumOurRows_ ;

   int NumOurColumns_ ;


   //

   //  Control of the data distribution

   //

   bool TwoD_distribution_;  // True if 2D data distribution is used

   int grid_nb_;             // Row and Column blocking factor (only used in 2D distribution)

   int mypcol_;              // Process column in the ScaLAPACK2D grid

   int myprow_;              // Process row in the ScaLAPACK2D grid

   Epetra_CrsMatrix* FatOut_;


   //

   //  Blocking factors (For both 1D and 2D data distributions)

   //

   int nb_;

   int lda_;


   int iam_;

   int nprow_;

   int npcol_;

   int NumGlobalElements_;

   int m_per_p_;


 };  // End of  class Amesos_Merikos

 #endif /* _AMESOS_MERIKOS_H_ */
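
The class declares LSolve(), USolve() and Solve() and keeps the factors in the members L and U. The sketch below assumes that Solve() amounts to a forward solve with L followed by a back solve with U; the listing itself does not confirm this. It uses small dense matrices in plain C++ rather than Epetra_CrsMatrix, and the names Dense, ForwardSolve and BackSolve are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense stand-in for the factors (row-major, square).
using Dense = std::vector<std::vector<double>>;

// Forward substitution: solves L y = b for lower-triangular L
// with a nonzero diagonal.
std::vector<double> ForwardSolve(const Dense& L, const std::vector<double>& b) {
  std::vector<double> y(b.size());
  for (std::size_t i = 0; i < b.size(); ++i) {
    double s = b[i];
    for (std::size_t j = 0; j < i; ++j) s -= L[i][j] * y[j];
    y[i] = s / L[i][i];
  }
  return y;
}

// Back substitution: solves U x = y for upper-triangular U
// with a nonzero diagonal.
std::vector<double> BackSolve(const Dense& U, const std::vector<double>& y) {
  std::vector<double> x(y.size());
  for (std::size_t i = y.size(); i-- > 0;) {
    double s = y[i];
    for (std::size_t j = i + 1; j < y.size(); ++j) s -= U[i][j] * x[j];
    x[i] = s / U[i][i];
  }
  return x;
}
```

With A = L U, calling BackSolve(U, ForwardSolve(L, b)) returns the solution of A x = b.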
