Amesos Package Browser (Single Doxygen Collection)  Development
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Groups Pages
Amesos_Merikos.h
Go to the documentation of this file.
1 This file is out of date. It has not been refactored to use Amesos_Status.
2 
3 /*
4 Task list:
5  Amesos_Merikos.h
6  Amesos_Merikos.cpp
7  Partition the matrix - store L as L^T?
8  Build tree
9  Initial data redistribution
10  Change row and column ownership (pass them up to the parent)
11  Amesos_Component_Solver.h
12  Amesos_BTF.h
13  Amesos_BTF.cpp
14 
15 
16 
17  Communications issues/challenges:
18  ** Redistributing the original matrix to the arrowhead form that we need, options:
19  1) Two matrices: L^T and U
20  2) One matrix: U | L^T
21  3) Intermediate "fat" matrix - the way that I did Scalapack
22 
23  ** Adding the Schur complements SC_0 and SC_1 to the original
24  trailing matirx A_2, owned by the parent
25  1) Precompute the final size and shape of A_2 + SC_0 + SC_1
26  2) Perform A_2 + SC_0 + SC_1 in an empty matrix of size n by n
27  and extract the non-empty rows.
28  CHALLENGES:
29  A) Only process 0/1 knows the size and map for SC_0/SC_1
30  B) It would be nice to allow SC_0 and SC_1 to be sent as soon as
31  they are available
32  C) It would be nice to have just one copy of the matrix on the
33  parent. Hence, it would be nice to know the shape of
34  A_2 + SC_0 + SC_1 in advance.
35  D) An import would do the trick provided that we know both maps
36  in advance. But, neither map is available to us in advance.
37  The original map (which has SC_0 on process 0 and SC_1 on
38  process 1) is not known
39  QUESTION:
40  Should the maps be in some global address space or should they be
41  in a local address space?
42  I'd like to keep them in the global address space as long as possible,
43  but we can't do the import of the A_2 + SC_0 + SC_1 in a global
44  address space because that would require a map that changes at each
45 
46  ** Redistributing the right hand side vector, b
47  If we create a map that reflects the post-pivoting reality, assigning
48  each row of U and each column of L to the process that owns the diagonal
49  entry, we can redistribute the right hand side vector, b, to the
50  processes where the values of b will first be used, in a single, efficient,
51  import operation.
52 
53 
54 Observations:
55 1) Although this algorithm is recursive, a non-recursive implementation
56  might be cleaner. If it is done recursively, it should be done in place,
57  i.e. any data movement of the matrix itself should have been done in
58  advance.
59 2) There are two choices for the basic paradigm for parallelism. Consider
60  a two level bisection of the matrix, yielding seven tasks or diaganol
61  blocks:: T0, T1, T01, T2, T3, T23 and T0123. In both paradigms,
62  T0, T1, T2 and T3 would each
63  be handled by a single process. Up until now, we have been assuming
64  that T01 would be handled by processes 0 and/or 1 while T23 would be
65  handled by processes 2 and/or 3. The other option is to arrange the
66  tasks/diagonal blocks as follows: T0, T1, T2, T3, T01, T23, T0123 and
67  treat the last three blocks: T01, T23 and T0123 as a single block to be
68  handled by all four processes. This second paradigm includes an
69  additional synchronization, but may allow a better partitioning of
70  the remaining matrix because the resulting schur complement is now
71  known. This improved partitioning will also improve the refactorization
72  (i.e. pivotless factorization). The second paradigm may also allow for
73  better load balancing. For example, after using recursive minimum
74  degree bi-section (or any other scheme) to partition the matrix, one could
75  run a peephole optimization pass that would look for individuals blocks
76  that could be moved from the largest partition to a smaller one. Finally,
77  if it is clear that a given partition is going to be the slowest, it might
78  make sense to shift some rows/columns off of it into the splitter just
79  for load balancing.
80 3) It seems possible that Merikos would be a cleaner code if rows
81  which are shared by multiple processes are split such that each row
82  resides entirely within a given process.
83 
84 4) Support for pivotless refactorization is important.
85 5) There is no mention of the required row and column permutations.
86 6) Amesos_Merikos only needs to support the Amesos_Component interface if
87  it will call itself recursively on the subblocks.
88 7) Perhaps Amesos_Component.h should be an added interface. Instead
89  of replacing Amesos_BaseSolver.h, maybe it should add functionality
90  to it.
91 */
92 // @HEADER
93 // ***********************************************************************
94 //
95 // Amesos: Direct Sparse Solver Package
96 // Copyright (2004) Sandia Corporation
97 //
98 // Under terms of Contract DE-AC04-94AL85000, there is a non-exclusive
99 // license for use of this work by or on behalf of the U.S. Government.
100 //
101 // This library is free software; you can redistribute it and/or modify
102 // it under the terms of the GNU Lesser General Public License as
103 // published by the Free Software Foundation; either version 2.1 of the
104 // License, or (at your option) any later version.
105 //
106 // This library is distributed in the hope that it will be useful, but
107 // WITHOUT ANY WARRANTY; without even the implied warranty of
108 // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
109 // Lesser General Public License for more details.
110 //
111 // You should have received a copy of the GNU Lesser General Public
112 // License along with this library; if not, write to the Free Software
113 // Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301
114 // USA
115 // Questions? Contact Michael A. Heroux (maherou@sandia.gov)
116 //
117 // ***********************************************************************
118 // @HEADER
119 
120 #ifndef _AMESOS_MERIKOS_H_
121 #define _AMESOS_MERIKOS_H_
122 
123 #if defined(Amesos_SHOW_DEPRECATED_WARNINGS)
124 #ifdef __GNUC__
125 #warning "The Amesos package is deprecated"
126 #endif
127 #endif
128 
129 #include "Amesos_ConfigDefs.h"
130 #include "Amesos_BaseSolver.h"
131 #include "Epetra_LinearProblem.h"
132 #include "Epetra_Time.h"
133 #ifdef EPETRA_MPI
134 #include "Epetra_MpiComm.h"
135 #else
136 #include "Epetra_Comm.h"
137 #endif
138 #include "Epetra_CrsGraph.h"
139 
140 
142 
161 
162 public:
163 
165 
170  Amesos_Merikos(const Epetra_LinearProblem& LinearProblem );
171 
173 
175  ~Amesos_Merikos(void);
177 
179 
181 
187  int RedistributeA() ;
188  int ConvertToScalapack() ;
190  int SymbolicFactorization() ;
191 
193 
217  int NumericFactorization() ;
218 
220 
240  int LSolve();
241 
242 
244 
264  int USolve();
265 
267 
296  int Solve();
297 
298 
299 
301 
303 
304 
306  const Epetra_LinearProblem *GetProblem() const { return(Problem_); };
307 
309 
312  bool MatrixShapeOK() const ;
313 
315 
318 
320  bool UseTranspose() const {return(UseTranspose_);};
321 
323  const Epetra_Comm & Comm() const {return(GetProblem()->GetOperator()->Comm());};
324 
326 
329  int SetParameters( Teuchos::ParameterList &ParameterList ) ;
330 
332  int NumSymbolicFact() const { return( NumSymbolicFact_ ); }
333 
335  int NumNumericFact() const { return( NumNumericFact_ ); }
336 
338  int NumSolve() const { return( NumSolve_ ); }
339 
341  void PrintTiming();
342 
344  void PrintStatus();
345 
347 
348  protected:
349 
352 
355 
360 
361  int verbose_;
362  int debug_;
363 
364  // some timing internal, copied from MUMPS
365  double ConTime_; // time to convert to MERIKOS format
366  double SymTime_; // time for symbolic factorization
367  double NumTime_; // time for numeric factorization
368  double SolTime_; // time for solution
369  double VecTime_; // time to redistribute vectors
370  double MatTime_; // time to redistribute matrix
371 
374  int NumSolve_;
375 
377 
378 
379 
380 //
381 // These allow us to use the Scalapack based Merikos code
382 //
383  Epetra_Map *ScaLAPACK1DMap_ ; // Points to a 1D Map which matches a ScaLAPACK 1D
384  // blocked (not block cyclic) distribution
385  Epetra_CrsMatrix *ScaLAPACK1DMatrix_ ; // Points to a ScaLAPACK 1D
386  // blocked (not block cyclic) distribution
387  Epetra_Map *VectorMap_ ; // Points to a Map for vectors X and B
388  std::vector<double> DenseA_; // The data in a ScaLAPACK 1D blocked format
389  std::vector<int> Ipiv_ ; // ScaLAPACK pivot information
392 
393 
394  //
395  // Control of the data distribution
396  //
397  bool TwoD_distribution_; // True if 2D data distribution is used
398  int grid_nb_; // Row and Column blocking factor (only used in 2D distribution)
399  int mypcol_; // Process column in the ScaLAPACK2D grid
400  int myprow_; // Process row in the ScaLAPACK2D grid
402 
403  //
404  // Blocking factors (For both 1D and 2D data distributions)
405  //
406  int nb_;
407  int lda_;
408 
409 int iam_;
410 int nprow_;
411 int npcol_;
414 
415 
416 
417 }; // End of class Amesos_Merikos
418 #endif /* _AMESOS_MERIKOS_H_ */
void PrintStatus()
Print information about the factorization and solution phases.
Epetra_Map * ScaLAPACK1DMap_
int NumericFactorization()
Performs NumericFactorization on the matrix A.
Epetra_Map * VectorMap_
int LSolve()
Solves L X = B.
int SetUseTranspose(bool UseTranspose)
SetUseTranpose() controls whether to compute AX=B or ATX = B.
Epetra_Time * Time_
int NumSolve() const
Returns the number of solves performed by this object.
int SymbolicFactorization()
Performs SymbolicFactorization on the matrix A.
const Epetra_Comm & Comm() const
Returns a pointer to the Epetra_Comm communicator associated with this matrix.
Epetra_CrsMatrix * ScaLAPACK1DMatrix_
int PerformNumericFactorization()
bool MatrixShapeOK() const
Returns true if MERIKOS can handle this matrix shape.
const Epetra_LinearProblem * GetProblem() const
Get a pointer to the Problem.
Epetra_CrsMatrix * L
bool UseTranspose() const
Returns the current UseTranspose setting.
~Amesos_Merikos(void)
Amesos_Merikos Destructor.
void PrintTiming()
Print timing information.
std::vector< double > DenseA_
int NumSymbolicFact() const
Returns the number of symbolic factorizations performed by this object.
Amesos_Status: Container for some status variables.
Definition: Amesos_Status.h:26
const Epetra_LinearProblem * Problem_
int USolve()
Solves U X = B.
int RedistributeA()
Performs SymbolicFactorization on the matrix A.
std::vector< int > Ipiv_
Amesos_Merikos(const Epetra_LinearProblem &LinearProblem)
Amesos_Merikos Constructor.
int Solve()
Solves A X = B.
Epetra_CrsMatrix * FatOut_
int NumNumericFact() const
Returns the number of numeric factorizations performed by this object.
Amesos_Merikos: A parallel divide and conquer solver.
Epetra_CrsMatrix * U
Amesos_BaseSolver: A pure virtual class for direct solution of real-valued double-precision operators...
int ConvertToScalapack()
int SetParameters(Teuchos::ParameterList &ParameterList)
Updates internal variables.