"Hello world!" initialization.
The Tpetra package provides next-generation distributed sparse linear algebra. It includes sparse matrices, vectors, and other linear algebra objects, along with computational kernels. This lesson shows the MPI (or non-MPI) initialization you need to do in order to start using Tpetra. The initialization procedure differs slightly, depending on whether you are writing a code from scratch, or introducing Tpetra into an existing code base. We will give example codes and discussion for the following three use cases: a code that uses MPI only through Trilinos, an existing code that uses MPI directly, and a code that does not use multiple MPI processes.
This section explains how to set up the distributed-memory parallel environment for using Tpetra in a code that uses MPI only through Trilinos. If you want to introduce Tpetra into an existing MPI application, please see the next section. This example works whether or not Trilinos was built with MPI support.
Tpetra was written for distributed-memory parallel programming. It uses MPI (the Message Passing Interface) for this. However, Tpetra will work correctly whether or not you have built Trilinos with MPI support. It does so by interacting with MPI through an interface called Teuchos::Comm. (If you are familiar with Epetra, this interface is analogous to Epetra_Comm.) If MPI is enabled, then this wraps an MPI_Comm. Otherwise, this is a "serial communicator" with one process, analogous to MPI_COMM_SELF.
Tpetra also provides an MPI initialization interface, Tpetra::ScopeGuard. If you built Trilinos with MPI enabled, but have not yet initialized MPI, ScopeGuard's constructor calls MPI_Init for you, and its destructor calls MPI_Finalize for you. The following code example shows how to initialize MPI (if available) and get a Teuchos::Comm corresponding to MPI_COMM_WORLD. The example works whether or not Trilinos was built with MPI support. If you prefer a C-style initialization and finalization interface, you may use Tpetra::initialize and Tpetra::finalize instead.
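A minimal sketch of this use case might look like the following. It assumes the Tpetra_Core.hpp header and the Tpetra::getDefaultComm() helper, which returns Tpetra's default communicator as an RCP<const Teuchos::Comm<int> >.

    #include <Tpetra_Core.hpp>
    #include <iostream>

    int main (int argc, char* argv[]) {
      // The constructor calls MPI_Init if Trilinos was built with MPI
      // and MPI has not yet been initialized; the destructor calls
      // MPI_Finalize when tpetraScope falls out of scope.
      Tpetra::ScopeGuard tpetraScope (&argc, &argv);
      {
        // Get Tpetra's default communicator.  With MPI, this wraps
        // MPI_COMM_WORLD; without MPI, it is a one-process communicator.
        Teuchos::RCP<const Teuchos::Comm<int> > comm = Tpetra::getDefaultComm ();
        if (comm->getRank () == 0) {
          std::cout << "Hello world! There are " << comm->getSize ()
                    << " process(es)." << std::endl;
        }
      }
      // tpetraScope's destructor runs here, before main() returns.
      return 0;
    }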
Tpetra also works fine in an existing MPI code. For this example, we assume that your code initializes MPI on its own by calling MPI_Init, and calls MPI_Finalize at the end. It also must get an MPI_Comm (an MPI communicator) somewhere, either by using a predefined communicator such as MPI_COMM_WORLD, or by creating a new one.
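As a sketch of this use case, assuming an MPI-enabled build and the Teuchos_DefaultMpiComm.hpp header, you can wrap your existing MPI_Comm in a Teuchos::MpiComm<int>, which implements the Teuchos::Comm interface:

    #include <mpi.h>
    #include <Teuchos_DefaultMpiComm.hpp>  // Teuchos::MpiComm
    #include <iostream>

    int main (int argc, char* argv[]) {
      // Your code initializes and finalizes MPI itself.
      MPI_Init (&argc, &argv);
      {
        MPI_Comm yourComm = MPI_COMM_WORLD;  // or any communicator you created
        // Wrap the raw MPI_Comm so that Trilinos objects can use it.
        Teuchos::RCP<const Teuchos::Comm<int> > comm =
          Teuchos::rcp (new Teuchos::MpiComm<int> (yourComm));
        if (comm->getRank () == 0) {
          std::cout << "Total number of processes: " << comm->getSize ()
                    << std::endl;
        }
      }
      MPI_Finalize ();
      return 0;
    }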
The first code example in this lesson uses Tpetra::ScopeGuard to initialize MPI. In a non-MPI build of Trilinos, it does nothing, so if you built Trilinos with MPI disabled, you should feel free to use Tpetra::ScopeGuard in your main() function anyway. However, if you are using a build of Trilinos that has MPI enabled, but you do not want to use multiple MPI processes in your application, you may use the three-argument version of Tpetra::ScopeGuard's constructor, which takes an MPI_Comm argument. Just pass in MPI_COMM_SELF to force Tpetra not to use multiple processes. The following example illustrates this.
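A minimal sketch of that use case (it assumes an MPI-enabled build, since MPI_COMM_SELF comes from mpi.h, and the three-argument ScopeGuard constructor described above):

    #include <mpi.h>
    #include <Tpetra_Core.hpp>

    int main (int argc, char* argv[]) {
      // Pass MPI_COMM_SELF so that Tpetra's default communicator
      // contains only the calling process.
      Tpetra::ScopeGuard tpetraScope (&argc, &argv, MPI_COMM_SELF);
      {
        Teuchos::RCP<const Teuchos::Comm<int> > comm = Tpetra::getDefaultComm ();
        // comm->getSize() is 1 here, even though Trilinos has MPI enabled.
      }
      return 0;
    }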
RCP stands for "reference-counted pointer." It lives in the Teuchos package of Trilinos, and is Trilinos' version of std::shared_ptr or boost::shared_ptr. (There are both historical and technical reasons why we use our own class instead of one of those.) For more details, please refer to the complete reference for the Teuchos memory management classes.
In brief: RCP lets you have the benefits of pointers (the lightweight sharing of data) without the headaches and risks (managing ownership and deallocation). It behaves like a pointer in terms of syntax, but handles deallocation for you. RCP is templated on the type of object to which it points. For example, RCP<T> x works very much like T* x (a pointer to a T), and RCP<const T> y works like const T* y (that is, a pointer to a const T). The dereference (*) and method call (->) operators work just like they do with regular pointers.
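For example, here is a minimal sketch, using std::string as the pointed-to type (any class type would do):

    #include <Teuchos_RCP.hpp>
    #include <iostream>
    #include <string>

    int main () {
      // rcp() wraps a newly allocated object; the RCP now owns it.
      Teuchos::RCP<std::string> s = Teuchos::rcp (new std::string ("Hello"));

      // operator* and operator-> behave just like a raw pointer's.
      std::cout << *s << " has " << s->size () << " characters." << std::endl;

      // A pointer-to-const view of the same object.
      Teuchos::RCP<const std::string> cs = s;
      std::cout << cs->size () << std::endl;

      return 0;  // s and cs fall out of scope; the string is deleted once.
    }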
The "reference-counted" part of RCP means that it automatically handles deallocation, if appropriate. Copying the RCP increments the reference count; allowing it to fall out of scope or assigning Teuchos::null to it decrements the reference count. When the reference count reaches zero, the deallocation function is called. By default, it calls delete
, but you can control this behavior. Reference counting allows you to share pointers between different parts of your code, without needing to worry about what part of the code deallocates the object, or when it gets deallocated.
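A short sketch of the counting behavior, using Teuchos::null as described above:

    #include <Teuchos_RCP.hpp>

    int main () {
      Teuchos::RCP<int> a = Teuchos::rcp (new int (42));  // reference count: 1
      Teuchos::RCP<int> b = a;   // copying increments the count to 2
      a = Teuchos::null;         // count drops to 1; the int is still alive
      b = Teuchos::null;         // count reaches 0; delete is called here
      return 0;
    }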
The Teuchos::rcp function is a "nonmember constructor" template function that returns a newly created RCP of something. Using this function to create an RCP saves some typing, and also may improve exception safety.
Teuchos::Comm is Trilinos' interface to distributed-memory parallel communication. It lives in the Teuchos package of Trilinos. Comm is an abstract base class. The Teuchos::MpiComm and Teuchos::SerialComm classes implement this interface. As the name indicates, MpiComm implements Comm by using MPI calls. SerialComm implements Comm without MPI, as a "communicator" with only one process. (This is more or less equivalent to MPI_COMM_SELF, except without actually using MPI.) All of these classes are templated on the integer type used for MPI function calls. Currently, this is always int, so you should always use int as the template parameter.
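For instance, the single-process implementation can be created without MPI at all. This sketch assumes the Teuchos_DefaultSerialComm.hpp header; an MpiComm wrapping a raw MPI_Comm appeared in an earlier sketch.

    #include <Teuchos_DefaultSerialComm.hpp>  // Teuchos::SerialComm
    #include <Teuchos_RCP.hpp>

    int main () {
      // A one-process "communicator" that does not use MPI at all.
      Teuchos::RCP<const Teuchos::Comm<int> > comm =
        Teuchos::rcp (new Teuchos::SerialComm<int> ());
      // comm->getSize() == 1 and comm->getRank() == 0.
      return 0;
    }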
Since Comm is an abstract base class, you must handle it by pointer or reference. The idiomatic way to create and handle a Comm is by RCP, as an RCP<const Comm<int> >. This expression should be considered Trilinos' version of the MPI_Comm opaque handle. Thus, declaring and filling in an RCP<const Comm<int> > is "Trilinos-speak" for declaring and filling in an MPI_Comm.
The normal way to use Comm is to call the nonmember functions in the Teuchos_CommHelpers.hpp header file. We won't cover this in the lesson today.