Overview of Iterative Linear System Solver Packages
Chapter 3 The Packages
3.1 Aztec
Aztec is a parallel library of iterative solution methods and preconditioners.
Its main aim is to provide tools that facilitate the handling of distributed
data structures; to this end there is a collection of data transformation
tools, as well as query functions on the data structures.
3.1.1 Basic information
Available from Web site at http://www.cs.sandia.gov/CRF/aztec1.html;
registration required
Author(s) Scott A. Hutchinson, John N. Shadid, Ray S. Tuminaro
Latest version 1.1, October 1995
Status Unknown
3.1.2 Contents
Iterative methods CG, GMRES, CGS, TFQMR, BiCGstab, Direct solve
(on one processor only).
Preconditioners Block Jacobi with ILU on the subblocks; also
Additive Schwarz, with an overlap limited to one.
Data structures Distributed variants of Compressed Row and Variable
Block Row; see below.
Manual User's Guide, 43 pages
Example codes Yes
3.1.3 Parallelism and data layout
Aztec can handle arbitrary assignments of matrix rows to processors. Since
this quite general arrangement makes it harder to construct the local numbering
of the matrices (see section 2.2.2), the user can supply the matrix in
global numbering, and a preprocessing routine then performs the localisation.
However, the user does have to supply all rows on the appropriate processor.
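To illustrate what the localisation amounts to, here is a minimal sketch (with hypothetical names; this is not Aztec's actual interface) of renumbering global column indices into a local numbering, with off-processor columns appended after the locally owned ones:

    #include <map>
    #include <vector>

    // Sketch: convert the global column indices of the locally owned rows
    // into a local numbering. Owned columns get indices 0..n_local-1; every
    // off-processor ("external") column is appended after them. This mimics
    // the localisation step, not Aztec's actual routines.
    void localise(std::vector<int> &col_idx,       // global indices, updated in place
                  const std::vector<int> &owned,   // global indices of owned rows/cols
                  std::vector<int> &external)      // output: global ids of external cols
    {
        std::map<int, int> glob2loc;
        for (std::size_t i = 0; i < owned.size(); ++i)
            glob2loc[owned[i]] = static_cast<int>(i);

        for (int &c : col_idx) {
            auto it = glob2loc.find(c);
            if (it == glob2loc.end()) {            // first occurrence of an external column
                int loc = static_cast<int>(owned.size() + external.size());
                external.push_back(c);
                it = glob2loc.insert({c, loc}).first;
            }
            c = it->second;                        // overwrite global index with local one
        }
    }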
3.1.4 Other
Aztec requires routines from Blas,
Lapack, Linpack, Y12m.
3.1.5 Installation
As part of the installation, Aztec requests auxiliary routines from netlib,
and places them in a subdirectory to be compiled later. This precludes the
use of native Blas or Lapack libraries; there is no hint on how to
substitute these.

Aztec uses a distributed version of the MSR (Modified Sparse Row) storage format,
which is close to the Compressed Row Storage format. I find this an unfortunate
choice:
- I do not see any advantage over the CRS format.
- While conversion to and from CRS is straightforward, the arrays need to be longer by one position. This means that interfacing Aztec to a code using CRS entails deallocating and reallocating the arrays (see the sketch after this list).
- The extra storage location in the real array is not used; the location in the integer array duplicates the scalar parameter giving the number of rows.
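To make the second and third points concrete, here is a sketch of a CRS-to-MSR conversion. It follows the common MSR convention (diagonal stored first, one unused real location, pointer array starting at position n+1); I am illustrating the format, not reproducing Aztec's code:

    #include <vector>

    // Sketch of CRS -> MSR conversion for an n x n matrix with a full diagonal.
    // CRS: row_ptr (n+1), col_idx (nnz), a (nnz), 0-based indices.
    // MSR (common convention, assumed here): val and bindx of length nnz+1;
    // val[0..n-1] holds the diagonal, val[n] is unused, and bindx[0..n] points
    // into the off-diagonal entries stored from position n+1 on. Note that
    // bindx[0] = n+1 merely re-encodes the number of rows.
    void crs_to_msr(int n,
                    const std::vector<int> &row_ptr,
                    const std::vector<int> &col_idx,
                    const std::vector<double> &a,
                    std::vector<int> &bindx,
                    std::vector<double> &val)
    {
        int nnz = row_ptr[n];
        bindx.assign(nnz + 1, 0);    // one position longer than the CRS arrays
        val.assign(nnz + 1, 0.0);    // val[n] will simply stay unused

        int next = n + 1;            // first free slot for off-diagonal entries
        bindx[0] = next;
        for (int i = 0; i < n; ++i) {
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
                if (col_idx[k] == i) {
                    val[i] = a[k];            // diagonal entry
                } else {
                    bindx[next] = col_idx[k]; // off-diagonal: store index and value
                    val[next] = a[k];
                    ++next;
                }
            }
            bindx[i + 1] = next;              // row i's off-diagonals end here
        }
    }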
3.2 BlockSolve95
The main focus of the BlockSolve package is the implementation of SSOR
and ILU preconditioners based on a parallel multi-colouring algorithm by
the authors. The algorithm first computes a quotient graph of the matrix
graph by eliminating cliques and identical nodes. Operations on the nodes
of this quotient graph then become of BLAS2/3 type, leading to high performance.
BlockSolve can be used with the Petsc package; see section 3.10.
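The colouring is what provides the parallelism: nodes of one colour are mutually independent, so in an SSOR or ILU sweep they can all be processed simultaneously. As an illustration, here is a minimal greedy multicolouring of a graph given by adjacency lists (sequential, and operating on the plain matrix graph rather than the quotient graph; BlockSolve's own algorithm is parallel):

    #include <vector>

    // Greedy multicolouring of a graph given as adjacency lists. Nodes of
    // equal colour are mutually non-adjacent, so in an SSOR or ILU sweep
    // all unknowns of one colour can be updated in parallel.
    std::vector<int> greedy_colour(const std::vector<std::vector<int>> &adj)
    {
        int n = static_cast<int>(adj.size());
        std::vector<int> colour(n, -1);
        for (int v = 0; v < n; ++v) {
            std::vector<bool> used;   // colours already taken by neighbours
            for (int w : adj[v])
                if (colour[w] >= 0) {
                    if (colour[w] >= (int)used.size())
                        used.resize(colour[w] + 1, false);
                    used[colour[w]] = true;
                }
            int c = 0;
            while (c < (int)used.size() && used[c]) ++c;  // smallest free colour
            colour[v] = c;
        }
        return colour;
    }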
3.2.1 Basic information
Available from Web site at http://www.mcs.anl.gov/blocksolve95/.
Author(s) Mark T. Jones and Paul E. Plassmann
Latest version 3.0, June 1997
Status Being maintained and further developed
3.2.2 Contents
Iterative methods CG, SYMMLQ, GMRES are provided, though these are
not the main focus of the package; BlockSolve can be interfaced to Petsc
for more iterative methods.
Preconditioners diagonal scaling, block Jacobi with blocks corresponding
to the cliques factored out of the quotient graph, incomplete LU and Cholesky.
Data structures Internal, as a C structure.
Access operations on the data structure None; the definition
of the structure is given in the manual.
Operations using the data structure Iterative solution of the
system and of a shifted system, application of the matrix and the preconditioner.
(Note: the separate application of the matrix and the preconditioner
is not documented in the manual.)
Manual Users Manual; 34 pages. This is a complete reference; the
user is advised to use the example codes as templates for applications.
Example codes Yes
3.2.3 Parallelism and data layout
The user has two ways to pass the distributed matrix to BlockSolve.
- Since the definition of the data structure is given in the manual, the user can explicitly construct it.
- The user can supply a compressed row storage matrix, with column indices in the local numbering (section 2.2.2), to the routine BSeasy_A, which yields the matrix structure.
In the second case, the matrix rows need to be consecutively
numbered. In the first case the assignment of rows over the processors
can be arbitrary; the user has to construct the mapping functions between
local and global numberings. There are example codes illustrating this.
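For the first case, the mapping functions could look as follows (a minimal sketch with hypothetical names, not BlockSolve's actual structure fields):

    #include <unordered_map>
    #include <vector>

    // Sketch of the two mapping functions needed when rows are assigned to
    // processors arbitrarily: local -> global is an array lookup, global ->
    // local a hash map. Hypothetical helper, not the BlockSolve API.
    struct NumberingMap {
        std::vector<int> loc2glob;              // loc2glob[i] = global index of local row i
        std::unordered_map<int, int> glob2loc;  // inverse of the above

        explicit NumberingMap(std::vector<int> owned_rows)
            : loc2glob(std::move(owned_rows))
        {
            for (std::size_t i = 0; i < loc2glob.size(); ++i)
                glob2loc[loc2glob[i]] = static_cast<int>(i);
        }
        int to_global(int loc) const { return loc2glob[loc]; }
        int to_local(int glob) const { return glob2loc.at(glob); } // throws if not owned
    };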
3.3 BPKIT2
BPKIT is a package of block preconditioners, that is, factorisation preconditioners
that treat the matrix first on a subblock level, then recursively factor
the blocks on the element level. One of the selling points of BPKIT is
the object-oriented approach to this two-level factorisation.
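As an illustration of the two-level idea, here is a sketch of its simplest instance, block Jacobi with exactly solved pivot blocks; the names are mine and this is not BPKIT's class interface, which additionally offers block SSOR and ILU and various approximate inverses of the pivot blocks:

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <utility>
    #include <vector>

    using Dense = std::vector<std::vector<double>>;

    // Solve B y = r for a small dense pivot block by Gaussian elimination
    // with partial pivoting. (BPKIT would instead factor the block once, or
    // build an explicit or implicit approximate inverse of it.)
    std::vector<double> dense_solve(Dense B, std::vector<double> r)
    {
        std::size_t m = B.size();
        for (std::size_t k = 0; k < m; ++k) {
            std::size_t p = k;                       // find pivot row
            for (std::size_t i = k + 1; i < m; ++i)
                if (std::fabs(B[i][k]) > std::fabs(B[p][k])) p = i;
            std::swap(B[k], B[p]); std::swap(r[k], r[p]);
            for (std::size_t i = k + 1; i < m; ++i) {
                double l = B[i][k] / B[k][k];
                for (std::size_t j = k; j < m; ++j) B[i][j] -= l * B[k][j];
                r[i] -= l * r[k];
            }
        }
        for (std::size_t k = m; k-- > 0;) {          // back substitution
            for (std::size_t j = k + 1; j < m; ++j) r[k] -= B[k][j] * r[j];
            r[k] /= B[k][k];
        }
        return r;
    }

    // Block Jacobi apply: z = M^{-1} r, where M consists of the diagonal
    // blocks of A. blocks[b] is the b-th diagonal block, start[b] its first
    // global index; treating the matrix block-wise first, and only then the
    // block entries element-wise, is the two-level structure.
    std::vector<double> block_jacobi_apply(const std::vector<Dense> &blocks,
                                           const std::vector<int> &start,
                                           const std::vector<double> &r)
    {
        std::vector<double> z(r.size());
        for (std::size_t b = 0; b < blocks.size(); ++b) {
            std::size_t m = blocks[b].size();
            std::vector<double> rb(r.begin() + start[b], r.begin() + start[b] + m);
            std::vector<double> zb = dense_solve(blocks[b], rb);
            std::copy(zb.begin(), zb.end(), z.begin() + start[b]);
        }
        return z;
    }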
3.3.1 Basic information
Available from Web site at http://www.cs.umn.edu/~Echow/bpkit.html/
Author(s) E. Chow and M. A. Heroux
Latest version 2.0, September 1996
Status Maintained
3.3.2 Contents
Iterative methods Flexible GMRES, though this is not the focus of
the package
Preconditioners Block SSOR and ILU, possibly with block fill-in,
and with various methods for explicit and implicit approximation of inverses
of pivot blocks
Data structures Internal, as a C++ class.
Access operations on the data structure Retrieve rows and scalar
information such as the number of nonzeros of the matrix.
Operations using the data structure Multiply and multiply transpose
by the matrix; solve and solve transpose of the preconditioner, both the
whole preconditioner, and the left and right split parts.
Manual Reference manual, 53 pages
Example codes Yes, in Fortran, C, and C++
3.3.3 Installation
The makefile in the app
subdirectory requires editing for the location of MPI and for compiler
options.
The manual is not well written. Many parameters and auxiliary routines
are under-documented.
3.4 GPS: General Purpose Solver
3.4.1 Basic information
Available from
Author(s) Olaf O. Storaasli, Majdi Baddourah, Duc Nguyen
Latest version 03/08/95 (submission to NASA Langley Software
Server)
Status Approval for downloading the software did not arrive in
time for this report.
3.5 IML++
The IML++ package consists of C++ implementations of the iterative methods
from the templates project [2]. It relies on
the user supplying Matrix, Vector, and Preconditioner classes that implement
the required operations.
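The flavour is that every method is a function template, instantiated with the user's classes. The following is a simplified sketch in that style, not the literal IML++ source; it assumes (my assumption, not stated in the text) that the user's classes provide A*p, vector addition and subtraction, scalar*vector, a dot() function, and a solve() method on the preconditioner:

    #include <cmath>

    // Sketch of an IML++-style function template: the algorithm is coded
    // against whatever Matrix, Vector, and Preconditioner classes the user
    // supplies. Assumed operations: A*p, v+w, v-w, s*v, dot(v,w), M.solve(r).
    template <class Matrix, class Vector, class Preconditioner, class Real>
    int CG(const Matrix &A, Vector &x, const Vector &b,
           const Preconditioner &M, int &max_iter, Real &tol)
    {
        Vector r = b - A * x;
        Vector z = M.solve(r);
        Vector p = z;
        Real rz = dot(r, z);
        Real bnorm = std::sqrt(dot(b, b));

        for (int it = 1; it <= max_iter; ++it) {
            Vector q = A * p;
            Real alpha = rz / dot(p, q);
            x = x + alpha * p;
            r = r - alpha * q;
            Real rnorm = std::sqrt(dot(r, r));
            if (rnorm <= tol * bnorm) {       // relative residual test
                tol = rnorm / bnorm;
                max_iter = it;
                return 0;                     // converged
            }
            z = M.solve(r);
            Real rz_new = dot(r, z);
            p = z + (rz_new / rz) * p;        // new search direction
            rz = rz_new;
        }
        return 1;                             // no convergence in max_iter steps
    }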
An example implementation of such classes called Sparselib++ is available
from the same authors. It contains uni-processor matrix classes based on
compressed row and column and coordinate storage, a vector class, and Jacobi
and ILU/IC preconditioners for the compressed row and column matrices.
Additionally it contains tools for converting a Harwell-Boeing matrix or
matrix file to these formats.
3.5.1 Basic information
Available from Web site at http://math.nist.gov/iml++/
Author(s) Jack Dongarra, Andrew Lumsdaine, Roldan Pozo and Karin
A. Remington
Latest version 1.2, April 1996
Status Being maintained; IML++ will eventually be superseded
by the Template Numerical Toolkit, a package not currently available.
3.5.2 Contents
Iterative methods BiCG, BiCGstab, CG, CGS, Chebyshev, GMRES, IR,
QMR
Preconditioners n/a
Data structures n/a
Manual Reference guide, 39 pages;
also SparseLib++ Reference Guide, 20 pages.
Example codes no
3.6 Itpack 2C / ItpackV 2D
Itpack is a package of iterative methods. It runs sequentially, but ItpackV
is an optimised version for vector machines such as the Cray Y-MP.
Itpack features adaptive determination of matrix bounds, to be used
in accurate stopping tests and for the Chebyshev semi-iteration.
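The bounds matter because the Chebyshev semi-iteration uses only estimates of the extreme eigenvalues and computes no inner products. Here is a sketch of the standard, non-adaptive algorithm for a symmetric positive definite matrix in CRS format, following the usual textbook formulation (e.g. Saad) rather than Itpack's code:

    #include <cstddef>
    #include <vector>

    using Vec = std::vector<double>;

    // y = A*x for a CRS matrix (row_ptr, col_idx, a).
    Vec matvec(const std::vector<int> &row_ptr, const std::vector<int> &col_idx,
               const Vec &a, const Vec &x)
    {
        Vec y(x.size(), 0.0);
        for (std::size_t i = 0; i < x.size(); ++i)
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
                y[i] += a[k] * x[col_idx[k]];
        return y;
    }

    // Chebyshev semi-iteration for SPD A with spectrum in [lmin, lmax].
    // Only the two bounds enter; no inner products are computed, which is
    // why accurately (adaptively) determined bounds are important.
    void chebyshev(const std::vector<int> &row_ptr, const std::vector<int> &col_idx,
                   const Vec &a, Vec &x, const Vec &b,
                   double lmin, double lmax, int iters)
    {
        double theta = (lmax + lmin) / 2, delta = (lmax - lmin) / 2;
        double sigma1 = theta / delta, rho = 1 / sigma1;

        Vec r = matvec(row_ptr, col_idx, a, x);          // r = b - A*x
        for (std::size_t i = 0; i < r.size(); ++i) r[i] = b[i] - r[i];
        Vec d(r.size());
        for (std::size_t i = 0; i < r.size(); ++i) d[i] = r[i] / theta;

        for (int it = 0; it < iters; ++it) {
            for (std::size_t i = 0; i < x.size(); ++i) x[i] += d[i];
            Vec Ad = matvec(row_ptr, col_idx, a, d);
            for (std::size_t i = 0; i < r.size(); ++i) r[i] -= Ad[i];
            double rho_new = 1 / (2 * sigma1 - rho);     // Chebyshev recurrence
            for (std::size_t i = 0; i < d.size(); ++i)
                d[i] = rho_new * rho * d[i] + (2 * rho_new / delta) * r[i];
            rho = rho_new;
        }
    }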
3.6.1 Basic information
Available from Ftp site at ftp://ftp.ma.utexas.edu/pub/CNA/ITPACK;
also on Netlib
Author(s) David R. Kincaid, Thomas C. Oppe, David M. Young
Latest version Itpack 2C: manual dated July 1997, Itpack 2D:
manual dated May 1989
Status Being maintained
3.6.2 Contents
Iterative methods Jacobi Conjugate Gradient, Jacobi Semi-iteration
(i.e., Chebyshev iteration), SOR, SSOR, SSOR CG, Reduced system CG, Reduced
system SI.
Preconditioners n/a; see above
Data structures Itpack 2C: Compressed Row (there are auxiliary
routines to facilitate building the data structure); ItpackV 2D: ellpack
storage.
Manual 22/14 pages (Itpack 2C / ItpackV 2D respectively)
Example codes Yes
3.6.3 Installation
Itpack. There is no makefile or anything like it, but there is really
no need for it either, since a complete installation consists of one file
of library routines and one file of tests. The PROGRAM statement in the
test files had to be edited. The test code was insufficiently documented
for an easy 'same problem but larger' test.

Nspcg. The nspcg routines come in files with undescriptive names such
as nspcg1.f. The nspcg5.f file needed replacement of the timer routine.
3.7 Laspack
LASpack is an object-oriented package of iterative methods, multigrid
solvers, and auxiliary routines for the iterative solution of linear systems.
It does not run in parallel.
There are data structures for vectors, general and square matrices,
and preconditioners; a large number of routines for accessing and manipulating
these objects is available.
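Matrix construction is element-by-element through function calls; schematically it looks as follows, where the function names are hypothetical stand-ins rather than LASpack's actual routines:

    #include <cstddef>
    #include <map>
    #include <utility>

    // Schematic of element-wise construction through function calls, in the
    // style LASpack uses; mat_create and mat_set_entry are hypothetical
    // stand-ins, not LASpack's actual API.
    struct Matrix {
        std::size_t n;
        std::map<std::pair<std::size_t, std::size_t>, double> entries;
    };

    Matrix mat_create(std::size_t n) { return Matrix{n, {}}; }

    void mat_set_entry(Matrix &A, std::size_t i, std::size_t j, double v)
    {
        A.entries[{i, j}] = v;        // each element passed in by a call
    }

    int main()
    {
        // Assemble the 1D Laplacian stencil [-1 2 -1] one entry at a time.
        std::size_t n = 5;
        Matrix A = mat_create(n);
        for (std::size_t i = 0; i < n; ++i) {
            mat_set_entry(A, i, i, 2.0);
            if (i > 0)     mat_set_entry(A, i, i - 1, -1.0);
            if (i + 1 < n) mat_set_entry(A, i, i + 1, -1.0);
        }
    }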
3.7.1 Basic information
Available from Web site at http://www.math.tu-dresden.de/~skalicky/laspack/index.html;
also from Netlib
Author(s) Tomás Skalický
Latest version 1.12.3, January 1996
Status Developed
3.7.2 Contents
Iterative methods Jacobi, SOR, SSOR, Chebyshev iteration, CG, CGN,
GMRES, BiCG, QMR, CGS, BiCGstab, restarted GMRES.
Multigrid methods Conventional and Full multigrid, BPX preconditioning
Preconditioners Jacobi, SSOR, ILU(0)
Data structures Internal, created by passing elements by function
call.
Access operations on the data structure Many
Operations using the data structure Matrix addition and scaling;
matrix-vector multiply; transposition; matrix-inverse vector multiply.
Manual Reference manual, 8+40 pages.
Example codes Yes.
3.8 ParPre
This is an add-on package to Petsc; see section 3.10. It is solely a collection
of parallel preconditioners, to be used with iterative methods either from
Petsc or coded by the user. The data structures are those used by Petsc.
ParPre can be used independently of Petsc, but it does require Petsc to
be installed.
3.8.1 Basic information
Available from Web site at http://www.cs.utk.edu/~eijkhout/
Author(s) Victor Eijkhout and Tony Chan
Latest version 2.0.17
Status Maintained and developed
3.8.2 Contents
Iterative methods None
Preconditioners Additive and multiplicative Schwarz, Generalised
Block SSOR, Schur complement domain decomposition, Algebraic multilevel
methods (including multicolour ILU and algebraic multigrid).
Data structures Internal.
Access operations on the data structure Inherited from Petsc.
Operations using the data structure Solve, no solve transpose.
Manual Reference manual with programming examples; 32 pages.
Example codes Yes
3.8.3 Parallelism and data layout
All methods in ParPre are based on decomposition of the physical domain
into subdomains, with an assignment of one subdomain per processor (or
rather: process). Most methods involve a local solve on the subdomain,
for which any of the non-parallel Petsc preconditioners can be used.
For the methods that do not have all subdomains operate in parallel
(e.g., multiplicative Schwarz as opposed to additive Schwarz), the user
can specify the strategy that determines the sequential ordering of the
domains. The choices are: natural, red-black, and multicolour.
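The distinction is easy to see in a sketch (generic, not ParPre's interface): additive Schwarz computes all subdomain corrections from one residual, so the subdomain solves are independent, whereas multiplicative Schwarz recomputes the residual after each solve, and the ordering strategy fixes the sequence:

    #include <cstddef>
    #include <functional>
    #include <vector>

    using Vec = std::vector<double>;
    // solve_on(s, r) returns the correction from subdomain s for residual r
    // (restriction, local solve, and prolongation rolled into one, for brevity).
    using SubdomainSolve = std::function<Vec(std::size_t, const Vec &)>;
    using Residual = std::function<Vec(const Vec &)>;   // r = b - A*x

    // Additive Schwarz: all corrections come from the same residual, so the
    // loop over subdomains is fully parallel.
    void additive_schwarz(Vec &x, std::size_t ndom,
                          const Residual &residual, const SubdomainSolve &solve_on)
    {
        Vec r = residual(x);
        for (std::size_t s = 0; s < ndom; ++s) {        // independent: parallel
            Vec c = solve_on(s, r);
            for (std::size_t i = 0; i < x.size(); ++i) x[i] += c[i];
        }
    }

    // Multiplicative Schwarz: the residual is refreshed after each solve;
    // 'order' (natural, red-black, multicolour, ...) fixes the sequence.
    void multiplicative_schwarz(Vec &x, const std::vector<std::size_t> &order,
                                const Residual &residual, const SubdomainSolve &solve_on)
    {
        for (std::size_t s : order) {                   // sequential by construction
            Vec r = residual(x);
            Vec c = solve_on(s, r);
            for (std::size_t i = 0; i < x.size(); ++i) x[i] += c[i];
        }
    }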
3.9 PCG
The PCG package has a large number of iterative methods and a few simple
preconditioners. The code is publicly available in a uni-processor