
Parameter Arrays IPARM and RPARM

The user must supply default values for the parameters in IPARM and RPARM by inserting the line

       CALL DFAULT (IPARM,RPARM)

in the program before the call to NSPCG. The user may then assign nondefault values to selected quantities in IPARM and RPARM by inserting the appropriate assignment statements before the call to the iterative routine. Important variables in this package which may change adaptively are EMAX and EMIN (eigenvalue estimates of Q^-1A), OMEGA (overrelaxation parameter for the SOR and SSOR methods), ALPHAB and BETAB (SSOR parameters), and SPECR (estimate of the spectral radius of the SOR iteration matrix). The integer vector IPARM and the real vector RPARM allow the user to control certain parameters which affect the performance of the iterative algorithms. Furthermore, these vectors allow the updated parameters from the automatic adaptive procedures to be communicated back to the user. The IPARM and RPARM parameters are described below.

Description of IPARM parameters

IPARM(1)

(NTEST): The stopping test number, in the range 1 to 10, indicates which stopping test should be used to terminate the iteration. (See Section 10 for a description of the available stopping tests.) The SOR accelerator uses a specialized stopping test, so this parameter is ignored when SOR is called unless NTEST = 6, in which case the exact stopping test is used. [Default: 2]

IPARM(2)

(ITMAX): On input, ITMAX is the maximum number of iterations allowed. On output, ITMAX is the number of iterations performed. [Default: 100]

IPARM(3)

(LEVEL): LEVEL is the output level control switch. Each higher value provides additional information. [Default: 0]

< 0	no output
= 0	fatal error messages only (default)
= 1	warning messages and minimum output
= 2	reasonable summary
= 3	parameter values and informative comments
= 4	approximate solution after every iteration

IPARM(4)

(NOUT): NOUT is the Fortran unit number for output. [Default: 6]

IPARM(5)

(IDGTS): IDGTS is the error analysis switch. An analysis of the final computed solution is made to determine the accuracy. [Default: 0]

< 0	skip error analysis
= 0	compute DIGIT1 and DIGIT2 and store in RPARM (default)
= 1	print DIGIT1 and DIGIT2
= 2	print final approximate solution vector
= 3	print final approximate residual vector
= 4	print both solution and residual vectors

If LEVEL is less than 1, no printing is done. See explanation of DIGIT1 [= RPARM(7)] and DIGIT2 [= RPARM(8)] for more details.

IPARM(6)

(MAXADP): This parameter is the adaptive procedure switch for EMAX. [Default: 1]

= 0	no adapting on EMAX
= 1	adapting on EMAX (default)

IPARM(7)

(MINADP): This parameter is the adaptive procedure switch for EMIN. [Default: 1]

= 0	no adapting on EMIN
= 1	adapting on EMIN (default)

IPARM(8)

(IOMGAD): This parameter is the adaptive procedure switch for OMEGA. [Default: 1]

= 0	no adapting on OMEGA
= 1	adapting on OMEGA (default)

IPARM(9)

(NS1): NS1 is the number of old vectors to be saved for the truncated acceleration methods. [Default: 5]

IPARM(10)

(NS2): NS2 is the frequency of restarting for the restarted acceleration methods. By default, NS2 is set to a large value so that restarting is not done. [Default: 100000]

IPARM(11)

(NS3): Used only in ORTHOMIN, NS3 denotes the size of the largest Hessenberg matrix to be used to estimate the eigenvalues; it should be set to some value such as 40. [Default: 0]

IPARM(12)

(NSTORE): NSTORE indicates the storage mode used. See Section 8 for a description of the storage modes. [Default: 2]

= 1	Primary format
= 2	Symmetric diagonal format (default)
= 3	Nonsymmetric diagonal format
= 4	Symmetric coordinate format
= 5	Nonsymmetric coordinate format

IPARM(13)

(ISCALE): ISCALE is a switch indicating whether or not the matrix should be scaled before iterating and unscaled after iterating. [Default: 0]

= 0	no scaling (default)
= 1	scaling

If scaling is selected, the matrix is scaled to have a unit diagonal, and u and b are scaled accordingly. If NTEST = 6, $\bar u$ is also scaled. If Au=b is the system to be scaled, and D is the diagonal of A, the scaled system is

$$(D^{-\frac{1}{2}} A D^{-\frac{1}{2}}) (D^{\frac{1}{2}} u) = D^{-\frac{1}{2}} b$$

The diagonal elements of the matrix must be positive if scaling is requested. Scaling of the system causes an extra N elements of WKSP to be used, so scaling is not recommended if storage is a consideration.
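The scaling formula can be checked on a small example. The following is a minimal Python sketch, not NSPCG's Fortran implementation; the helper names `scale_system` and `unscale_solution` and the 2x2 test system are assumptions made for illustration.

```python
import math

def scale_system(A, b):
    """Return (A_s, b_s, d_sqrt) with A_s = D^(-1/2) A D^(-1/2), b_s = D^(-1/2) b."""
    n = len(A)
    if any(A[i][i] <= 0.0 for i in range(n)):
        raise ValueError("diagonal entries must be positive for scaling")
    d_sqrt = [math.sqrt(A[i][i]) for i in range(n)]
    A_s = [[A[i][j] / (d_sqrt[i] * d_sqrt[j]) for j in range(n)] for i in range(n)]
    b_s = [b[i] / d_sqrt[i] for i in range(n)]
    return A_s, b_s, d_sqrt

def unscale_solution(u_s, d_sqrt):
    """Recover u from the scaled unknown u_s = D^(1/2) u."""
    return [u_s[i] / d_sqrt[i] for i in range(len(u_s))]

# Hypothetical system with exact solution u = [1, 2].
A = [[4.0, 1.0], [1.0, 9.0]]
b = [6.0, 19.0]
A_s, b_s, d_sqrt = scale_system(A, b)

# The scaled unknown is D^(1/2) u = [2*1, 3*2].
u_s = [2.0, 6.0]
residual = [b_s[i] - sum(A_s[i][j] * u_s[j] for j in range(2)) for i in range(2)]
u = unscale_solution(u_s, d_sqrt)
```

The scaled matrix has a unit diagonal, and the extra vector `d_sqrt` of length N corresponds to the additional N workspace elements mentioned above.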

IPARM(14)

(IPERM): IPERM is a switch indicating whether or not the matrix should be permuted before iterating and unpermuted after iterating. [Default: 0]

= 0	no permuting (default)
= 1	permuting

If a permutation of the matrix is desired, IPERM must be set to 1, and the vector P of the argument list of NSPCG must contain a coloring vector. See Sections 11 and 12 for more details on coloring vectors. If IPERM = 0, the vector P is ignored and can be dimensioned to be of length one. If IPERM = 1, a permutation vector is generated from P and replaces it, while IP contains the inverse permutation vector on output. If Au=b is the system to be permuted and P is the permutation vector, the permuted system is

(PAP^T) (Pu) = Pb

If NTEST = 6, $\bar u$ is permuted in addition to u and b.
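As an illustration of the permuted system, the sketch below builds a permutation from a red-black coloring vector and verifies that (PAP^T)(Pu) = Pb. It is written in Python with 0-based indices (Fortran arrays are 1-based), and the rule used to derive the permutation from the coloring vector (order the unknowns by color, keeping the original order within a color) is an assumption, not a description of NSPCG's internal routine. All function names are hypothetical.

```python
def permutation_from_colors(colors):
    """Order unknowns by color; ties keep their original order (assumed rule)."""
    return sorted(range(len(colors)), key=lambda i: colors[i])

def inverse_permutation(perm):
    """inv[perm[i]] = i, the analogue of the inverse permutation vector."""
    inv = [0] * len(perm)
    for new_index, old_index in enumerate(perm):
        inv[old_index] = new_index
    return inv

def permute_system(A, b, perm):
    """Return P A P^T and P b, where row i of P selects unknown perm[i]."""
    n = len(perm)
    A_p = [[A[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
    b_p = [b[perm[i]] for i in range(n)]
    return A_p, b_p

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# 1-D Laplacian on four unknowns with a red-black coloring.
A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
u = [1.0, 2.0, 3.0, 4.0]
b = matvec(A, u)
colors = [1, 2, 1, 2]

perm = permutation_from_colors(colors)
inv = inverse_permutation(perm)
A_p, b_p = permute_system(A, b, perm)
u_p = [u[perm[i]] for i in range(len(u))]   # P u
```

In the permuted matrix the two diagonal blocks (red-red and black-black) are themselves diagonal, which is the red-black structure that a coloring vector is meant to expose.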

IPARM(15)

(IFACT): IFACT indicates whether a factorization associated with a particular preconditioner should be computed for the current NSPCG call or whether a factorization from a previous NSPCG call should be used. For some preconditioners, the factorization time can be a significant percentage of the iteration time. If one is making a series of calls to NSPCG (as in a time-dependent or nonlinear application) and the coefficient matrix is not changing much, it may be reasonable to amortize the cost of factorization over several NSPCG calls by using a factor from a previous call for the current call. The user must call the same preconditioner for all calls, and the structure of A as indicated by JCOEF cannot be changing. [Default: 1]

1	matrix A is to be factored for the current call (default)
0	matrix A is not to be factored for the current call (previous factorization used)

IPARM(16)

(LVFILL): LVFILL is the level of point or block fill-in to be allowed during the factorization. It affects the preconditioners

ICi	(i = 1,2,3)
MICi	(i = 1,2,3)
BICi	(i = 2,3)
BICXi	(i = 2,3)
MBICi	(i = 2,3)
MBICXi	(i = 2,3)

If LVFILL = 0, no fill-in is allowed beyond the original matrix nonzero pattern. If LVFILL = 1, fill-in caused by the original nonzero pattern is allowed, but no further fill-in caused by these newly filled-in elements is allowed. If LVFILL = 2, fill-in is allowed if it is due to the original pattern or to the elements filled in at LVFILL = 1. As an example of point fill-in, suppose a symmetric matrix has diagonals at distances of 0, 1, and s. Then the diagonals of the factorization are:

LVFILL	Diagonals in factorization
...	...,s-2,s-1,s
4	0,1,2,3,s-5,s-4,s-3,s-2,s-1,s

In general, if at LVFILL = k, the positive diagonal numbers are

$$p_1, p_2, \ldots, p_m$$

and the negative diagonal numbers are

$$q_1, q_2, \ldots, q_n$$

then the diagonal numbers at LVFILL = k+1 are

$$p_i + q_j \qquad (i = 1, 2, \ldots, m \mbox{ and } j = 1, 2, \ldots, n)$$

In the example above, at LVFILL = 0 we have $p_1 = 1$, $p_2 = s$, $q_1 = -1$, and $q_2 = -s$. Hence for LVFILL = 1, the diagonal numbers are 0, 1, -1, (s-1), -(s-1), s, -s.

For block factorization methods, LVFILL is the level of block fill-in. For example, if a 7-point finite difference stencil is used to discretize a partial differential equation on a 3-dimensional box domain, the resulting matrix can be regarded as a block pentadiagonal matrix with 1-D blocks corresponding to the mesh lines. The block band is sparse and will tend to fill in during a block factorization. If LVFILL is positive, line blocks which are zero in the original matrix will be allowed to fill in with a bandwidth equal to the bandwidth of the diagonal pivot blocks, which in turn is determined by LTRUNC. In general, an increase in LVFILL results in a more accurate factorization (and fewer iterations) but at the expense of increased storage requirements and possibly more total time. [Default: 0]
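The point fill-in rule can be tried out numerically. The Python sketch below applies the rule "new diagonals are all sums p_i + q_j" level by level; it assumes that diagonals present at one level are retained at the next, which agrees with the LVFILL = 1 example above. The function name `fill_in_diagonals` and the choice s = 100 are illustrative only.

```python
def fill_in_diagonals(initial, lvfill):
    """Return the sorted diagonal numbers after `lvfill` levels of point fill-in."""
    diags = set(initial)
    for _ in range(lvfill):
        positive = [d for d in diags if d > 0]
        negative = [d for d in diags if d < 0]
        # Assumption: earlier diagonals are kept; each level adds every p_i + q_j.
        diags |= {p + q for p in positive for q in negative}
    return sorted(diags)

s = 100                                   # illustrative distance of the outer diagonal
level0 = [0, 1, -1, s, -s]                # symmetric matrix with diagonals 0, 1, s

level1 = fill_in_diagonals(level0, 1)
level4 = fill_in_diagonals(level0, 4)
level4_nonnegative = [d for d in level4 if d >= 0]

for k in range(5):
    upper = [d for d in fill_in_diagonals(level0, k) if d >= 0]
    print("LVFILL =", k, "upper diagonals:", upper)
```

For s = 100 the nonnegative diagonals at LVFILL = 4 come out as 0, 1, 2, 3, s-5, ..., s, in agreement with the LVFILL = 4 row of the table above. The number of diagonals grows with each level, which is the storage cost referred to in the text.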

IPARM(17)

(LTRUNC): LTRUNC determines the truncation bandwidth to be used when approximating the inverses of matrices with dense banded matrices. It affects the preconditioners LJACXi, BICi, BICXi, MBICi, MBICXi for i=2,3. If the band matrix whose inverse is being approximated has a half-bandwidth of s, s + LTRUNC will be the half-bandwidth of the approximating inverse. Thus, LTRUNC is the increase in bandwidth to be used for the inverse approximation over the bandwidth of the original matrix. In general, an increase in LTRUNC means a more accurate factorization at the expense of increased storage. [Default: 0]

IPARM(18)

(IPROPA): IPROPA is a flag to indicate whether or not matrix A has Property A. If a matrix has Property A, it is two-cyclic and can be permuted to a red-black matrix. Also, a considerable savings in storage is possible if a factorization preconditioner is called. IPROPA affects the following preconditioners:

ICi	(i=1,2,3,6)
MICi	(i=1,2,3,6)
BICi	(i=2,3,7)
BICXi	(i=2,3,7)
MBICi	(i=2,3,7)
MBICXi	(i=2,3,7)

For the first two preconditioners, IPROPA refers to point Property A. For the last four preconditioners, IPROPA refers to block Property A (i.e., whether the matrix considered as a block matrix has Property A). IPROPA can assume the following values on input:

= 0	if matrix A does not have Property A
= 1	if matrix A has Property A
= 2	if it is not known whether or not matrix A has Property A; compute if needed (default)

If IPROPA = 2 and LVFILL = 0 on input, and one of the six methods above is used, it is determined whether or not the matrix has Property A, and IPROPA is reset to 0 or 1 accordingly. Determining if matrix A has Property A requires 2N workspace from IWKSP, so it is advantageous for the user to inform NSPCG if it is known beforehand whether or not the matrix has Property A. In general, finite element matrices do not have point Property A. If a 5-point central finite difference stencil is used on a two-dimensional self-adjoint PDE, or if a 7-point central finite difference stencil is used on a three-dimensional self-adjoint PDE, the resulting matrix has Property A. [Default: 2]
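A matrix has point Property A when its unknowns can be split into two sets such that every nonzero off-diagonal entry couples unknowns from different sets, which is the red-black structure mentioned above. The Python sketch below tests this by attempting a two-coloring of the matrix graph with a breadth-first search. It is an illustration of the definition on dense matrices, not the routine NSPCG uses, and the names `has_property_a` and `five_point_laplacian` are hypothetical.

```python
from collections import deque

def has_property_a(A):
    """Return True if the off-diagonal nonzero pattern of A can be two-colored."""
    n = len(A)
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                coupled = j != i and (A[i][j] != 0 or A[j][i] != 0)
                if not coupled:
                    continue
                if color[j] is None:
                    color[j] = 1 - color[i]      # neighbor gets the other color
                    queue.append(j)
                elif color[j] == color[i]:
                    return False                 # two coupled unknowns share a color
    return True

def five_point_laplacian(nx, ny):
    """Dense 5-point finite difference matrix on an nx-by-ny grid."""
    n = nx * ny
    A = [[0.0] * n for _ in range(n)]
    for iy in range(ny):
        for ix in range(nx):
            k = iy * nx + ix
            A[k][k] = 4.0
            if ix > 0:
                A[k][k - 1] = -1.0
            if ix < nx - 1:
                A[k][k + 1] = -1.0
            if iy > 0:
                A[k][k - nx] = -1.0
            if iy < ny - 1:
                A[k][k + nx] = -1.0
    return A

grid_matrix = five_point_laplacian(3, 3)
# Three mutually coupled unknowns, as on a single triangular finite element.
triangle_matrix = [[2.0, -1.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, -1.0, 2.0]]

grid_has_property_a = has_property_a(grid_matrix)
triangle_has_property_a = has_property_a(triangle_matrix)
```

The 5-point matrix passes the test, while the fully coupled triangle does not, consistent with the remarks above about finite difference and finite element matrices.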

IPARM(19)

(KBLSZ): KBLSZ is the 1-D block size. It is used in the line preconditioners. KBLSZ is the largest integer such that, if matrix A is considered as a block matrix, the diagonal blocks have dense bands. [Default: -1]

IPARM(20)

(NBL2D): NBL2D is the 2-D block size. It is used only for the CGCR acceleration, which is applied to 3-D problems on a box domain. [Default: -1]

IPARM(21)

(IFCTV): IFCTV is a switch for indicating whether a scalar or a vectorized routine is to be used for the incomplete factorization of a matrix stored in symmetric or nonsymmetric diagonal storage mode. The vectorized routine should perform better for matrix factorization patterns which have Property A. [Default: 1]

0	use scalar routine
1	use vectorized routine (default)

IPARM(22)

(IQLR): IQLR specifies the orientation of the basic preconditioner. The value of IQLR can be in the range 0 to 3. [Default: 1]

0	no basic preconditioner
1	left preconditioning (default)
2	right preconditioning
3	split preconditioning

IPARM(23)

(ISYMM): ISYMM is a symmetry switch for the matrix. It is used only for the primary format. If the matrix is symmetric, a considerable savings in storage is possible if a factorization preconditioner is called. ISYMM can assume the following values on input:

0	matrix is symmetric
1	matrix is nonsymmetric
2	it is unknown if the matrix is symmetric; NSPCG should determine if the matrix is symmetric or not (default)

If ISYMM = 2 and NSTORE = 1 on input, ISYMM is set to 0 or 1 on output. [Default: 2]

IPARM(24)

(IELIM): IELIM is a switch for effectively removing rows and columns when the diagonal entry is extremely large compared to the nonzero off-diagonal entries in that row. See the description for TOL [= RPARM(15)] for additional details. [Default: 0]

0	test not done (default)
1	test done for removal of rows and columns

IPARM(25)

(NDEG): NDEG specifies the degree of the polynomial preconditioner. [Default: 1]

Description of RPARM parameters

RPARM(1)

(ZETA): ZETA is the stopping test value or approximate relative accuracy desired in the final computed solution. Iteration terminates when the stopping test is less than ZETA. If the method does not converge in ITMAX iterations, ZETA is reset to an estimate of the relative accuracy achieved. [Default: 10^-6]

RPARM(2)

(EMAX): EMAX is an eigenvalue estimate of the preconditioned matrix Q^-1A. In the SPD case, EMAX is an estimate of the largest eigenvalue of Q^-1A. In the nonsymmetric case, EMAX is an estimate of the 2-norm of Q^-1A. EMAX contains on output a final adapted value if MAXADP = 1 and the acceleration allows estimation of eigenvalues. [Default: 2.0]

RPARM(3)

(EMIN): EMIN is an eigenvalue estimate of the preconditioned matrix Q^-1A. In the SPD case, EMIN is an estimate of the smallest eigenvalue of Q^-1A. In the nonsymmetric case, EMIN is an estimate of the 2-norm of the inverse of Q^-1A. EMIN contains on output a final adapted value if MINADP = 1 and the acceleration allows estimation of eigenvalues. [Default: 1.0]

RPARM(4)

(FF): FF is an adaptive procedure damping factor for the estimation of OMEGA for the SSOR methods. Its values lie in the interval (0,1] with 1.0 causing the most frequent parameter changes when IOMGAD = 1 is specified. [Default: 0.75]

RPARM(5)

(FFF): FFF is an adaptive procedure damping factor for changing EMAX and EMIN in the Chebyshev accelerations (SI and SRSI). Its values lie in the interval (0,1] with 1.0 causing the most frequent parameter changes when MAXADP = 1 and MINADP = 1 are specified. [Default: 0.75]

RPARM(6)

(TIMIT): TIMIT on output is the iteration time in seconds of the NSPCG call. The iteration time includes the time to perform all the iterations, including the time to perform the stopping test. [Default: 0.0]

RPARM(7)

(DIGIT1): DIGIT1 is one measure of the approximate number of digits of accuracy of the solution. DIGIT1 is computed as the negative of the logarithm base 10 of the final value of the stopping test. [Default: 0.0]

RPARM(8)

(DIGIT2): DIGIT2 is the approximate number of digits of accuracy using the estimated relative residual with the final approximate solution. DIGIT2 is computed as the negative of the logarithm base 10 of the ratio of the 2-norm of the residual vector and the 2-norm of the right-hand-side vector. This estimate is unrelated to the condition number of the original system and therefore it will not be accurate if the system is ill-conditioned. [Default: 0.0]

Note: DIGIT1 is determined from the actual stopping test computed on the final iteration, whereas DIGIT2 is based on the computed residual vector using the final approximate solution after the algorithm has terminated. If these values differ greatly, then either the stopping test has not worked successfully or the original system is ill-conditioned.
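The two accuracy measures follow directly from their definitions. The Python sketch below computes them for a small hypothetical system; the function names `digit1` and `digit2`, the matrix, and the approximate solution are assumptions chosen for illustration, and the code is not taken from NSPCG.

```python
import math

def norm2(v):
    return math.sqrt(sum(x * x for x in v))

def digit1(stopping_test_value):
    """DIGIT1 = -log10(final value of the stopping test)."""
    return -math.log10(stopping_test_value)

def digit2(A, u, b):
    """DIGIT2 = -log10(||b - A u||_2 / ||b||_2)."""
    residual = [bi - sum(a * x for a, x in zip(row, u)) for row, bi in zip(A, b)]
    ratio = norm2(residual) / norm2(b)
    if ratio == 0.0:
        return math.inf          # exact solution: no finite digit count
    return -math.log10(ratio)

# Hypothetical 2x2 system whose exact solution is u = [1/11, 7/11].
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
u_approx = [0.0909, 0.6364]      # exact solution rounded to four decimals

d1 = digit1(1.0e-6)              # a stopping test value equal to the default ZETA
d2 = digit2(A, u_approx, b)
print("DIGIT1 =", d1, " DIGIT2 =", d2)
```

Here the approximate solution carries about four correct digits, and DIGIT2 lands between 4 and 5. Comparing the two values after a run is the check suggested in the note above.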

RPARM(9)

(OMEGA): OMEGA serves two purposes:

1. It is the overrelaxation parameter $\omega$ for the SOR and SSOR methods. OMEGA contains on output a final adapted value if IOMGAD = 1 is specified and the acceleration allows estimation of $\omega$ (SOR, SRCG, and SRSI only). Otherwise, OMEGA is not changed and the fixed value of OMEGA is used throughout the iterations.
2. It can be used in the modified incomplete factorization methods (both point and block) to specify a degree of modification. In the unmodified incomplete factorization method, a factor element is discarded if it results in fill-in outside the prespecified fill-in region (determined by LVFILL). In the modified incomplete factorization method, that factor element is added to the diagonal element of the row in which it would have caused the fill-in. OMEGA can be used to indicate that $\omega$ times the factor element should be added to the diagonal element instead, where $\omega$ lies in the interval [0,1]. Thus, $\omega=1$ corresponds to full modification and $\omega=0$ corresponds to no modification. This facility is useful, for example, when the IC (ILU) factorization of a matrix exists, but the MIC (MILU) factorization does not. Then a value of $\omega$ between 0 and 1 can be chosen with one of the preconditioners MIC, MBIC, or MBICX to get a stronger factorization than IC, BIC, or BICX, respectively.

[Default: 1.0]

RPARM(10)

(ALPHAB): ALPHAB is an estimate of the minimum eigenvalue of -D^-1(C_L+C_U) where A=D-C_L-C_U for the SSOR methods. ALPHAB contains on output a final estimated value if IOMGAD = 1 is specified and the acceleration allows estimation of ALPHAB (SRCG and SRSI only). ALPHAB only affects the SSOR and LSSOR preconditioners, and is used in the adaptive procedure for $\omega$. [Default: 0.0]

RPARM(11)

(BETAB): BETAB is an estimate of the maximum eigenvalue of D^-1C_LD^-1C_U where A=D-C_L-C_U for the SSOR methods. BETAB contains on output a final estimated value if IOMGAD = 1 is specified and the acceleration allows estimation of BETAB (SRCG and SRSI only). BETAB only affects the SSOR and LSSOR preconditioners, and is used in the $\omega$ adaptive procedure. [Default: 0.25]

RPARM(12)

(SPECR): SPECR is an estimate of the spectral radius of the SOR iteration matrix. SPECR contains on output a final estimated value if IOMGAD = 1 is specified and the acceleration allows estimation of SPECR (SOR only). SPECR only affects the SOR and LSOR preconditioners, and is used in the $\omega$ adaptive procedure. [Default: 0.0]

RPARM(13)

(TIMFAC): TIMFAC on output is the factorization time in seconds required in the NSPCG call. [Default: 0.0]

RPARM(14)

(TIMTOT): TIMTOT on output is the total time in seconds for the NSPCG call. TIMTOT = TIMFAC + TIMIT + other, where "other" includes scaling and permuting, if requested. [Default: 0.0]

RPARM(15)

(TOL): TOL is a tolerance factor used for eliminating certain equations when IELIM = 1 is selected. In that case, rows are eliminated for which the ratio of the sum of the absolute values of the off-diagonal elements to the absolute value of the diagonal element is small (less than TOL). This is done by dividing the right-hand-side entry for that equation by the diagonal entry, setting the diagonal entry equal to one, and setting the off-diagonal entries of that row to zero. The off-diagonal entries of the corresponding column are also set to zero after correcting the right-hand-side vector. This procedure is useful for linear systems arising from finite element discretizations of partial differential equations in which Dirichlet boundary conditions are handled by penalty methods (giving the diagonal values of the corresponding equations extremely large values). (The installer of this package should set the value of SRELPR. See comments in the Installation Guide in Section 18 for additional details.) [Default: 500*SRELPR]
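The elimination procedure described above can be sketched on a dense matrix. The Python code below follows the steps in the text: test the ratio, normalize the row, and move the column's contribution to the right-hand side. It processes rows one after another and modifies its arguments in place, which are assumptions of this sketch rather than documented NSPCG behavior. The function name `eliminate_rows`, the penalty value, and the tolerance are hypothetical.

```python
def eliminate_rows(A, b, tol):
    """Eliminate rows whose off-diagonal sum is tiny relative to the diagonal.

    Modifies A and b in place and returns the list of eliminated row indices.
    """
    n = len(A)
    eliminated = []
    for i in range(n):
        if A[i][i] == 0.0:
            continue                              # ratio undefined; leave the row alone
        off_diagonal_sum = sum(abs(A[i][j]) for j in range(n) if j != i)
        if off_diagonal_sum / abs(A[i][i]) >= tol:
            continue
        b[i] /= A[i][i]                           # divide the right-hand side by the diagonal
        A[i][i] = 1.0                             # unit diagonal
        for j in range(n):
            if j != i:
                A[i][j] = 0.0                     # zero the rest of the row
        for k in range(n):
            if k != i:
                b[k] -= A[k][i] * b[i]            # correct the right-hand side
                A[k][i] = 0.0                     # then zero the column entry
        eliminated.append(i)
    return eliminated

# 1-D Laplacian with a Dirichlet value of 5 imposed on unknown 0 by a penalty.
penalty = 1.0e12
A = [[penalty + 2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
b = [penalty * 5.0, 0.0, 0.0]

eliminated = eliminate_rows(A, b, tol=1.0e-6)
```

After the call, equation 0 reads u_0 = b_0 with b_0 close to the Dirichlet value, and the remaining equations no longer reference u_0, so the huge penalty entry no longer affects the conditioning of the system.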

RPARM(16)

(AINF): AINF is the infinity norm of the matrix A if the LSP preconditioner is used and the infinity norm of $\left(D^{(\pi)}\right)^{-1}A$ if the LLSP preconditioner is used. These preconditioners are only effective if the matrix is SPD or nearly so. If the user does not overwrite the default value, zero, the program attempts to calculate a value for this quantity. [Default: 0.0]

The default values for the IPARM and RPARM variables are given in the tables below.

Table 2: Default Values for IPARM Variables

Position	Name	Default
IPARM(1)	NTEST	2
IPARM(2)	ITMAX	100
IPARM(3)	LEVEL	0
IPARM(4)	NOUT	6
IPARM(5)	IDGTS	0
IPARM(6)	MAXADP	1
IPARM(7)	MINADP	1
IPARM(8)	IOMGAD	1
IPARM(9)	NS1	5
IPARM(10)	NS2	100000
IPARM(11)	NS3	0
IPARM(12)	NSTORE	2
IPARM(13)	ISCALE	0
IPARM(14)	IPERM	0
IPARM(15)	IFACT	1
IPARM(16)	LVFILL	0
IPARM(17)	LTRUNC	0
IPARM(18)	IPROPA	2
IPARM(19)	KBLSZ	-1
IPARM(20)	NBL2D	-1
IPARM(21)	IFCTV	1
IPARM(22)	IQLR	1
IPARM(23)	ISYMM	2
IPARM(24)	IELIM	0
IPARM(25)	NDEG	1

Table 3: Default Values for RPARM Variables

Position	Name	Default
RPARM(1)	ZETA	10^-6
RPARM(2)	EMAX	2.0
RPARM(3)	EMIN	1.0
RPARM(4)	FF	0.75
RPARM(5)	FFF	0.75
RPARM(6)	TIMIT	0.0
RPARM(7)	DIGIT1	0.0
RPARM(8)	DIGIT2	0.0
RPARM(9)	OMEGA	1.0
RPARM(10)	ALPHAB	0.0
RPARM(11)	BETAB	0.25
RPARM(12)	SPECR	0.0
RPARM(13)	TIMFAC	0.0
RPARM(14)	TIMTOT	0.0
RPARM(15)	TOL	500*SRELPR
RPARM(16)	AINF	0.0
