Workspace Requirements

This section gives estimated maximum requirements for real and integer workspace in NSPCG. These values should be used initially for the NW and INW parameters in the NSPCG calling sequence, and the WKSP and IWKSP arrays should be dimensioned to at least NW and INW, respectively. If insufficient real or integer workspace is provided, a fatal error occurs with IER set to -2 or -3, and NW or INW is set to the amount of workspace needed at the point where execution terminated. Thus, an easy way to determine storage requirements is to run the package repeatedly, each time using the values of NW and INW suggested by NSPCG in the previous run, until enough workspace is supplied. On the other hand, if sufficient real and integer workspace is provided, NW and INW are set on output to the amount of real and integer workspace actually required. Thus, workspace requirements can be reduced to these levels when rerunning the same problem.
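The rerun strategy described above can be sketched as a small driver loop. This is only an illustration: `solve` is a hypothetical wrapper around an NSPCG call that returns the output values of IER, NW, and INW, and `fake_solve` is a made-up stand-in used to exercise the loop. Neither is part of NSPCG.

```python
def find_workspace(solve, nw, inw, max_runs=10):
    """Rerun `solve` with the suggested sizes until workspace suffices.

    `solve(nw, inw)` is a hypothetical wrapper returning (ier, nw, inw)
    as left by NSPCG on output.
    """
    for _ in range(max_runs):
        ier, nw_out, inw_out = solve(nw, inw)
        if ier not in (-2, -3):
            # Either success or an unrelated error; on success nw_out and
            # inw_out hold the workspace actually required.
            return ier, nw_out, inw_out
        # Insufficient workspace: adopt the suggested sizes, never shrinking.
        nw, inw = max(nw, nw_out), max(inw, inw_out)
    raise RuntimeError("workspace still insufficient after max_runs runs")


def fake_solve(nw, inw):
    """Made-up solver: needs 50, then 120 real words, and 30 integer words."""
    for need in (50, 120):
        if nw < need:
            return -2, need, inw
    if inw < 30:
        return -3, nw, 30
    return 0, 120, 30


ier, nw, inw = find_workspace(fake_solve, 10, 10)
```

With the stand-in solver, the loop takes four runs to settle on the final sizes, which mirrors how the amount reported at the point of termination may still be too small for later stages.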

Workspace requirements can often be reduced if the user chooses certain IPARM quantities carefully. For the symmetric accelerators, the choice of stopping test, as determined by NTEST [= IPARM(1)], has no effect on workspace requirements. For the nonsymmetric accelerators, some stopping tests can increase storage demands by up to 3N. Thus, a stopping test should be chosen that is efficient in both time and memory. See Section 10 for information regarding the selection of stopping tests in the nonsymmetric case. Also, when Property A affects the preconditioner, 2N integer workspace can be saved by informing NSPCG whether or not matrix A has Property A. This is done by setting IPROPA [= IPARM(18)] appropriately.

In the following formulas, let

N be the problem size

MAXNZ be the active column width of the COEF array

ITMAX be IPARM(2)

NS1 be IPARM(9)

NS2 be IPARM(10)

LVFILL be IPARM(16)

LTRUNC be IPARM(17)

IPROPA be IPARM(18)

KBLSZ be IPARM(19)

NBL2D be IPARM(20)

ISYMM be IPARM(23)

NV be $\max (1,\min ({\rm NS1}, {\rm NS2}-1))$

NH be $2 + \min ({\rm ITMAX}, {\rm NS2})$

NBLK be N/NBL2D

NFACTI be the number of diagonals in the factorization

NCMAX be the maximum number of nodes of the same color in a multicolor preconditioner

KEYGS be the gather/scatter switch set in routine DFAULT (see Section 18 for details on this switch)

DB be the full bandwidth of $D^{(\pi)}$, where $A=D^{(\pi)}+C_L^{(\pi)}+C_U^{(\pi)}$ in block notation

DHB be the half bandwidth of $D^{(\pi)}$

ABAND be the full bandwidth of matrix A, in 2-D blocks (e.g., ABAND = 3 for a 3-D 7-point operator on a box)

Then

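$\begin{eqnarray*}{\rm NW} & = & {\rm NWA}_{\langle{\em accel} \rangle} + {\rm NWS}_{\langle{\em accel} \rangle} + {\rm NWP}_{\langle{\em precon} \rangle} + {\rm NWF}_{\langle{\em precon} \rangle} \\ {\rm INW} & = & {\rm INWP}_{\langle{\em precon} \rangle} + {\rm INWF}_{\langle{\em precon} \rangle} \end{eqnarray*}$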

As noted above, ${\rm NWS}_{\langle{\em accel} \rangle}=0$ if $\langle$ accel $\rangle$ is CG, SI, SOR, SRCG, or SRSI, and can be up to a maximum of $3{\rm N}$ otherwise. A table giving the real workspace required for each accelerator, ${\rm NWA}_{\langle{\em accel} \rangle}$ , is given below, followed by a table giving the real and integer workspace required for each preconditioner, ${\rm NWP}_{\langle{\em precon} \rangle}$ and ${\rm INWP}_{\langle{\em precon} \rangle}$ . Finally, a table is shown giving real and integer workspace requirements for factorizations for certain preconditioners, ${\rm NWF}_{\langle{\em precon} \rangle}$ and ${\rm INWF}_{\langle{\em precon} \rangle}$ .
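As an illustration of how these terms combine, consider CG with the IC1 preconditioner, taking NW as the sum of the accelerator, stopping-test, preconditioner, and factorization terms and INW as the sum of the two integer terms. The case IPROPA = 0, LVFILL = 0, ISYMM = 0 and the values N = 1000, ITMAX = 100, MAXNZ = 3 are assumptions chosen only for this example. With NWS = 0 for CG, NWA from Table 4, NWP and INWP from Table 5, and NWF and INWF from Table 7:

$\begin{eqnarray*}{\rm NW} & = & (3{\rm N}+2*{\rm ITMAX}) + 0 + {\rm N} + \frac{1}{2}{\rm N}*{\rm MAXNZ} \\ & = & 3200 + 0 + 1000 + 1500 \; = \; 5700 \\ {\rm INW} & = & 0 + 0 \; = \; 0 \end{eqnarray*}$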

Table 4: Values for ${\rm NWA}_{\langle{\em accel} \rangle}$

$\langle$ accel $\rangle$ ${\rm NWA}_{\langle{\em accel} \rangle}$

CG $3{\rm N}+2*{\rm ITMAX}$

SI $4{\rm N}$

SRCG $3{\rm N}+2*{\rm ITMAX}$

SRSI $4{\rm N}$

SOR $2{\rm N}$

BASIC $2{\rm N}$

ME $9{\rm N}$

CGNR $4{\rm N}+2*{\rm ITMAX}$

LSQR $5{\rm N}$

ODIR $(2*{\rm NV}+5)*{\rm N}+{\rm NV}$

OMIN $3*{\rm NV}+2+(2*{\rm NV}+6)*{\rm N}+{\rm NH}* ({\rm NV}+2)+{\rm NH}^2 + 2*{\rm NH}$

ORES $(2*{\rm NV}+4)*{\rm N}+{\rm NV}+1$

IOM $(2*{\rm NS1}+2)*{\rm N}+5*{\rm NS1}+1$

GMRES ${\rm NH}*({\rm NV}+3)+{\rm N}*({\rm NV}+3)+2*({\rm NV}+2)^2 +7*{\rm NV}+17+{\rm NH}^2+2*{\rm NH}$

Add ${\rm N}*({\rm NV}+1)$ if IQLR =2 or 3

Add ${\rm N}*({\rm NV}+3)+{\rm NV}+1$ if ${\rm NS1}<{\rm NS2}-1$

USYMLQ $8{\rm N}+11$

USYMQR $8{\rm N}+14$

LDIR $8{\rm N}$

LMIN $8{\rm N}$

LRES $7{\rm N}$

BCGS $9{\rm N}$

CGCR OMIN workspace plus ${\rm N}+{\rm NBLK}+{\rm NBLK}*{\rm ABAND}$
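The entries of Table 4 can be evaluated mechanically once NV and NH are known. The following is a minimal sketch, not part of NSPCG: the function name `nwa` and its argument defaults are hypothetical, NBLK is assumed to use integer division, and the two "Add" lines that follow the GMRES row are assumed to apply to GMRES only.

```python
def nwa(accel, n, itmax, ns1, ns2, iqlr=0, nbl2d=1, aband=1):
    """Estimate NWA for one accelerator, following Table 4.

    The defaults for iqlr, nbl2d, and aband are placeholders for this
    sketch, not NSPCG defaults.
    """
    nv = max(1, min(ns1, ns2 - 1))   # NV as defined in this section
    nh = 2 + min(itmax, ns2)         # NH as defined in this section
    omin = 3*nv + 2 + (2*nv + 6)*n + nh*(nv + 2) + nh**2 + 2*nh

    if accel == "GMRES":
        w = (nh*(nv + 3) + n*(nv + 3) + 2*(nv + 2)**2
             + 7*nv + 17 + nh**2 + 2*nh)
        if iqlr in (2, 3):           # assumed to be a GMRES-only addition
            w += n*(nv + 1)
        if ns1 < ns2 - 1:            # assumed to be a GMRES-only addition
            w += n*(nv + 3) + nv + 1
        return w
    if accel == "CGCR":
        nblk = n // nbl2d            # NBLK = N/NBL2D, integer division assumed
        return omin + n + nblk + nblk*aband

    table = {
        "CG": 3*n + 2*itmax, "SI": 4*n, "SRCG": 3*n + 2*itmax, "SRSI": 4*n,
        "SOR": 2*n, "BASIC": 2*n, "ME": 9*n, "CGNR": 4*n + 2*itmax,
        "LSQR": 5*n,
        "ODIR": (2*nv + 5)*n + nv,
        "OMIN": omin,
        "ORES": (2*nv + 4)*n + nv + 1,
        "IOM": (2*ns1 + 2)*n + 5*ns1 + 1,
        "USYMLQ": 8*n + 11, "USYMQR": 8*n + 14,
        "LDIR": 8*n, "LMIN": 8*n, "LRES": 7*n, "BCGS": 9*n,
    }
    return table[accel]


cg_words = nwa("CG", n=1000, itmax=100, ns1=5, ns2=20)  # 3*1000 + 2*100 = 3200
```

For the same illustrative values (N = 1000, ITMAX = 100, NS1 = 5, NS2 = 20), NV is 5 and NH is 22, so the truncated methods need several times the storage of CG.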

Table 5: Values for ${\rm NWP}_{\langle{\em precon} \rangle}$ and ${\rm INWP}_{\langle{\em precon} \rangle}$

$\langle$ precon $\rangle$ ${\rm NWP}_{\langle{\em precon} \rangle}$ ${\rm INWP}_{\langle{\em precon} \rangle}$

RICH1 0, N if KEYGS = 1 0

JAC1 0, N if KEYGS = 1 0

SOR1 N 0

SSOR1 N, $2{\rm N}$ if ISYMM = 1 0

IC1 N 0

MIC1 N 0

LSP1 $2{\rm N}$ , $3{\rm N}$ if KEYGS = 1 0

NEU1 N, $2{\rm N}$ if KEYGS = 1 0

RICH2 0 0

JAC2 0 0

LJAC2 0 0

LJACX2 0 0

SOR2 0 MAXNZ

SSOR2 0 MAXNZ

IC2 0 NFACTI

MIC2 0 NFACTI

LSP2 $2{\rm N}$ 0

NEU2 N 0
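The conditional entries in Table 5 (those that depend on KEYGS or ISYMM) can be read as in the following sketch. It covers only the eight preconditioners whose names end in 1, and the function name `nwp_suffix1` is hypothetical. INWP is 0 for all of these, so only NWP is returned.

```python
def nwp_suffix1(precon, n, keygs=0, isymm=0):
    """Return NWP from Table 5 for the preconditioners ending in 1."""
    if precon in ("RICH1", "JAC1"):
        return n if keygs == 1 else 0
    if precon in ("SOR1", "IC1", "MIC1"):
        return n
    if precon == "SSOR1":
        return 2*n if isymm == 1 else n
    if precon == "LSP1":
        return 3*n if keygs == 1 else 2*n
    if precon == "NEU1":
        return 2*n if keygs == 1 else n
    raise ValueError("not covered by this sketch: " + precon)


jac1_words = nwp_suffix1("JAC1", 1000, keygs=1)  # N extra words when KEYGS = 1
```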

Table 6: Values for ${\rm NWP}_{\langle{\em precon} \rangle}$ and ${\rm INWP}_{\langle{\em precon} \rangle}$ , cont.

$\langle$ precon $\rangle$ ${\rm NWP}_{\langle{\em precon} \rangle}$ ${\rm INWP}_{\langle{\em precon} \rangle}$

LSOR2 0 0

LSSOR2 N 0

BIC2 KBLSZ 0

MBIC2 KBLSZ 0

BICX2 KBLSZ 0

MBICX2 KBLSZ 0

LLSP2 $2{\rm N}$ 0

LNEU2 $2{\rm N}$ 0

RICH3 0 0

JAC3 0 0

LJAC3 0 0

LJACX3 0 0

SOR3 0 MAXNZ

SSOR3 N MAXNZ

IC3 0 NFACTI

MIC3 0 NFACTI

LSP3 $2{\rm N}$ 0

NEU3 N 0

LSOR3 0 0

LSSOR3 N 0

BIC3 $2*{\rm KBLSZ}$ 0

MBIC3 $2*{\rm KBLSZ}$ 0

BICX3 $2*{\rm KBLSZ}$ 0

MBICX3 $2*{\rm KBLSZ}$ 0

LLSP3 $2{\rm N}$ 0

LNEU3 $2{\rm N}$ 0

RICH4 0, $2{\rm N}$ if KEYGS = 1 0

JAC4 0, $2{\rm N}$ if KEYGS = 1 0

LSP4 $2{\rm N}$ , $4{\rm N}$ if KEYGS = 1 0

NEU4 N, $3{\rm N}$ if KEYGS = 1 0

RICH5 0, $2{\rm N}$ if KEYGS = 1 0

JAC5 0, $2{\rm N}$ if KEYGS = 1 0

LSP5 $2{\rm N}$ , $4{\rm N}$ if KEYGS = 1 0

NEU5 N, $3{\rm N}$ if KEYGS = 1 0

SOR6 N 0

SSOR6 ${\rm N}+{\rm NCMAX}$ 0

IC6 N 0

MIC6 N 0

RS6 $2{\rm N}$ 0

SOR7 0 0

SSOR7 N 0

BIC7 $2*{\rm NCMAX}$ 0

MBIC7 $2*{\rm NCMAX}$ 0

BICX7 $2*{\rm NCMAX}$ 0

MBICX7 $2*{\rm NCMAX}$ 0

RS7 N 0

Table 7: Values for ${\rm NWF}_{\langle{\em precon} \rangle}$ and ${\rm INWF}_{\langle{\em precon} \rangle}$

$\langle$ precon $\rangle$ Case ${\rm NWF}_{\langle{\em precon} \rangle}$ ${\rm INWF}_{\langle{\em precon} \rangle}$

IC1, MIC1: IPROPA = 1, LVFILL = 0 N 0

IPROPA = 0, LVFILL = 0, ISYMM = 0 $\frac{1}{2}{\rm N}*{\rm MAXNZ}$ 0

IPROPA = 0, LVFILL = 0, ISYMM = 1 ${\rm N}*{\rm MAXNZ}$ 0

LVFILL > 0, ISYMM = 0 $>\frac{1}{2}{\rm N}*{\rm MAXNZ}$ $>\frac{1}{2}{\rm N}*{\rm MAXNZ}$

LVFILL > 0, ISYMM = 1 $>{\rm N}*{\rm MAXNZ}$ $>{\rm N}*{\rm MAXNZ}$

IPROPA = 2 0 $\max (2{\rm N},{\rm above})$

LJAC2 all cases ${\rm N}*{\rm DHB}$ 0

LJACX2 all cases ${\rm N}*({\rm DHB}+{\rm LTRUNC})$ 0

IC2, MIC2: IPROPA = 1, LVFILL = 0 N ${\rm MAXNZ}+{\rm MAXNZ}^2$

IPROPA = 0, LVFILL = 0 ${\rm N}*{\rm MAXNZ}$ ${\rm MAXNZ}+{\rm MAXNZ}^2$

LVFILL > 0 $>{\rm N}*{\rm MAXNZ}$ $>{\rm MAXNZ}^2$

IPROPA = 2 0 $\max (2{\rm N},{\rm above})$

LSOR2 all cases ${\rm N}*{\rm DHB}$ $3({\rm MAXNZ}+1)$

LSSOR2 all cases ${\rm N}*{\rm DHB}$ $3({\rm MAXNZ}+1)$

BIC2, MBIC2, BICX2, MBICX2: IPROPA = 1, LVFILL = 0 ${\rm N}*({\rm DHB}+{\rm LTRUNC})$ $3({\rm MAXNZ}+1)$

IPROPA = 0, LVFILL = 0, LTRUNC = 0 ${\rm N}*{\rm MAXNZ}$ $3({\rm MAXNZ}+1)$

otherwise $>{\rm N}*{\rm MAXNZ}$ $3({\rm MAXNZ}+1)$

IPROPA = 2 0 $\max (2\frac{{\rm N}}{{\rm KBLSZ}},{\rm above})$

LLSP2 all cases ${\rm N}*{\rm DHB}$ 0

LNEU2 all cases ${\rm N}*{\rm DHB}$ 0

LJAC3 all cases ${\rm N}*{\rm DB}$ 0

LJACX3 all cases ${\rm N}*({\rm DB}+2{\rm LTRUNC})$ 0

IC3, MIC3: IPROPA = 1, LVFILL = 0 N ${\rm MAXNZ}+{\rm MAXNZ}^2$

IPROPA = 0, LVFILL = 0 ${\rm N}*{\rm MAXNZ}$ ${\rm MAXNZ}+{\rm MAXNZ}^2$

LVFILL > 0 $>{\rm N}*{\rm MAXNZ}$ $>{\rm MAXNZ}^2$

IPROPA = 2 0 $\max (2{\rm N},{\rm above})$

LSOR3 all cases ${\rm N}*{\rm DB}$ $3({\rm MAXNZ}+1)$

LSSOR3 all cases ${\rm N}*{\rm DB}$ $3({\rm MAXNZ}+1)$

BIC3, MBIC3, BICX3, MBICX3: IPROPA = 1, LVFILL = 0 ${\rm N}*({\rm DB}+2{\rm LTRUNC})$ $3({\rm MAXNZ}+1)$

IPROPA = 0, LVFILL = 0, LTRUNC = 0 ${\rm N}*{\rm MAXNZ}$ $3({\rm MAXNZ}+1)$

otherwise $>{\rm N}*{\rm MAXNZ}$ $3({\rm MAXNZ}+1)$

IPROPA = 2 0 $\max (2\frac{{\rm N}}{{\rm KBLSZ}},{\rm above})$

LLSP3 all cases ${\rm N}*{\rm DB}$ 0

LNEU3 all cases ${\rm N}*{\rm DB}$ 0

IC6, MIC6: IPROPA = 1, LVFILL = 0 N 0

IPROPA = 0, LVFILL = 0 ${\rm N}*{\rm MAXNZ}$ 0

LVFILL > 0 not allowed

IPROPA = 2 0 $\max (2{\rm N},{\rm above})$

SOR7 all cases ${\rm N}*{\rm DB}$

SSOR7 all cases ${\rm N}*{\rm DB}$

BIC7, MBIC7, BICX7, MBICX7: IPROPA = 1 ${\rm N}*({\rm DB}+2{\rm LTRUNC})$

IPROPA = 0, LTRUNC = 0 ${\rm N}*{\rm MAXNZ}$

else $>{\rm N}*{\rm MAXNZ}$

RS7 all cases ${\rm N}*{\rm DB}$

others all cases 0 0
