IMSL® FORTRAN MATH LIBRARY
Version 7.1.0
ROGUE WAVE SOFTWARE / 5500 FLATIRON PARKWAY, SUITE 200 / BOULDER, CO 80301, USA / WWW.ROGUEWAVE.COM
© 1970-2014 Rogue Wave Software. Visual Numerics, IMSL, and PV-WAVE are registered trademarks of Rogue
Wave Software, Inc. in the U.S. and other countries. JMSL, JWAVE, TS-WAVE, and PyIMSL are trademarks of Rogue
Wave Software, Inc. or its subsidiaries. All other company, product, or brand names are the property of their
respective owners.
IMPORTANT NOTICE: Information contained in this documentation is subject to change without notice. Use
of this document is subject to the terms and conditions of a Rogue Wave Software License Agreement,
including, without limitation, the Limited Warranty and Limitation of Liability. If you do not accept the terms
of the license agreement, you may not use this documentation and should promptly return the product for a
full refund. This documentation may not be copied or distributed in any form without the express written
consent of Rogue Wave.
ACKNOWLEDGMENTS
This documentation, and the information contained herein (the "Documentation"), contains proprietary information of Rogue Wave Software,
Inc. Any reproduction, disclosure, modification, creation of derivative works from, license, sale, or other transfer of the Documentation without the express written consent of Rogue Wave Software, Inc., is strictly prohibited. The Documentation may contain technical inaccuracies or
typographical errors. Use of the Documentation and implementation of any of its processes or techniques are the sole responsibility of the
client, and Rogue Wave Software, Inc., assumes no responsibility and will not be liable for any errors, omissions, damage, or loss that might
result from any use or misuse of the Documentation.
ROGUE WAVE SOFTWARE, INC., MAKES NO REPRESENTATION ABOUT THE SUITABILITY OF THE DOCUMENTATION. THE DOCUMENTATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. ROGUE WAVE
SOFTWARE, INC., HEREBY DISCLAIMS ALL WARRANTIES AND CONDITIONS WITH REGARD TO THE DOCUMENTATION, WHETHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, INCLUDING WITHOUT LIMITATION ANY
IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NONINFRINGEMENT. IN NO EVENT SHALL ROGUE WAVE SOFTWARE, INC., BE LIABLE, WHETHER IN CONTRACT, TORT, OR
OTHERWISE, FOR ANY SPECIAL, CONSEQUENTIAL, INDIRECT, PUNITIVE, OR EXEMPLARY DAMAGES IN CONNECTION WITH THE USE OF THE DOCUMENTATION.
The Documentation is subject to change at any time without notice.
Rogue Wave Software, Inc.
Address: 5500 Flatiron Parkway, Boulder, CO 80301 USA
Product Information: (303) 473-9118, (800) 487-3217
Fax: (303) 473-9137
Web: http://www.roguewave.com
Contents
Introduction .........................................................................................................................................1
The IMSL Fortran Numerical Library .................................................................................1
User Background .................................................................................................................2
Getting Started .....................................................................................................................3
Finding the Right Routine....................................................................................................3
Organization of the Documentation.....................................................................................4
Naming Conventions ...........................................................................................................5
Using Library Subprograms.................................................................................................6
Programming Conventions ..................................................................................................6
Module Usage ......................................................................................................................7
Using MPI Routines.............................................................................................................7
Programming Tips ...............................................................................................................9
Optional Subprogram Arguments ........................................................................................9
Optional Data .....................................................................................................................10
Overloaded =, /=, etc., for Derived Types .........................................................................11
Error Handling ...................................................................................................................12
Printing Results..................................................................................................................13
Fortran 90 Constructs.........................................................................................................13
Shared-Memory Multiprocessors and Thread Safety ........................................................13
Using Operators and Generic Functions ............................................................................14
Using ScaLAPACK, LAPACK, LINPACK, and EISPACK ...........................................16
Using ScaLAPACK Enhanced Routines ...........................................................................20
Chapter 1: Linear Systems ...................................................................................................................23
Routines .............................................................................................................................23
Usage Notes ......................................................................................................................28
LIN_SOL_GEN .................................................................................................................33
LIN_SOL_SELF ................................................................................................................42
LIN_SOL_LSQ..................................................................................................................52
LIN_SOL_SVD .................................................................................................................61
LIN_SOL_TRI...................................................................................................................70
LIN_SVD...........................................................................................................................83
Parallel Constrained Least-Squares Solvers ......................................................................92
PARALLEL_NONNEGATIVE_LSQ...............................................................................93
PARALLEL_BOUNDED_LSQ......................................................................................101
LSARG ............................................................................................................................109
LSLRG.............................................................................................................................114
LFCRG ............................................................................................................................120
LFTRG.............................................................................................................................126
LFSRG .............................................................................................................................131
LFIRG..............................................................................................................................136
LFDRG ............................................................................................................................141
LINRG .............................................................................................................................143
LSACG ............................................................................................................................147
LSLCG.............................................................................................................................152
LFCCG.............................................................................................................................157
LFTCG.............................................................................................................................163
LFSCG .............................................................................................................................168
LFICG..............................................................................................................................173
LFDCG ............................................................................................................................178
LINCG .............................................................................................................................180
LSLRT .............................................................................................................................185
LFCRT .............................................................................................................................189
LFDRT.............................................................................................................................193
LINRT..............................................................................................................................195
LSLCT .............................................................................................................................197
LFCCT .............................................................................................................................201
LFDCT.............................................................................................................................205
LINCT..............................................................................................................................207
LSADS.............................................................................................................................209
LSLDS .............................................................................................................................214
LFCDS .............................................................................................................................219
LFTDS .............................................................................................................................224
LFSDS .............................................................................................................................229
LFIDS ..............................................................................................................................234
LFDDS.............................................................................................................................239
LINDS..............................................................................................................................241
LSASF .............................................................................................................................245
LSLSF..............................................................................................................................248
LFCSF..............................................................................................................................251
LFTSF..............................................................................................................................254
LFSSF ..............................................................................................................................257
LFISF ...............................................................................................................................260
LFDSF .............................................................................................................................263
LSADH ............................................................................................................................265
LSLDH.............................................................................................................................270
LFCDH ............................................................................................................................275
LFTDH.............................................................................................................................281
LFSDH.............................................................................................................................286
LFIDH..............................................................................................................................291
LFDDH ............................................................................................................................297
LSAHF.............................................................................................................................299
LSLHF .............................................................................................................................302
LFCHF .............................................................................................................................305
LFTHF .............................................................................................................................308
LFSHF .............................................................................................................................311
LFIHF ..............................................................................................................................314
LFDHF.............................................................................................................................317
LSLTR .............................................................................................................................319
LSLCR .............................................................................................................................321
LSARB ............................................................................................................................324
LSLRB ............................................................................................................................327
LFCRB ............................................................................................................................332
LFTRB ............................................................................................................................336
LFSRB ............................................................................................................................339
LFIRB .............................................................................................................................342
LFDRB.............................................................................................................................345
LSAQS ............................................................................................................................347
LSLQS ............................................................................................................................350
LSLPB ............................................................................................................................353
LFCQS ............................................................................................................................356
LFTQS ............................................................................................................................359
LFSQS ............................................................................................................................362
LFIQS ..............................................................................................................................365
LFDQS ............................................................................................................................368
LSLTQ ............................................................................................................................370
LSLCQ.............................................................................................................................372
LSACB ............................................................................................................................375
LSLCB ............................................................................................................................378
LFCCB.............................................................................................................................381
LFTCB ............................................................................................................................384
LFSCB .............................................................................................................................387
LFICB ..............................................................................................................................390
LFDCB.............................................................................................................................394
LSAQH ............................................................................................................................396
LSLQH.............................................................................................................................399
LSLQB ............................................................................................................................402
LFCQH ............................................................................................................................405
LFTQH.............................................................................................................................408
LFSQH.............................................................................................................................411
LFIQH..............................................................................................................................414
LFDQH ............................................................................................................................417
LSLXG.............................................................................................................................419
LFTXG.............................................................................................................................424
LFSXG.............................................................................................................................429
LSLZG .............................................................................................................................432
LFTZG .............................................................................................................................437
LFSZG .............................................................................................................................442
LSLXD.............................................................................................................................446
LSCXD ............................................................................................................................450
LNFXD ............................................................................................................................454
LFSXD.............................................................................................................................459
LSLZD .............................................................................................................................463
LNFZD.............................................................................................................................467
LFSZD .............................................................................................................................471
LSLTO .............................................................................................................................475
LSLTC .............................................................................................................................477
LSLCC .............................................................................................................................479
PCGRC ............................................................................................................................482
JCGRC .............................................................................................................................488
GMRES............................................................................................................................491
ARPACK_SVD ...............................................................................................................502
LSQRR.............................................................................................................................503
LQRRV............................................................................................................................509
LSBRR.............................................................................................................................516
LCLSQ.............................................................................................................................519
LQRRR ............................................................................................................................523
LQERR ............................................................................................................................530
LQRSL.............................................................................................................................535
LUPQR ............................................................................................................................542
LCHRG............................................................................................................................546
LUPCH ............................................................................................................................549
LDNCH............................................................................................................................552
LSVRR.............................................................................................................................556
LSVCR.............................................................................................................................563
LSGRR.............................................................................................................................567
Chapter 2: Eigensystem Analysis ......................................................................................................573
Routines ...........................................................................................................................573
Usage Notes .....................................................................................................................575
Generalized Eigenvalue Problems ...................................................................................579
Using ARPACK for Ordinary and Generalized Eigenvalue Problems ...........................580
LIN_EIG_SELF...............................................................................................................581
LIN_EIG_GEN................................................................................................................588
LIN_GEIG_GEN .............................................................................................................597
EVLRG ............................................................................................................................605
EVCRG............................................................................................................................608
EPIRG ..............................................................................................................................611
EVLCG ............................................................................................................................613
EVCCG ............................................................................................................................616
EPICG ..............................................................................................................................619
EVLSF .............................................................................................................................621
EVCSF .............................................................................................................................623
EVASF.............................................................................................................................626
EVESF .............................................................................................................................628
EVBSF .............................................................................................................................631
EVFSF .............................................................................................................................634
EPISF ...............................................................................................................................637
EVLSB.............................................................................................................................639
EVCSB.............................................................................................................................641
EVASB ............................................................................................................................644
EVESB.............................................................................................................................647
EVBSB.............................................................................................................................650
EVFSB .............................................................................................................................653
EPISB...............................................................................................................................656
EVLHF.............................................................................................................................658
EVCHF ............................................................................................................................661
EVAHF ............................................................................................................................664
EVEHF.............................................................................................................................667
EVBHF ............................................................................................................................670
EVFHF.............................................................................................................................673
EPIHF ..............................................................................................................................676
EVLRH ............................................................................................................................678
EVCRH ............................................................................................................................680
EVLCH ............................................................................................................................683
EVCCH ............................................................................................................................685
GVLRG............................................................................................................................688
GVCRG ...........................................................................................................................691
GPIRG .............................................................................................................................695
GVLCG............................................................................................................................697
GVCCG ...........................................................................................................................700
GPICG .............................................................................................................................703
GVLSP.............................................................................................................................705
GVCSP.............................................................................................................................708
GPISP...............................................................................................................................711
Eigenvalues and Eigenvectors Computed with ARPACK ..............................................713
The Base Class ARPACKBASE .....................................................................................715
ARPACK_SYMMETRIC ...............................................................................................716
ARPACK_SVD ...............................................................................................................731
ARPACK_NONSYMMETRIC.......................................................................................739
ARPACK_COMPLEX ....................................................................................................747
Chapter 3: Interpolation and Approximation ....................................................................................755
Routines ...........................................................................................................................755
Usage Notes .....................................................................................................................758
SPLINE_CONSTRAINTS ..............................................................................................765
SPLINE_VALUES ..........................................................................................................766
SPLINE_FITTING ..........................................................................................................768
SURFACE_CONSTRAINTS..........................................................................................778
SURFACE_VALUES......................................................................................................779
SURFACE_FITTING......................................................................................................781
CSIEZ ..............................................................................................................................792
CSINT..............................................................................................................................795
CSDEC.............................................................................................................................798
CSHER.............................................................................................................................803
CSAKM ...........................................................................................................................806
CSCON ............................................................................................................................809
CSPER .............................................................................................................................813
CSVAL ............................................................................................................................816
CSDER.............................................................................................................................817
CS1GD.............................................................................................................................820
CSITG..............................................................................................................................823
SPLEZ..............................................................................................................................826
BSINT..............................................................................................................................830
BSNAK............................................................................................................................834
BSOPK.............................................................................................................................837
BS2IN ..............................................................................................................................840
BS3IN ..............................................................................................................................845
BSVAL ............................................................................................................................851
BSDER.............................................................................................................................853
BS1GD.............................................................................................................................856
BSITG..............................................................................................................................859
BS2VL .............................................................................................................................862
BS2DR .............................................................................................................................864
BS2GD.............................................................................................................................868
BS2IG ..............................................................................................................................872
BS3VL .............................................................................................................................876
BS3DR .............................................................................................................................878
BS3GD.............................................................................................................................882
BS3IG ..............................................................................................................................887
BSCPP .............................................................................................................................891
PPVAL.............................................................................................................................893
PPDER .............................................................................................................................896
PP1GD .............................................................................................................................899
PPITG ..............................................................................................................................902
QDVAL ...........................................................................................................................905
QDDER............................................................................................................................907
QD2VL ............................................................................................................................910
QD2DR ............................................................................................................................913
QD3VL ............................................................................................................................917
QD3DR ............................................................................................................................920
SURF ...............................................................................................................................925
SURFND..........................................................................................................................929
RLINE..............................................................................................................................933
RCURV............................................................................................................................936
FNLSQ.............................................................................................................................940
BSLSQ .............................................................................................................................945
BSVLS .............................................................................................................................949
CONFT ............................................................................................................................954
BSLS2 ..............................................................................................................................964
BSLS3 ..............................................................................................................................969
CSSED .............................................................................................................................975
CSSMH ............................................................................................................................979
CSSCV.............................................................................................................................982
RATCH ............................................................................................................................985
Chapter 4: Integration and Differentiation ........................................................................................989
Routines ...........................................................................................................................989
Usage Notes .....................................................................................................................991
QDAGS............................................................................................................................995
QDAG ..............................................................................................................................998
QDAGP..........................................................................................................................1002
QDAG1D .......................................................................................................................1006
QDAGI...........................................................................................................................1012
QDAWO ........................................................................................................................1015
QDAWF.........................................................................................................................1019
QDAWS.........................................................................................................................1023
QDAWC ........................................................................................................................1026
QDNG ............................................................................................................................1029
TWODQ.........................................................................................................................1032
QDAG2D .......................................................................................................................1037
QDAG3D .......................................................................................................................1043
QAND ............................................................................................................................1049
QMC ..............................................................................................................................1052
GQRUL..........................................................................................................................1055
GQRCF ..........................................................................................................................1059
RECCF...........................................................................................................................1062
RECQR ..........................................................................................................................1065
FQRUL ..........................................................................................................................1068
DERIV ...........................................................................................................................1072
Chapter 5: Differential Equations ....................................................................................................1077
Routines .........................................................................................................................1077
Usage Notes ...................................................................................................................1079
IVPRK ...........................................................................................................................1083
IVMRK ..........................................................................................................................1091
IVPAG ...........................................................................................................................1101
BVPFD...........................................................................................................................1117
BVPMS..........................................................................................................................1129
DAESL...........................................................................................................................1136
DASPG ..........................................................................................................................1151
IVOAM..........................................................................................................................1152
Introduction to Subroutine PDE_1D_MG .....................................................................1159
PDE_1D_MG.................................................................................................................1161
MMOLCH .....................................................................................................................1192
MOLCH .........................................................................................................................1205
FEYNMAN_KAC .........................................................................................................1206
HQSVAL .......................................................................................................................1263
FPS2H............................................................................................................................1267
FPS3H............................................................................................................................1273
SLEIG ............................................................................................................................1280
SLCNT...........................................................................................................................1292
Chapter 6: Transforms ......................................................................................................................1295
Routines .........................................................................................................................1295
Usage Notes ...................................................................................................................1297
FAST_DFT ....................................................................................................................1300
FAST_2DFT ..................................................................................................................1307
FAST_3DFT ..................................................................................................................1313
FFTRF............................................................................................................................1317
FFTRB ...........................................................................................................................1321
FFTRI.............................................................................................................................1325
FFTCF............................................................................................................................1328
FFTCB ...........................................................................................................................1331
FFTCI.............................................................................................................................1334
FSINT ............................................................................................................................1337
FSINI .............................................................................................................................1339
FCOST ...........................................................................................................................1341
FCOSI ............................................................................................................................1343
QSINF............................................................................................................................1345
QSINB ...........................................................................................................................1347
QSINI.............................................................................................................................1349
QCOSF...........................................................................................................................1351
QCOSB ..........................................................................................................................1353
QCOSI ...........................................................................................................................1355
FFT2D............................................................................................................................1357
FFT2B ............................................................................................................................1361
FFT3F ............................................................................................................................1365
FFT3B ............................................................................................................................1369
RCONV .........................................................................................................................1374
CCONV .........................................................................................................................1379
RCORL ..........................................................................................................................1384
CCORL ..........................................................................................................................1389
INLAP............................................................................................................................1394
SINLP ............................................................................................................................1397
Chapter 7: Nonlinear Equations .......................................................................................................1403
Routines .........................................................................................................................1403
Usage Notes ...................................................................................................................1404
ZPLRC ...........................................................................................................................1405
ZPORC...........................................................................................................................1407
ZPOCC...........................................................................................................................1409
ZANLY ..........................................................................................................................1411
ZUNI ..............................................................................................................................1414
ZBREN ..........................................................................................................................1417
ZREAL...........................................................................................................................1420
NEQNF ..........................................................................................................................1423
NEQNJ...........................................................................................................................1426
NEQBF ..........................................................................................................................1430
NEQBJ ...........................................................................................................................1436
Chapter 8: Optimization ...................................................................................................................1443
Routines .........................................................................................................................1443
Usage Notes ...................................................................................................................1445
UVMIF...........................................................................................................................1449
UVMID ..........................................................................................................................1452
UVMGS .........................................................................................................................1456
UMINF...........................................................................................................................1459
UMING ..........................................................................................................................1465
UMIDH ..........................................................................................................................1471
UMIAH ..........................................................................................................................1476
UMCGF .........................................................................................................................1482
UMCGG.........................................................................................................................1486
UMPOL .........................................................................................................................1490
UNLSF...........................................................................................................................1494
UNLSJ ...........................................................................................................................1500
BCONF ..........................................................................................................................1506
BCONG .........................................................................................................................1513
BCODH .........................................................................................................................1520
BCOAH .........................................................................................................................1526
BCPOL...........................................................................................................................1533
BCLSF ...........................................................................................................................1537
BCLSJ............................................................................................................................1544
BCNLS...........................................................................................................................1551
READ_MPS...................................................................................................................1560
MPS_FREE....................................................................................................................1570
DENSE_LP....................................................................................................................1573
DLPRS ...........................................................................................................................1578
SLPRS............................................................................................................................1582
TRAN.............................................................................................................................1588
QPROG..........................................................................................................................1591
LCONF ..........................................................................................................................1595
LCONG..........................................................................................................................1601
NNLPF...........................................................................................................................1607
NNLPG ..........................................................................................................................1613
CDGRD .........................................................................................................................1621
FDGRD..........................................................................................................................1624
FDHES...........................................................................................................................1627
GDHES ..........................................................................................................................1630
DDJAC...........................................................................................................................1633
FDJAC ...........................................................................................................................1642
CHGRD .........................................................................................................................1645
CHHES ..........................................................................................................................1649
CHJAC...........................................................................................................................1653
GGUES ..........................................................................................................................1657
Chapter 9: Basic Matrix/Vector Operations .....................................................................................1661
Routines .........................................................................................................................1661
Basic Linear Algebra Subprograms...............................................................................1665
Programming Notes for BLAS Using NVIDIA ............................................................1691
CUBLAS_GET..............................................................................................................1699
CUBLAS_SET...............................................................................................................1701
CHECK_BUFFER_ALLOCATION .............................................................................1703
CUDA_ERROR_PRINT ...............................................................................................1704
Other Matrix/Vector Operations....................................................................................1706
CRGRG..........................................................................................................................1707
CCGCG..........................................................................................................................1709
CRBRB ..........................................................................................................................1711
CCBCB ..........................................................................................................................1713
CRGRB..........................................................................................................................1715
CRBRG..........................................................................................................................1717
CCGCB..........................................................................................................................1719
CCBCG..........................................................................................................................1721
CRGCG..........................................................................................................................1723
CRRCR ..........................................................................................................................1725
CRBCB ..........................................................................................................................1727
CSFRG...........................................................................................................................1729
CHFCG ..........................................................................................................................1731
CSBRB...........................................................................................................................1733
CHBCB ..........................................................................................................................1735
TRNRR ..........................................................................................................................1737
MXTXF .........................................................................................................................1739
MXTYF .........................................................................................................................1741
MXYTF .........................................................................................................................1744
MRRRR .........................................................................................................................1746
MCRCR .........................................................................................................................1749
HRRRR ..........................................................................................................................1751
BLINF ............................................................................................................................1753
POLRG ..........................................................................................................................1755
MURRV.........................................................................................................................1758
MURBV.........................................................................................................................1760
MUCRV.........................................................................................................................1762
MUCBV.........................................................................................................................1764
ARBRB ..........................................................................................................................1766
ACBCB ..........................................................................................................................1768
NRIRR ...........................................................................................................................1770
NR1RR...........................................................................................................................1772
NR2RR...........................................................................................................................1774
NR1RB...........................................................................................................................1776
NR1CB...........................................................................................................................1778
DISL2.............................................................................................................................1780
DISL1.............................................................................................................................1782
DISLI .............................................................................................................................1784
VCONR .........................................................................................................................1786
VCONC .........................................................................................................................1789
Extended Precision Arithmetic ......................................................................................1792
Chapter 10: Linear Algebra Operators and Generic Functions
1795
Routines .........................................................................................................................1795
Usage Notes ...................................................................................................................1797
Matrix Optional Data Changes ......................................................................................1798
Dense Matrix Computations ..........................................................................................1800
Dense Matrix Functions.................................................................................................1802
Dense Matrix Parallelism Using MPI ............................................................................1803
Sparse Matrix Computations .........................................................................................1807
.x. ...................................................................................................................................1813
.tx. ..................................................................................................................................1818
.xt. ..................................................................................................................................1822
.hx. .................................................................................................................................1826
.xh. .................................................................................................................................1830
.t. ....................................................................................................................................1834
.h. ...................................................................................................................................1837
.i. ....................................................................................................................................1839
.ix. ..................................................................................................................................1842
.xi. ..................................................................................................................................1854
CHOL.............................................................................................................................1858
COND ............................................................................................................................1861
DET................................................................................................................................1866
DIAG .............................................................................................................................1869
DIAGONALS ................................................................................................................1871
EIG.................................................................................................................................1873
EYE................................................................................................................................1877
FFT.................................................................................................................................1879
FFT_BOX ......................................................................................................................1881
IFFT ...............................................................................................................................1884
IFFT_BOX.....................................................................................................................1886
isNaN .............................................................................................................................1889
NaN................................................................................................................................1890
NORM ...........................................................................................................................1892
ORTH.............................................................................................................................1895
RAND ............................................................................................................................1899
RANK ............................................................................................................................1901
SVD ...............................................................................................................................1903
UNIT..............................................................................................................................1906
Chapter 11: Utilities
1909
Routines .........................................................................................................................1909
Usage Notes for ScaLAPACK Utilities.........................................................................1912
ScaLAPACK_SETUP ...................................................................................................1916
ScaLAPACK_GETDIM ................................................................................................1918
ScaLAPACK_READ.....................................................................................................1919
ScaLAPACK_WRITE ...................................................................................................1921
ScaLAPACK_MAP .......................................................................................................1930
ScaLAPACK_UNMAP .................................................................................................1932
ScaLAPACK_EXIT.......................................................................................................1935
ERROR_POST ..............................................................................................................1936
SHOW............................................................................................................................1939
WRRRN.........................................................................................................................1943
WRRRL .........................................................................................................................1945
WRIRN ..........................................................................................................................1948
WRIRL...........................................................................................................................1950
WRCRN.........................................................................................................................1953
WRCRL .........................................................................................................................1956
WROPT .........................................................................................................................1960
PGOPT...........................................................................................................................1966
PERMU..........................................................................................................................1968
PERMA..........................................................................................................................1970
SORT_REAL.................................................................................................................1973
SVRGN ..........................................................................................................................1976
SVRGP...........................................................................................................................1978
SVIGN ...........................................................................................................................1980
SVIGP ............................................................................................................................1982
SVRBN ..........................................................................................................................1984
SVRBP...........................................................................................................................1986
SVIBN ...........................................................................................................................1988
SVIBP ............................................................................................................................1990
SRCH .............................................................................................................................1992
ISRCH............................................................................................................................1995
SSRCH...........................................................................................................................1997
ACHAR .........................................................................................................................2000
IACHAR ........................................................................................................................2002
ICASE ............................................................................................................................2003
IICSR .............................................................................................................................2005
IIDEX.............................................................................................................................2007
CVTSI ............................................................................................................................2009
CPSEC ...........................................................................................................................2010
TIMDY ..........................................................................................................................2011
TDATE ..........................................................................................................................2013
NDAYS..........................................................................................................................2014
NDYIN...........................................................................................................................2016
IDYWK..........................................................................................................................2018
VERML .........................................................................................................................2020
RAND_GEN ..................................................................................................................2022
RNGET ..........................................................................................................................2029
RNSET...........................................................................................................................2030
RNOPT ..........................................................................................................................2032
RNIN32..........................................................................................................................2034
RNGE32.........................................................................................................................2035
RNSE32 .........................................................................................................................2037
RNIN64..........................................................................................................................2038
RNGE64.........................................................................................................................2039
RNSE64 .........................................................................................................................2041
RNUNF ..........................................................................................................................2042
RNUN ............................................................................................................................2044
FAURE_INIT ................................................................................................................2046
FAURE_FREE...............................................................................................................2047
FAURE_NEXT..............................................................................................................2048
IUMAG..........................................................................................................................2051
UMAG ...........................................................................................................................2054
DUMAG ........................................................................................................................2056
PLOTP ...........................................................................................................................2057
PRIME ...........................................................................................................................2060
CONST ..........................................................................................................................2062
CUNIT ...........................................................................................................................2065
HYPOT ..........................................................................................................................2069
MP_SETUP ...................................................................................................................2071
Reference Material
2077
Contents .........................................................................................................................2077
User Errors.....................................................................................................................2077
ERSET ...........................................................................................................................2080
IERCD and N1RTY.......................................................................................................2081
Machine-Dependent Constants ......................................................................................2085
IMACH ..........................................................................................................................2085
AMACH.........................................................................................................................2087
DMACH.........................................................................................................................2088
IFNAN(X)......................................................................................................................2089
UMACH.........................................................................................................................2091
Matrix Storage Modes ...................................................................................................2093
Reserved Names ............................................................................................................2104
Deprecated Features and Renamed Routines.................................................................2105
Appendix A: Alphabetical Summary of Routines
2109
Appendix B: References
2141
Appendix C: Product Support
2159
Index
2161
Introduction
The IMSL Fortran Numerical Library
The IMSL Fortran Numerical Library consists of two separate but coordinated Libraries that allow easy user
access. These Libraries are organized as follows:
• MATH/LIBRARY general applied mathematics and special functions
  The User’s Guide for IMSL MATH/LIBRARY has two parts:
  1. MATH/LIBRARY
  2. MATH/LIBRARY Special Functions
• STAT/LIBRARY statistics
Most of the routines are available in both single and double precision versions. Many routines for linear solvers and eigensystems are also available for complex and double-complex precision arithmetic. The same user
interface is found on the many hardware versions that span the range from personal computer to
supercomputer.
This library is the result of a merging of the products: IMSL Fortran Numerical Libraries and IMSL Fortran 90
Library.
User Background
To use this product you should be familiar with the Fortran 90 language as well as the withdrawn Fortran 77
language, which is, in practice, a subset of Fortran 90. A summary of the ISO and ANSI standard language is
found in Metcalf and Reid (1990). A more comprehensive illustration is given in Adams et al. (1992).
Those routines implemented in the IMSL Fortran Numerical Library provide a simpler, more reliable user
interface than was possible with Fortran 77. Features of the IMSL Fortran Numerical Library include the use
of descriptive names, short required argument lists, packaged user-interface blocks, a suite of testing and
benchmark software, and a collection of examples. Source code is provided for the benchmark software and
examples.
Some of the routines in the IMSL Fortran Numerical Library can take advantage of a standard Message Passing Interface (MPI) environment but do not require an MPI environment if the user chooses not to take advantage of MPI.
The MPI Capable logo shown below cues the reader when this is the case:
Routines documented with the MPI Capable logo can be called in a scalar or single-computer environment.
Other routines in the IMSL Library take advantage of MPI and require that an MPI environment be present in
order to use them. The MPI Required logo shown below cues the reader when this is the case:
NOTE: It is recommended that users considering using the MPI capabilities of the product read the following sections of the MATH Library documentation:
Introduction: Using MPI Routines
Introduction: Using ScaLAPACK Enhanced Routines
Chapter 10, “Linear Algebra Operators and Generic Functions” – see “Dense Matrix Parallelism Using MPI”.
Vendor Supplied Libraries Usage
The IMSL Fortran Numerical Library contains functions which may take advantage of functions in vendor
supplied libraries such as the Intel® Math Kernel Library (MKL) or the Sun™ High Performance Library.
Functions in the vendor supplied libraries are finely tuned for performance to take full advantage of the
environment for which they are supplied. For these functions, the user of the IMSL Fortran Numerical
Library has the option of linking to code which is based on either the IMSL legacy functions or the functions
in the vendor supplied library. The following icon in the function documentation alerts the reader when this
is the case:
Details on linking to the appropriate IMSL Library and alternate vendor supplied libraries are explained in
the online README file of the product distribution.
Getting Started
The IMSL MATH/LIBRARY is a collection of Fortran routines and functions useful in mathematical analysis
research and application development. Each routine is designed and documented for use in research activities as well as by technical specialists.
To use any of these routines, you must write a program in Fortran 90 (or possibly some other
language) to call the MATH/LIBRARY routine. Each routine conforms to established conventions in programming and documentation. We give first priority in development to efficient algorithms, clear
documentation, and accurate results. The uniform design of the routines makes it easy to use more than one
routine in a given application. Also, you will find that the design consistency enables you to apply your
experience with one MATH/LIBRARY routine to other IMSL routines that you use.
Finding the Right Routine
The MATH/LIBRARY is organized into chapters; each chapter contains routines with similar computational
or analytical capabilities. To locate the right routine for a given problem, you may use either the table of contents located in each chapter introduction, or the alphabetical list of routines.
Often the quickest way to use the MATH/LIBRARY is to find an example similar to your problem and then
to mimic the example. Each routine document has at least one example demonstrating its application. The
example for a routine may be created simply for illustration, it may be from a textbook (with reference to the
source), or it may be from the mathematical literature.
Organization of the Documentation
This manual contains a concise description of each routine, with at least one demonstrated example of each
routine, including sample input and results. You will find all information pertaining to the MATH/LIBRARY
in this manual. Moreover, all information pertaining to a particular routine is in one place within a chapter.
Each chapter begins with an introduction followed by a table of contents that lists the routines included in
the chapter. Documentation of the routines consists of the following information:
• IMSL Routine’s Generic Name
• Purpose: a statement of the purpose of the routine. If the routine is a function rather than a subroutine, the purpose statement will reflect this fact.
• Function Return Value: a description of the return value (for functions only).
• Required Arguments: a description of the required arguments in the order of their occurrence. Input arguments usually occur first, followed by input/output arguments, with output arguments described last. Furthermore, the following terms apply to arguments:
  • Input: Argument must be initialized; it is not changed by the routine.
  • Input/Output: Argument must be initialized; the routine returns output through this argument; cannot be a constant or an expression.
  • Input[/Output]: Argument must be initialized; the routine may return output through this argument based on other optional data the user may choose to pass to this routine; cannot be a constant or an expression.
  • Input or Output: Select the appropriate option to define the argument as either input or output. See individual routines for further instructions.
  • Output: No initialization is necessary; cannot be a constant or an expression. The routine returns output through this argument.
• Optional Arguments: a description of the optional arguments in the order of their occurrence.
• Fortran 90 Interface: a section that describes the generic and specific interfaces to the routine.
• Fortran 77 Style Interface: an optional section, which describes Fortran 77 style interfaces, supplied for backwards compatibility with previous versions of the Library.
• ScaLAPACK Interface: an optional section, which describes an interface to a ScaLAPACK-based version of this routine.
• Description: a description of the algorithm and references to detailed information. In many cases, other IMSL routines with similar or complementary functions are noted.
• Comments: details pertaining to code usage.
• Programming Notes: an optional section that contains programming details not covered elsewhere.
• Example: at least one application of this routine showing input and required dimension and type statements.
• Output: results from the example(s). Note that unique solutions may differ from platform to platform.
• Additional Examples: an optional section with additional applications of this routine showing input and required dimension and type statements.
Naming Conventions
The names of the routines are mnemonic and unique. Most routines are available in both a single precision
and a double precision version, with names of the two versions sharing a common root. The root name is also
the generic interface name. The name of the double precision specific version begins with a “D_” and the single precision specific version begins with an “S_”. For example, the following pairs are precision-specific names of routines in the two different precisions: S_GQRUL/D_GQRUL (the root is “GQRUL,” for “Gauss quadrature rule”) and S_RECCF/D_RECCF (the root is “RECCF,” for “recurrence coefficient”). The precision-specific names of the IMSL routines that return or accept complex data begin with the prefix “C_” or “Z_” for complex or double complex, respectively. Of course, the generic name can be used as an entry point
for all precisions supported.
When this convention is not followed the generic and specific interfaces are noted in the documentation. For
example, in the case of the BLAS and trigonometric intrinsic functions where standard names are already
established, the standard names are used as the precision specific names. There may also be other interfaces
supplied to the routine to provide for backwards compatibility with previous versions of the IMSL Fortran
Numerical Library. These alternate interfaces are noted in the documentation when they are available.
Except when expressly stated otherwise, the names of the variables in the argument lists follow the Fortran
default type for integer and floating point. In other words, a variable whose name begins with one of the letters “I” through “N” is of type INTEGER, and otherwise is of type REAL or DOUBLE PRECISION, depending
on the precision of the routine.
An assumed-size array with more than one dimension that is used as a Fortran argument can have an
assumed-size declarator for the last dimension only. In the MATH/LIBRARY routines, the information about
the first dimension is passed by a variable with the prefix “LD” and with the array name as the root. For
example, the argument LDA contains the leading dimension of array A. In most cases, information about the
dimensions of arrays is obtained from the array through the use of Fortran 90’s size function. Therefore,
arguments carrying this type of information are usually defined as optional arguments.
Where appropriate, the same variable name is used consistently throughout a chapter in the
MATH/LIBRARY. For example, in the routines for random number generation, NR denotes the number of
random numbers to be generated, and R or IR denotes the array that stores the numbers.
When writing programs accessing the MATH/LIBRARY, the user should choose Fortran names that do not
conflict with names of IMSL subroutines, functions, or named common blocks. The careful user can avoid
any conflicts with IMSL names if, in choosing names, the following rules are observed:
• Do not choose a name that appears in the Alphabetical Summary of Routines, at the end of the User’s Manual, nor one of these names preceded by a D, S_, D_, C_, or Z_.
• Do not choose a name consisting of more than three characters with a numeral in the second or third position.
For further details, see the section on Reserved Names in the Reference Material.
Using Library Subprograms
The documentation for the routines uses the generic name and omits the prefix, and hence the entire suite of
routines for that subject is documented under the generic name.
Examples that appear in the documentation also use the generic name. To further illustrate this principle,
note the LIN_SOL_GEN documentation (see Chapter 1, “Linear Systems”), for solving general systems of linear
algebraic equations. A description is provided for just one data type. There are four documented routines in
this subject area: s_lin_sol_gen, d_lin_sol_gen, c_lin_sol_gen, and z_lin_sol_gen.
These routines constitute single-precision, double-precision, complex, and double-complex precision versions of the code.
The Fortran 90 compiler identifies the appropriate routine. Use of a module is required with the routines. The
naming convention for modules joins the suffix “_int” to the generic routine name. Thus, the line “use
lin_sol_gen_int” is inserted near the top of any routine that calls the subprogram “lin_sol_gen”.
More inclusive modules are also available, such as imsl_libraries and numerical_libraries. To avoid name conflicts, Fortran 90 permits re-labeling names defined in modules so they do not conflict with names of routines or variables in the user’s program. The user can also restrict access to names defined in IMSL Library modules by use of the “, ONLY: <list of names>” qualifier on the “use” statement.
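As a minimal sketch of the module convention described above, the following fragment solves a general linear system with the generic routine lin_sol_gen; the data are arbitrary placeholders, and the LIN_SOL_GEN documentation in Chapter 1 remains the definitive statement of the argument list:

      use lin_sol_gen_int
      integer, parameter :: n = 3
      real(kind(1e0)) :: A(n,n), b(n,1), x(n,1)
      ! Arbitrary data for the system Ax = b.
      call random_number(A); call random_number(b)
      ! The generic name resolves to s_lin_sol_gen for these single-precision arrays.
      call lin_sol_gen(A, b, x)
      end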
When dealing with a complex matrix, all references to the transpose of a matrix, A^T, are replaced by the adjoint matrix

A^H ≡ Ā^T

where the overstrike denotes complex conjugation. IMSL Fortran Numerical Library linear algebra software uses this convention to conserve the utility of generic documentation for that code subject. All references to orthogonal matrices are to be replaced by their complex counterparts, unitary matrices. Thus, an n × n orthogonal matrix Q satisfies the condition Q^T Q = I_n, and, for complex matrices, an n × n unitary matrix V satisfies the analogous condition V^H V = I_n.
Programming Conventions
In general, the IMSL MATH/LIBRARY codes are written so that computations are not affected by underflow,
provided the system (hardware or software) places a zero value in the register. In this case, system error messages indicating underflow should be ignored.
IMSL codes are also written to avoid overflow. A program that produces system error messages indicating
overflow should be examined for programming errors such as incorrect input data, mismatch of argument
types, or improper dimensioning.
In many cases, the documentation for a routine points out common pitfalls that can lead to failure of the
algorithm.
Library routines detect error conditions, classify them as to severity, and treat them accordingly. This error-handling capability provides automatic protection for the user without requiring the user to make any specific provisions for the treatment of error conditions. See the section on User Errors in the Reference Material
for further details.
Module Usage
When writing new code that uses this library, users are required to incorporate a “use” statement for the IMSL routine being called near the top of their program. However, legacy code which calls routines in the previous version of the library without a “use” statement will continue to work as before. Also, code
that employed the “use numerical_libraries” statement from the previous version of the library will
continue to work properly with this version of the library.
Users wishing to update existing programs so as to call other routines from this library should incorporate a
use statement for the specific new routine being called. (Here, the term “new routine” implies any routine in
the library, only “new” to the user’s program.) Use of the more encompassing “imsl_libraries” module
in this case could result in argument mismatches for the “old” routine(s) being called. (The compiler would
catch this.)
Users wishing to update existing programs to call the new generic versions of the routines must change their
calls to the existing routines to match the new calling sequences and use either the routine specific interface
modules or the all-encompassing “imsl_libraries” module.
Using MPI Routines
Users of the IMSL Fortran Numerical Library benefit by having a standard (MPI) Message Passing Interface
environment. This is needed to accomplish parallel computing within parts of the Library. Either of the icons
above cues the reader when this is the case. If parallel computing is not required, then the IMSL Library suite of
dummy MPI routines can be substituted for standard MPI routines. All requested MPI routines called by the
IMSL Library are in this dummy suite. Warning messages will appear if a code or example requires more
than one process to execute. Typically users need not be aware of the parallel codes.
NOTE: A standard MPI environment is not part of the IMSL Fortran Numerical Library. The standard
includes a library of MPI Fortran and C routines, MPI “include” files, usage documentation, and other
run-time utilities.
NOTE: Details on linking to the appropriate libraries are explained in the online README file of the product
distribution.
There are three situations of MPI usage in the IMSL Fortran Numerical Library:
1. There are some computations that are performed with the ‘box’ data type that benefit from the use of
parallel processing. For computations involving a single array or a single problem, there is no IMSL
use of parallel processing or MPI codes. The box type data type implies that several problems of the
same size and type are to be computed and solved. Each rack of the box is an independent problem.
This means that each problem could potentially be solved in parallel. The default for computing a box
data type calculation is that a single processor will do all of the problems, one after the other. If this is
acceptable there should be no further concern about which version of the libraries is used for linking.
If the problems are to be solved in parallel, then the user must link with a working version of an MPI
Library and the appropriate IMSL Library. Examples demonstrating the use of box type data may be
found in Chapter 10, “Linear Algebra Operators and Generic Functions”.
NOTE: Box data type routines are marked with the MPI Capable icon.
2. Various routines in Chapter 1, “Linear Systems” allow the user to interface with the ScaLAPACK Library
routines. If the user chooses to run on only one processor then these routines will utilize either IMSL
Library code or LAPACK Library code based on the libraries the user chooses to use during linking. If
the user chooses to run on multiple processors then working versions of MPI, ScaLAPACK, PBLAS,
and Blacs will need to be present. These routines are marked with the MPI Capable icon.
3. There are some routines or operators in the Library that require that a working MPI Library be present
in order for them to run. Examples are the large-scale parallel solvers and the ScaLAPACK utilities.
Routines of this type are marked with the MPI Required icon. For these routines, the user must link
with a working version of an MPI Library and the appropriate IMSL Library.
In all cases described above it is the user’s responsibility to supply working versions of the aforementioned
third party libraries when those libraries are required.
Table 1 below lists the chapters and IMSL routines calling MPI routines or the replacement non-parallel
package.
Table 1 — IMSL Routines Calling MPI Routines or Replacement Non-Parallel Package
Chapter Name and Number                               Routine with MPI Utilized
Linear Systems, 1                                     PARALLEL_NONNEGATIVE_LSQ
Linear Systems, 1                                     PARALLEL_BOUNDED_LSQ
Linear Systems, 1                                     Those routines which utilize ScaLAPACK, listed in Table 5 below
Linear Algebra Operators and Generic Functions, 10    See the entire following Table 2, “Defined Operators and Generic Functions for Dense Arrays”
Utilities, 11                                         ScaLAPACK_SETUP
Utilities, 11                                         ScaLAPACK_GETDIM
Utilities, 11                                         ScaLAPACK_READ
Utilities, 11                                         ScaLAPACK_WRITE
Utilities, 11                                         ScaLAPACK_MAP
Utilities, 11                                         ScaLAPACK_UNMAP
Utilities, 11                                         ScaLAPACK_EXIT
Reference Material                                    Entire Error Processor Package for IMSL Library, if MPI is utilized
Programming Tips
Each subject routine called or otherwise referenced requires the “use” statement for an interface block
designed for that subject routine. The contents of this interface block are the interfaces to the separate routines available for that subject. Packaged descriptive names for option numbers that modify documented
optional data or internal parameters might also be provided in the interface block. Although this seems like
an additional complication, many errors are avoided at an early stage in development through the use of
these interface blocks. The “use” statement is required for each routine called in the user’s program. As illustrated in Examples 3 and 4 in routine lin_geig_gen, the “use” statement is required for defining the
secondary option flags.
The function subprogram for s_NaN() or d_NaN() does not require an interface block because it has only a
single “required” dummy argument. Also, if one is only using the Fortran 77 interfaces supplied for backwards compatibility then the “use” statements are not required.
Optional Subprogram Arguments
IMSL Fortran Numerical Library routines have required arguments and may have optional arguments. All
arguments are documented for each routine. For example, consider the routine lin_sol_gen that solves the
linear algebraic matrix equation Ax = b. The required arguments are three rank-2 Fortran 90 arrays: A, b, and
x. The input data for the problem are the A and b arrays; the solution output is the x array. Often there are
other arguments for this linear solver that are closely connected with the computation but are not as compelling as the primary problem. The inverse matrix A^(-1) may be needed as part of a larger application. To output
this parameter, use the optional argument given by the “ainv=” keyword. The rank-2 output array argument used on the right-hand side of the equal sign contains the inverse matrix. See Example 2 of
LIN_SOL_GEN in Chapter 1, “Linear Systems” for an example of computing the inverse matrix.
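A brief sketch of such a call, with placeholder data, requests the inverse through this keyword as follows; Example 2 of LIN_SOL_GEN remains the authoritative usage:

      use lin_sol_gen_int
      integer, parameter :: n = 3
      real(kind(1e0)) :: A(n,n), b(n,1), x(n,1), ainv(n,n)
      call random_number(A); call random_number(b)
      ! Solve Ax = b and also return the inverse matrix through the
      ! optional "ainv=" keyword.
      call lin_sol_gen(A, b, x, ainv=ainv)
      end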
For compatibility with previous versions of the IMSL Libraries, the NUMERICAL_LIBRARIES interface module includes backwards-compatible positional argument interfaces to all routines that existed in the Fortran
77 version of the Library. Note that it is not necessary to include “use” statements when calling these routines
by themselves. Existing programs that called these routines will continue to work in the same manner as
before.
Some of the primary routines have arguments “epack=” and “iopt=”. As noted the “epack=” argument is
of derived type s_error or d_error. The prefix “s_” or “d_” is chosen depending on the precision of the
data type for that routine. These optional arguments are part of the interface to certain routines, and are used
to modify internal algorithm choices or other parameters.
Optional Data
This additional optional argument (available for some routines) is further distinguished—a derived type
array that contains a number of parameters to modify the internal algorithm of a routine. This derived type
has the name ?_options, where “?_” is either “s_” or “d_”. The choice depends on the precision of the
data type. The declaration of this derived type is packaged within the modules for these codes.
The definition of the derived types is:
type ?_options
integer idummy; real(kind(?)) rdummy
end type
where the “?_” is either “s_” or “d_”, and the kind value matches the desired data type indicated by the
choice of “s” or “d”.
Example 3 of LIN_SOL_GEN in Chapter 1, “Linear Systems” illustrates the use of iterative refinement to compute a double-precision solution based on a single-precision factorization of the matrix. This is
communicated to the routine using an optional argument with optional data. For efficiency of iterative
refinement, perform the factorization step once, and then save the factored matrix in the array A and the pivoting information in the rank-1 integer array, ipivots. By default, the factorization is normally discarded.
To enable the routine to be re-entered with a previously computed factorization of the matrix, optional data
are used as array entries in the “iopt=” optional argument. The packaging of LIN_SOL_GEN includes the
definitions of the self-documenting integer parameters lin_sol_gen_save_LU and
lin_sol_gen_solve_A. These parameters have the values 2 and 3, but the programmer usually does not
need to be aware of it.
The following rules apply to the “iopt=iopt” optional argument:
1. Define a relative index, for example IO, for placing option numbers and data into the array argument
iopt. Initially, set IO = 1. Before a call to the IMSL Library routine, follow Steps 2 through 4.
2. The data structure for the optional data array has the following form:
      iopt(IO) = ?_options(Option_number, Optional_data)
      [iopt(IO + 1) = ?_options(Option_number, Optional_data)]
The length of the data set is specified by the documentation for an individual routine. (The
Optional_data is output in some cases and may not be used in other cases.) The square braces […]
denote optional items.
Illustration: In Example 3 of LIN_EIG_SELF in Chapter 2, “Singular Value and Eigenvalue Decomposition”, a new definition for a small diagonal term is passed to lin_sol_self. There is one line of code required for the change and the new tolerance:
      iopt(1) = d_options(d_lin_sol_self_set_small, &
                epsilon(one)*abs(d(i)))
3. The internal processing of option numbers stops when Option_number == 0 or when
IO > SIZE(iopt). This signals each routine having this optional argument that all desired changes to
default values of internal parameters have been made. This implies that the last option number is the
value zero or the value of SIZE(iopt) matches the last optional value changed.
4. To add more options, replace IO with IO + n, where n is the number of items required for the previous
option. Go to Step 2.
Option numbers can be written in any order, and any selected set of options can be changed from the
defaults. They may be repeated. Example 3 of LIN_SOL_SELF in Chapter 1, “Linear Systems” uses three and
then four option numbers for purposes of computing an eigenvector associated with a known eigenvalue.
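The fragment below sketches the save/re-solve pattern described above for LIN_SOL_GEN, using the documented option parameters lin_sol_gen_save_LU and lin_sol_gen_solve_A. The keyword name “pivots=” for the rank-1 integer array of pivoting information is an assumption made here for illustration; consult the LIN_SOL_GEN documentation for the exact optional argument names.

      use lin_sol_gen_int
      integer, parameter :: n = 3
      real(kind(1e0)) :: A(n,n), b(n,1), x(n,1)
      integer :: ipivots(n)
      type(s_options) :: iopt(1)
      call random_number(A); call random_number(b)
      ! First call: factor A and keep the factors and the pivot array.
      iopt(1) = s_options(lin_sol_gen_save_LU, 0e0)
      call lin_sol_gen(A, b, x, pivots=ipivots, iopt=iopt)   ! "pivots=" is assumed
      ! Later calls: re-enter with the previously computed factorization.
      iopt(1) = s_options(lin_sol_gen_solve_A, 0e0)
      call lin_sol_gen(A, b, x, pivots=ipivots, iopt=iopt)
      end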
Overloaded =, /=, etc., for Derived Types
To assist users in writing compact and readable code, the IMSL Fortran Numerical Library provides overloaded assignment and logical operations for the derived types s_options, d_options, s_error, and
d_error. Each of these derived types has an individual record consisting of an integer and a floating-point
number. The components of the derived types, in all cases, are named idummy followed by rdummy. In many
cases, the item referenced is the component idummy. This integer value can be used exactly as any integer by
use of the component selector character (%). Thus, a program could assign a value and test after calling a
routine:
s_epack(1)%idummy = 0
call lin_sol_gen(A,b,x,epack=s_epack)
if (s_epack(1)%idummy > 0) call error_post(s_epack)
Using the overloaded assignment and logical operations, this code fragment can be written in the equivalent
and more readable form:
s_epack(1) = 0
call lin_sol_gen(A,b,x,epack=s_epack)
if (s_epack(1) > 0) call error_post(s_epack)
Generally the assignments and logical operations refer only to component idummy. The assignment
“s_epack(1)=0” is equivalent to “s_epack(1)=s_error(0,0E0)”. Thus, the floating-point component
rdummy is assigned the value 0E0. The assignment statement “I=s_epack(1)”, for I an integer type, is
equivalent to “I=s_epack(1)%idummy”. The value of component rdummy is ignored in this assignment.
For the logical operators, a single element of any of the IMSL Fortran Numerical Library derived types can
be in either the first or second operand.
Derived Type     Overloaded Assignments and Tests
s_options        I=s_options(1); s_options(1)=I; ==, /=, <, <=, >, >=
d_options        I=d_options(1); d_options(1)=I; ==, /=, <, <=, >, >=
s_epack          I=s_epack(1); s_epack(1)=I; ==, /=, <, <=, >, >=
d_epack          I=d_epack(1); d_epack(1)=I; ==, /=, <, <=, >, >=
In the examples, operator_ex01, … , _ex37, the overloaded assignments and tests have been used whenever they improve the readability of the code.
Error Handling
The routines in the IMSL MATH/LIBRARY attempt to detect and report errors and invalid input. Errors are
classified and are assigned a code number. By default, errors of moderate or worse severity result in messages being automatically printed by the routine. Moreover, errors of worse severity cause program
execution to stop. The severity level and the general nature of the error are designated by an “error type”
ranging from 0 to 5. An error type 0 is no error; types 1 through 5 are progressively more severe. In most
cases, you need not be concerned with our method of handling errors. For those interested, a complete
description of the error-handling system is given in the Reference Material, which also describes how you can
change the default actions and access the error code numbers.
A separate error handler is provided to allow users to handle errors of differing types being reported from
several nodes without danger of “jumbling” or mixing error messages. The design of this error handler is
described more fully in Hanson (1992). The primary feature of the design is the use of a separate array for
each parallel call to a routine. This allows the user to summarize errors using the routine error_post in a
non-parallel part of an application. For a more detailed discussion of the use of this error handler in applications which use MPI for distributed computing, see the Reference Material.
Printing Results
Most of the routines in the IMSL MATH/LIBRARY (except the line printer routines and special utility routines) do not print any of the results. The output is returned in Fortran variables, and you can print these
yourself. See Chapter 11, “Utilities” for detailed descriptions of these routines.
A commonly used routine in the examples is the IMSL routine UMACH (see the Reference Material), which
retrieves the Fortran device unit number for printing the results. Because this routine obtains device unit
numbers, it can be used to redirect the input or output. The section on Machine-Dependent Constants in the
Reference Material contains a description of the routine UMACH.
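A typical pattern, shown here as a brief sketch, obtains the output unit from UMACH before writing results; the argument value 2, which requests the output unit, follows the description of UMACH in the Reference Material:

      integer :: nout
      ! Retrieve the Fortran unit number the library uses for output.
      call umach(2, nout)
      write (nout,*) 'Computed results are written to this unit.'
      end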
Fortran 90 Constructs
The IMSL Fortran Numerical Library contains routines which take advantage of Fortran 90 language constructs, including Fortran 90 array data types. One feature of the design is that the default use may be as
simple as the problem statement. Complicated, professional-quality mathematical software is hidden from
the casual or beginning user.
In addition, high-level operators and functions are provided in the Library. They are described in Chapter 10,
“Linear Algebra Operators and Generic Functions”.
Shared-Memory Multiprocessors and Thread Safety
The IMSL Fortran Numerical Library allows users to leverage the high-performance technology of shared
memory parallelism (SMP) when their environment supports it. Support for SMP systems within the IMSL
Library is delivered through various means, depending upon the availability of technologies such as
OpenMP, high performance LAPACK and BLAS, and hardware-specific IMSL algorithms. Use of the IMSL
Fortran Numerical Library on SMP systems can be achieved by using the appropriate link environment variable when building your application. Details on the available link environment variables for your installation
of the IMSL Fortran Numerical Library can be found in the online README file of the product distribution.
The IMSL Fortran Numerical Library is thread-safe in those environments that support OpenMP. This was
achieved by using OpenMP directives that define global variables located in the code so they are private to
the individual threads. Thread safety allows users to create instances of routines running on multiple threads
and to include any routine in the IMSL Fortran Numerical Library in these threads.
Using Operators and Generic Functions
For users who are primarily interested in easy-to-use software for numerical linear algebra, see Chapter 10,
“Linear Algebra Operators and Generic Functions”. This compact notation for writing Fortran 90 programs,
when it applies, results in code that is easier to read and maintain than traditional subprogram usage.
Users may begin their code development using operators and generic functions. If a more efficient executable
code is required, a user may need to switch to equivalent subroutine calls using IMSL Fortran Numerical
Library routines.
Table 2 and Table 3 contain lists of the defined operators and some of their generic functions.
Table 2 — Defined Operators and Generic Functions for Dense Arrays
Defined Array Operation              Matrix Operation
A .x. B                              AB
.i. A                                A^(-1)
.t. A, .h. A                         A^T, A^*
A .ix. B                             A^(-1)B
B .xi. A                             BA^(-1)
A .tx. B, or (.t. A) .x. B           A^T B
A .hx. B, or (.h. A) .x. B           A^* B
B .xt. A, or B .x. (.t. A)           BA^T
B .xh. A, or B .x. (.h. A)           BA^*
S=SVD(A [,U=U, V=V])                 A = USV^T
E=EIG(A [[,B=B, D=D], V=V, W=W])     (AV = VE), AVD = BVE, (AW = WE), AWD = BWE
R=CHOL(A)                            A = R^T R
Q=ORTH(A [,R=R])                     (A = QR), Q^T Q = I
U=UNIT(A)                            [u_1, …] = [a_1/∥a_1∥, …]
F=DET(A)                             det(A) = determinant
K=RANK(A)                            rank(A) = rank
P=NORM(A [,[type=]i])                i = 1: ∥A∥_1 = max_j Σ_i |a_ij|
                                     i = 2: ∥A∥_2 = largest singular value of A
                                     i = huge(1): ∥A∥_∞ = max_i Σ_j |a_ij|
C=COND(A)                            ∥A^(-1)∥ ⋅ ∥A∥
Z=EYE(N)                             Z = I_N
A=DIAG(X)                            A = diag(x_1, …)
X=DIAGONALS(A)                       x = (a_11, …)
W=FFT(Z); Z=IFFT(W)                  Discrete Fourier Transform, Inverse
A=RAND(A)                            random numbers, 0 < A < 1
L=isNaN(A)                           test for NaN, if (l) then …
Table 3 — Defined Operators and Generic Functions for Harwell-Boeing Sparse Matrices
Defined Operation                    Matrix Operation
Data Management                      Define entries of sparse matrices
A .x. B                              AB
.t. A, .h. A                         A^T, A^*
A .ix. B                             A^(-1)B
B .xi. A                             BA^(-1)
A .tx. B, or (.t. A) .x. B           A^T B
A .hx. B, or (.h. A) .x. B           A^* B
B .xt. A, or B .x. (.t. A)           BA^T
B .xh. A, or B .x. (.h. A)           BA^*
A+B                                  Sum of two sparse matrices
C=COND(A)                            ∥A^(-1)∥ ⋅ ∥A∥
Using ScaLAPACK, LAPACK, LINPACK, and EISPACK
Many of the codes in the IMSL Library are based on LINPACK, Dongarra et al. (1979), and EISPACK, Smith
et al. (1976), collections of subroutines designed in the 1970s and early 1980s. LAPACK, Anderson et al.
(1999), was designed to make the linear solvers and eigensystem routines run more efficiently on high performance computers. For a number of IMSL routines, the user of the IMSL Fortran Numerical Library has the
option of linking to code which is based on either the legacy routines or the more efficient LAPACK routines.
Table 4 below lists the IMSL routines that make use of LAPACK codes. The intent is to obtain improved performance for IMSL codes by using LAPACK codes that, in turn, achieve good performance through optimized BLAS. To obtain this improvement, we recommend linking with High Performance versions of LAPACK and BLAS, if available. The LAPACK codes are listed where they are used. Details on
linking to the appropriate IMSL Library and alternate libraries for LAPACK and BLAS are explained in the
online README file of the product distribution.
Table 4 — IMSL Routines and LAPACK Routines Utilized Within
Generic Name of IMSL Routine     LAPACK Routines Used when Linking with High Performance Libraries
LSARG     ?GERFS, ?GETRF, ?GECON, ?=S/D
LSLRG     ?GETRF, ?GETRS, ?=S/D
LFCRG     ?GETRF, ?GECON, ?=S/D
LFTRG     ?GETRF, ?=S/D
LFSRG     ?GETRS, ?=S/D
LFIRG     ?GETRS, ?=S/D
LINRG     ?GETRF, ?GETRI, ?=S/D
LSACG     ?GETRF, ?GETRS, ?GECON, ?=C/Z
LSLCG     ?GETRF, ?GETRS, ?=C/Z
LFCCG     ?GETRF, ?GECON, ?=C/Z
LFTCG     ?GETRF, ?=C/Z
LFSCG     ?GETRS, ?=C/Z
LFICG     ?GERFS, ?GETRS, ?=C/Z
LINCG     ?GETRF, ?GETRI, ?=C/Z
LSLRT     ?TRTRS, ?=S/D
LFCRT     ?TRCON, ?=S/D
LSLCT     ?TRTRS, ?=C/Z
LFCCT     ?TRCON, ?=C/Z
LSADS     ?PORFS, ?POTRS, ?=S/D
LSLDS     ?POTRF, ?POTRS, ?=S/D
LFCDS     ?POTRF, ?POCON, ?=S/D
LFTDS     ?POTRF, ?=S/D
LFSDS     ?POTRS, ?=S/D
LFIDS     ?PORFS, ?POTRS, ?=S/D
LINDS     ?POTRF, ?=S/D
LSASF     ?SYRFS, ?SYTRF, ?SYTRS, ?=S/D
LSLSF     ?SYTRF, ?SYTRS, ?=S/D
LFCSF     ?SYTRF, ?SYCON, ?=S/D
LFTSF     ?SYTRF, ?=S/D
LFSSF     ?SYTRF, ?=S/D
LFISF     ?SYRFS, ?=S/D
LSADH     ?POCON, ?POTRF, ?POTRS, ?=C/Z
LSLDH     ?TRTRS, ?POTRF, ?=C/Z
LFCDH     ?POTRF, ?POCON, ?=C/Z
LFTDH     ?POTRF, ?=C/Z
LFSDH     ?TRTRS, ?=C/Z
LFIDH     ?PORFS, ?POTRS, ?=C/Z
LSAHF     ?HECON, ?HERFS, ?HETRF, ?HETRS, ?=C/Z
LSLHF     ?HECON, ?HETRF, ?HETRS, ?=C/Z
LFCHF     ?HETRF, ?HECON, ?=C/Z
LFTHF     ?HETRF, ?=C/Z
LFSHF     ?HETRS, ?=C/Z
LFIHF     ?HERFS, ?HETRS, ?=C/Z
LSARB     ?GBTRF, ?GBTRS, ?GBRFS, ?=S/D
LSLRB     ?GBTRF, ?GBTRS, ?=S/D
LFCRB     ?GBTRF, ?GBCON, ?=S/D
LFTRB     ?GBTRF, ?=S/D
LFSRB     ?GBTRS, ?=S/D
LFIRB     ?GBTRS, ?GBRFS, ?=S/D
LSQRR     ?GEQP3, ?GEQRF, ?ORMQR, ?TRTRS, ?=S/D
LQRRV     ?GEQP3, ?GEQRF, ?ORMQR, ?=S/D
LSBRR     ?GEQRF, ?=S/D
LQRRR     ?GEQRF, ?=S/D
LSVRR     ?GESVD, ?=S/D
LSVCR     ?GESVD, ?=C/Z
LSGRR     ?GESVD, ?=S/D
LQRSL     ?TRTRS, ?ORMQR, ?=S/D
LQERR     ?ORGQR, ?=S/D
EVLRG     ?GEBAL, ?GEHRD, ?HSEQR, ?=S/D
EVCRG     ?GEEVX, ?=S/D
EVLCG     ?HSEQR, ?GEBAL, ?GEHRD, ?=C/Z
EVCCG     ?GEEV, ?=C/Z
EVLSF     ?SYEV, ?=S/D
EVCSF     ?SYEV, ?=S/D
EVLHF     ?HEEV, ?=C/Z
EVCHF     ?HEEV, ?=C/Z
GVLRG     ?GEQRF, ?ORMQR, ?GGHRD, ?HGEQZ, ?=S/D
GVCRG     ?GEQRF, ?ORMQR, ?GGHRD, ?HGEQZ, ?TGEVC, ?=S/D
GVLCG     ?GEQRF, ?UNMQR, ?GGHRD, ?HGEQZ, ?=C/Z
GVCCG     ?GEQRF, ?UNMQR, ?GGHRD, ?HGEQZ, ?TGEVC, ?=C/Z
GVLSP     ?SYGV, ?=S/D
GVCSP     ?SYGV, ?=S/D
ScaLAPACK, Blackford et al. (1997), includes a subset of LAPACK codes redesigned for use on distributed
memory MIMD parallel computers. A number of IMSL Library routines make use of a subset of the
ScaLAPACK library.
Table 5 below lists the IMSL routines that make use of ScaLAPACK codes. The intent is to provide access to
the ScaLAPACK codes through the familiar IMSL routine interface. The IMSL routines that utilize
ScaLAPACK codes have a ScaLAPACK Interface documented in addition to the FORTRAN 90 Interface. As with
the LAPACK codes, access to the ScaLAPACK codes is gained by linking to the appropriate library. Details on
linking to the appropriate IMSL Library and alternate libraries for ScaLAPACK and BLAS are explained in
the online README file of the product distribution.
Table 5 — IMSL Routines and ScaLAPACK Routines Utilized Within

Generic Name of     ScaLAPACK Routines used when Linking with
IMSL Routine        High Performance Libraries
LSARG               P?GERFS, P?GETRF, P?GETRS, ?=S/D
LSLRG               P?GETRF, P?GETRS, ?=S/D
LFCRG               P?GETRF, P?GECON, ?=S/D
LFTRG               P?GETRF, ?=S/D
LFSRG               P?GETRS, ?=S/D
LFIRG               P?GETRS, P?GERFS, ?=S/D
LINRG               P?GETRF, P?GETRI, ?=S/D
LSACG               P?GETRF, P?GETRS, P?GERFS, ?=C/Z
LSLCG               P?GETRF, P?GETRS, ?=C/Z
LFCCG               P?GETRF, P?GECON, ?=C/Z
LFTCG               P?GETRF, ?=C/Z
LFSCG               P?GETRS, ?=C/Z
LFICG               P?GERFS, P?GETRS, ?=C/Z
LINCG               P?GETRF, P?GETRI, ?=C/Z
LSLRT               P?TRTRS, ?=S/D
LFCRT               P?TRCON, ?=S/D
LSLCT               P?TRTRS, ?=C/Z
LFCCT               P?TRCON, ?=C/Z
LSADS               P?PORFS, P?POTRF, P?POTRS, ?=S/D
LSLDS               P?POTRF, P?POTRS, ?=S/D
LFCDS               P?POTRF, P?POCON, ?=S/D
LFTDS               P?POTRF, ?=S/D
LFSDS               P?POTRS, ?=S/D
LFIDS               P?PORFS, P?POTRS, ?=S/D
LINDS               P?GETRF, P?GETRI, ?=S/D
LSADH               P?POTRF, P?PORFS, P?POTRS, ?=C/Z
LSLDH               P?POTRS, P?POTRF, ?=C/Z
LFCDH               P?POTRF, P?POCON, ?=C/Z
LFTDH               P?POTRF, ?=C/Z
LFSDH               P?POTRS, ?=C/Z
LFIDH               P?PORFS, P?POTRS, ?=C/Z
LSLRB               P?GBTRF, P?GBTRS, ?=S/D
LSQRR               P?GEQPF, P?GEQRF, P?ORMQR, P?TRTRS, ?=S/D
LQRRV               P?TRTRS, P?GEQRF, P?ORMQR, ?=S/D
LQRRR               P?GEQRF, P?GEQPF, P?ORMQR, ?=S/D
LSVRR               P?GESVD, ?=S/D
LSGRR               P?GESVD, ?=S/D
LQRSL               P?TRTRS, P?ORMQR, ?=S/D
LQERR               P?ORGQR, ?=S/D
Using ScaLAPACK Enhanced Routines
General Remarks
Use of the ScaLAPACK enhanced routines allows a user to solve large linear systems of algebraic equations
at a performance level that might not be achievable on one computer by performing the work in parallel
across multiple computers. One might also use these routines on linear systems that prove to be too large for
the address space of the target computer. Rogue Wave has tried to facilitate the use of parallel computing in
these situations by providing interfaces to ScaLAPACK routines which accomplish the task. The IMSL
Library solver interface has the same look and feel whether one is using the routine on a single computer or
across multiple computers.
The basic steps required to utilize the IMSL routines which interface with ScaLAPACK routines are:
1. Initialize MPI
2. Initialize the processor grid
3. Define any necessary array descriptors
4. Allocate space for the local arrays
5. Set up local matrices across the processor grid
6. Call the IMSL routine which interfaces with ScaLAPACK
7. Gather the results from across the processor grid
8. Release the processor grid
9. Exit MPI
Utilities are provided in the IMSL Library that facilitate these steps for the user. Each of these utilities is documented in Chapter 11, “Utilities”. We visit the steps briefly here:
1. Initialize MPI
The user should call MP_SETUP() in this step. This function is described in detail in “Getting Started with
Modules MPI_setup_int and MPI_node_int” in Chapter 10, “Linear Algebra Operators and Generic Functions”. For
ScaLAPACK usage, suffice it to say that following a call to the function MP_SETUP(), the module
MPI_node_int will contain information about the number of processors, the rank of a processor, and the
communicator for the application. A call to this function will return the number of processors available to
the program. Since the module MPI_node_int is used by MPI_setup_int, it is not necessary to explicitly
use the module MPI_node_int. If MP_SETUP() is not called, the program computes entirely on one node.
No routine from MPI is called.
2. Initialize the processor grid
SCALAPACK_SETUP (see Chapter 11, “Utilities”) is called at this step. This call will set up the processor grid
for the user, define the context ID variable, MP_ICTXT, for the processor grid, and place MP_ICTXT into the
module GRIDINFO_INT. Use of SCALAPACK_SUPPORT will make the information in MPI_NODE_INT and
GRIDINFO_INT available to the user’s program.
3. Define any necessary array descriptors
Consider the generic matrix A which is to be carved up and distributed across the processors in the processor
grid. In ScaLAPACK parlance, we refer to A as being the “global” array A which is to be distributed across the
processor grid in 2D block cyclic fashion (see Chapter 11, “Utilities”). Each processor in the grid will then have
access to a subset of the global array A. We refer to the subset array to which the individual processor has
access as the “local” array A0. Just as it is sometimes necessary for a program to be aware of the leading
dimension of the global array A, it is also necessary for the program to be aware of other critical information
about the local array A0. This information can be obtained by calling the IMSL utility SCALAPACK_GETDIM.
The ScaLAPACK Library utility DESCINIT is then used to store this information in a vector. (For more information, see the Usage Notes section of Chapter 11, “Utilities”.)
4. Allocate space for the local arrays
The array dimensions, obtained in the previous step, are used at this point to allocate space for any local
arrays that will be used in the call to the IMSL routine.
5. Set up local matrices across the processor grid
If the matrices to be used by the solvers have not been distributed across the processor grid, IMSL provides
utility routines SCALAPACK_READ and SCALAPACK_MAP to help in the distribution of global arrays across
processors. SCALAPACK_READ will read data from a file while SCALAPACK_MAP will map a global array to
the processor grid. Users may choose to distribute the arrays themselves as long as they distribute the arrays
in 2D block cyclic fashion consistent with the array descriptors that have been defined.
6. Call the IMSL routine which interfaces with ScaLAPACK
The IMSL routines which interface with ScaLAPACK are listed in Table 5.
7. Gather the results from across the processor grid
IMSL provides utility routines SCALAPACK_WRITE and SCALAPACK_UNMAP to help in the gathering of
results from across processors to a global array or file. SCALAPACK_WRITE will write data to a file while
SCALAPACK_UNMAP will map local arrays from the processor grid to a global array.
8. Release the processor grid
This is accomplished by a call to SCALAPACK_EXIT.
9. Exit MPI
A call to MP_SETUP with the argument ‘FINAL’ will shut down MPI and set the value of MP_NPROCS = 0.
This flags that MPI has been initialized and terminated. It cannot be initialized again in the same program
unit execution. No MPI routine is defined when MP_NPROCS has this value.
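Schematically, the nine steps fit together as in the following sketch. It is an outline only: the module, utility, and variable names (MP_SETUP, SCALAPACK_SETUP, SCALAPACK_GETDIM, DESCINIT, SCALAPACK_MAP, SCALAPACK_UNMAP, SCALAPACK_EXIT, MP_ICTXT, MP_NPROCS) are those introduced above, but the argument lists marked "indicative" are placeholders rather than the documented interfaces; consult Chapter 11, "Utilities", and the ScaLAPACK Interface section of the chosen solver (LSLRG is used here) for the exact forms.

use mpi_setup_int       ! MP_SETUP and the module MPI_node_int
use scalapack_support   ! assumed to supply the Chapter 11 utilities plus the
                        ! contents of MPI_node_int and GRIDINFO_int
implicit none
integer, parameter :: n=500
integer info, mb, nb, mxlda, mxcol, desca(9), descb(9)
real(kind(1d0)) a(n,n), b(n), x(n)
real(kind(1d0)), allocatable :: a0(:,:), b0(:), x0(:)
! 1. Initialize MPI; the function value is the number of processors.
mp_nprocs = mp_setup()
! 2. Initialize the processor grid; MP_ICTXT is placed in GRIDINFO_INT.
call scalapack_setup(n, n, .true., .true.)            ! indicative argument list
! 3. Local block sizes and array descriptors.
call scalapack_getdim(n, n, mb, nb, mxlda, mxcol)     ! indicative argument list
call descinit(desca, n, n, mb, nb, 0, 0, mp_ictxt, mxlda, info)
call descinit(descb, n, 1, mb, nb, 0, 0, mp_ictxt, mxlda, info)
! 4. Allocate space for the local arrays.
allocate (a0(mxlda,mxcol), b0(mxlda), x0(mxlda))
! For the sketch, give the global arrays some data (normally read or computed on one node).
call random_number(a); call random_number(b)
! 5. Distribute the global arrays across the processor grid.
call scalapack_map(a, desca, a0)                      ! indicative argument list
call scalapack_map(b, descb, b0)
! 6. Call the IMSL routine through its ScaLAPACK interface.
call lslrg(a0, b0, x0)                                ! indicative argument list
! 7. Gather the distributed solution back into a global array.
call scalapack_unmap(x0, descb, x)                    ! indicative argument list
! 8. Release the processor grid.
call scalapack_exit(mp_ictxt)
! 9. Shut down MPI; MP_NPROCS is reset to 0.
mp_nprocs = mp_setup('FINAL')
end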
Chapter 1: Linear Systems
Routines
1.1     Linear Solvers
1.1.1   Solves a general system of linear equations Ax = b . . . LIN_SOL_GEN    33
1.1.2   Solves a system of linear equations Ax = b, where A is a self-adjoint matrix . . . LIN_SOL_SELF    42
1.1.3   Solves a rectangular system of linear equations Ax ≅ b, in a least-squares sense . . . LIN_SOL_LSQ    52
1.1.4   Solves a rectangular least-squares system of linear equations Ax ≅ b using singular value decomposition . . . LIN_SOL_SVD    61
1.1.5   Solves multiple systems of linear equations . . . LIN_SOL_TRI    70
1.1.6   Computes the singular value decomposition (SVD) of a rectangular matrix, A . . . LIN_SVD    83

1.2     Large-Scale Parallel Solvers
1.2.1   Parallel Constrained Least-Squares Solvers . . . 92
1.2.2   Solves a linear, non-negative constrained least-squares system . . . PARALLEL_NONNEGATIVE_LSQ    93
1.2.3   Solves a linear least-squares system with bounds on the unknowns . . . PARALLEL_BOUNDED_LSQ    101

1.3     Solution of Linear Systems, Matrix Inversion, and Determinant Evaluation
1.3.1   Real General Matrices
        High accuracy linear system solution . . . LSARG    109
        Solves a linear system . . . LSLRG    114
        Factors and computes condition number . . . LFCRG    120
        Factors . . . LFTRG    126
        Solves after factoring . . . LFSRG    131
        High accuracy linear system solution after factoring . . . LFIRG    136
        Computes determinant after factoring . . . LFDRG    141
        Inverts . . . LINRG    143
1.3.2   Complex General Matrices
        High accuracy linear system solution . . . LSACG    147
        Solves a linear system . . . LSLCG    152
        Factors and computes condition number . . . LFCCG    157
        Factors . . . LFTCG    163
        Solves a linear system after factoring . . . LFSCG    168
        High accuracy linear system solution after factoring . . . LFICG    173
        Computes determinant after factoring . . . LFDCG    178
        Inverts . . . LINCG    180
1.3.3   Real Triangular Matrices
        Solves a linear system . . . LSLRT    185
        Computes condition number . . . LFCRT    189
        Computes determinant after factoring . . . LFDRT    193
        Inverts . . . LINRT    195
1.3.4   Complex Triangular Matrices
        Solves a linear system . . . LSLCT    197
        Computes condition number . . . LFCCT    201
        Computes determinant after factoring . . . LFDCT    205
        Inverts . . . LINCT    207
1.3.5   Real Positive Definite Matrices
        High accuracy linear system solution . . . LSADS    209
        Solves a linear system . . . LSLDS    214
        Factors and computes condition number . . . LFCDS    219
        Factors . . . LFTDS    224
        Solves a linear system after factoring . . . LFSDS    229
        High accuracy linear system solution after factoring . . . LFIDS    234
        Computes determinant after factoring . . . LFDDS    239
        Inverts . . . LINDS    241
1.3.6   Real Symmetric Matrices
        High accuracy linear system solution . . . LSASF    245
        Solves a linear system . . . LSLSF    248
        Factors and computes condition number . . . LFCSF    251
        Factors . . . LFTSF    254
        Solves a linear system after factoring . . . LFSSF    257
        High accuracy linear system solution after factoring . . . LFISF    260
        Computes determinant after factoring . . . LFDSF    263
1.3.7   Complex Hermitian Positive Definite Matrices
        High accuracy linear system solution . . . LSADH    265
        Solves a linear system . . . LSLDH    270
        Factors and computes condition number . . . LFCDH    275
        Factors . . . LFTDH    281
        Solves a linear system after factoring . . . LFSDH    286
        High accuracy linear system solution after factoring . . . LFIDH    291
        Computes determinant after factoring . . . LFDDH    297
1.3.8   Complex Hermitian Matrices
        High accuracy linear system solution . . . LSAHF    299
        Solves a linear system . . . LSLHF    302
        Factors and computes condition number . . . LFCHF    305
        Factors . . . LFTHF    308
        Solves a linear system after factoring . . . LFSHF    311
        High accuracy linear system solution after factoring . . . LFIHF    314
        Computes determinant after factoring . . . LFDHF    317
1.3.9   Real Band Matrices in Band Storage
        Solves a tridiagonal system . . . LSLTR    319
        Solves a tridiagonal system: Cyclic Reduction . . . LSLCR    321
        High accuracy linear system solution . . . LSARB    324
        Solves a linear system . . . LSLRB    327
        Factors and computes condition number . . . LFCRB    332
        Factors . . . LFTRB    336
        Solves a linear system after factoring . . . LFSRB    339
        High accuracy linear system solution after factoring . . . LFIRB    342
        Computes determinant after factoring . . . LFDRB    345
1.3.10  Real Band Symmetric Positive Definite Matrices in Band Storage
        High accuracy linear system solution . . . LSAQS    347
        Solves a linear system . . . LSLQS    350
        Solves a linear system . . . LSLPB    353
        Factors and computes condition number . . . LFCQS    356
        Factors . . . LFTQS    359
        Solves a linear system after factoring . . . LFSQS    362
        High accuracy linear system solution after factoring . . . LFIQS    365
        Computes determinant after factoring . . . LFDQS    368
1.3.11  Complex Band Matrices in Band Storage
        Solves a tridiagonal system . . . LSLTQ    370
        Solves a tridiagonal system: Cyclic Reduction . . . LSLCQ    372
        High accuracy linear system solution . . . LSACB    375
        Solves a linear system . . . LSLCB    378
        Factors and computes condition number . . . LFCCB    381
        Factors . . . LFTCB    384
        Solves a linear system after factoring . . . LFSCB    387
        High accuracy linear system solution after factoring . . . LFICB    390
        Computes determinant after factoring . . . LFDCB    394
1.3.12  Complex Band Positive Definite Matrices in Band Storage
        High accuracy linear system solution . . . LSAQH    396
        Solves a linear system . . . LSLQH    399
        Solves a linear system . . . LSLQB    402
        Factors and computes condition number . . . LFCQH    405
        Factors . . . LFTQH    408
        Solves a linear system after factoring . . . LFSQH    411
        High accuracy linear system solution after factoring . . . LFIQH    414
        Computes determinant after factoring . . . LFDQH    417
1.3.13  Real Sparse Linear Equation Solvers
        Solves a sparse linear system . . . LSLXG    419
        Factors . . . LFTXG    424
        Solves a linear system after factoring . . . LFSXG    429
1.3.14  Complex Sparse Linear Equation Solvers
        Solves a sparse linear system . . . LSLZG    432
        Factors . . . LFTZG    437
        Solves a linear system after factoring . . . LFSZG    442
1.3.15  Real Sparse Symmetric Positive Definite Linear Equation Solvers
        Solves a sparse linear system . . . LSLXD    446
        Symbolic Factor . . . LSCXD    450
        Computes Factor . . . LNFXD    454
        Solves a linear system after factoring . . . LFSXD    459
1.3.16  Complex Sparse Hermitian Positive Definite Linear Equation Solvers
        Solves a sparse linear system . . . LSLZD    463
        Computes Factor . . . LNFZD    467
        Solves a linear system after factoring . . . LFSZD    471
1.3.17  Real Toeplitz Matrices in Toeplitz Storage
        Solves a linear system . . . LSLTO    475
1.3.18  Complex Toeplitz Matrices in Toeplitz Storage
        Solves a linear system . . . LSLTC    477
1.3.19  Complex Circulant Matrices in Circulant Storage
        Solves a linear system . . . LSLCC    479
1.3.20  Iterative Methods
        Preconditioned conjugate gradient . . . PCGRC    482
        Jacobi conjugate gradient . . . JCGRC    488
        Generalized minimum residual . . . GMRES    491
        Partial Singular Value Decomposition . . . ARPACK_SVD    502
1.4     Linear Least Squares and Matrix Factorization
1.4.1   Least Squares, QR Decomposition and Generalized Inverse
        Solves a Least-squares system . . . LSQRR    503
        Solves a Least-squares system . . . LQRRV    509
        High accuracy Least squares . . . LSBRR    516
        Linearly constrained Least squares . . . LCLSQ    519
        QR decomposition . . . LQRRR    523
        Accumulation of QR decomposition . . . LQERR    530
        QR decomposition Utilities . . . LQRSL    535
        QR factor update . . . LUPQR    542
1.4.2   Cholesky Factorization
        Cholesky factoring for rank deficient matrices . . . LCHRG    546
        Cholesky factor update . . . LUPCH    549
        Cholesky factor down-date . . . LDNCH    552
1.4.3   Singular Value Decomposition (SVD)
        Real singular value decomposition . . . LSVRR    556
        Complex singular value decomposition . . . LSVCR    563
        Generalized inverse . . . LSGRR    567
Usage Notes
Section 1.1 describes routines for solving systems of linear algebraic equations by direct matrix factorization
methods, for computing only the matrix factorizations, and for computing linear least-squares solutions.
Section 1.2 describes routines for solving systems of parallel constrained least-squares.
Many of the routines described in sections 1.3 and 1.4 are for matrices with special properties or structure.
Computer time and storage requirements for solving systems with coefficient matrices of these types can
often be drastically reduced, using the appropriate routine, compared with using a routine for solving a general complex system.
The appropriate matrix property and corresponding routine can be located in the “Routines” section. Many
of the linear equation solver routines in this chapter are derived from subroutines from LINPACK, Dongarra
et al. (1979). Other routines have been developed by Visual Numerics, derived from draft versions of
LAPACK subprograms, Bischof et al. (1988), or were obtained from alternate sources.
A system of linear equations is represented by Ax = b where A is the n × n coefficient data matrix, b is the
known right-hand-side n-vector, and x is the unknown or solution n-vector. Figure 1-1 summarizes the relationships among the subroutines. Routine names are in boxes and input/output data are in ovals. The suffix
** in the subroutine names depends on the matrix type. For example, to compute the determinant of A use
LFC** or LFT** followed by LFD**.
The paths using LSA** or LFI** use iterative refinement for a more accurate solution. The path using
LSA** is the same as using LFC** followed by LFI**. The path using LSL** is the same as the path using
LFC** followed by LFS**. The matrix inversion routines LIN** are available only for certain matrix types.
Matrix Types
The two letter codes for the form of coefficient matrix, indicated by ** in Figure 1.1, are as follows:
RG          Real general (square) matrix.
CG          Complex general (square) matrix.
TR or CR    Real tridiagonal matrix.
RB          Real band matrix.
TQ or CQ    Complex tridiagonal matrix.
CB          Complex band matrix.
SF          Real symmetric matrix stored in the upper half of a square matrix.
DS          Real symmetric positive definite matrix stored in the upper half of a square matrix.
DH          Complex Hermitian positive definite matrix stored in the upper half of a complex square matrix.
HF          Complex Hermitian matrix stored in the upper half of a complex square matrix.
QS or PB    Real symmetric positive definite band matrix.
QH or QB    Complex Hermitian positive definite band matrix.
XG          Real general sparse matrix.
ZG          Complex general sparse matrix.
XD          Real symmetric positive definite sparse matrix.
ZD          Complex Hermitian positive definite sparse matrix.
Figure 1.1 — Solution and Factorization of Linear Systems
Solution of Linear Systems
The simplest routines to use for solving linear equations are LSL** and LSA**. For example, the mnemonic
for matrices of real general form is RG. So, the routines LSARG and LSLRG are appropriate for solving
linear systems whose coefficient matrix is of real general form. The routine LSARG uses iterative refinement, and therefore more computer time than LSLRG, to determine a high accuracy solution.
The high accuracy solvers provide maximum protection against extraneous computational errors. They do
not protect the results from instability in the mathematical approximation. For a more complete discussion of
this and other important topics about solving linear equations, see Rice (1983), Stewart (1973), or Golub and
van Loan (1989).
Multiple Right Sides
There are situations where the LSL** and LSA** routines are not appropriate. For example, if the linear system has more than one right-hand-side vector, it is most economical to solve the system by first calling a
factoring routine and then calling a solver routine that uses the factors. After the coefficient matrix has been
factored, the routine LFS** or LFI** can be used to solve for one right-hand side at a time. The routines LFI**
use iterative refinement to determine a high accuracy solution but require more computer time and storage
than the routines LFS**.
Determinants
The routines for evaluating determinants are named LFD**. As indicated in Figure 1-1, these routines
require the factors of the matrix as input. The values of determinants are often badly scaled, which introduces additional complications into the structures for evaluating them. See Rice (1983) for comments on
determinant evaluation.
Iterative Refinement
Iterative refinement can often improve the accuracy of a well-posed numerical solution. The iterative refinement algorithm used is as follows:
    x_0 = A^(-1) b
    For i = 1, 50
        r_i = A x_(i-1) - b    (computed in higher precision)
        p_i = A^(-1) r_i
        x_i = x_(i-1) - p_i
        if (∥p_i∥∞ ≤ ε∥x_i∥∞) Exit
    End for
    Error — Matrix is too ill-conditioned

If the matrix A is in single precision, then the residual r_i = A x_(i-1) - b is computed in double precision. If A is in double precision, then quadruple-precision arithmetic routines are used.
The use of the value 50 is arbitrary. In fact, a single correction is usually sufficient. It is also helpful even when r_i is computed in the same precision as the data.
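The loop above can be written compactly with the Fortran 90 routine lin_sol_gen documented later in this chapter. The sketch below is illustrative only: it uses a single-precision matrix with a double-precision residual, and it refactors A on every pass; Example 3 of LIN_SOL_GEN shows the preferred form that saves and reuses the LU factorization.

use lin_sol_gen_int
implicit none
integer, parameter :: n=32
integer i
real(kind(1e0)) a(n,n), r(n,1), p(n,1)     ! single-precision matrix and correction
real(kind(1d0)) ad(n,n), b(n,1), x(n,1)    ! double-precision copy, data, and solution
call random_number(a); call random_number(b)
ad = a
x = 0d0
do i = 1, 50
   ! Residual computed in the higher (double) precision.
   r = b - matmul(ad, x)
   ! Correction p = A**(-1) r from the single-precision solve.
   call lin_sol_gen(a, r, p)
   x = x + p
   ! Stop when the correction is negligible relative to the solution.
   if (maxval(abs(p)) <= epsilon(1e0)*maxval(abs(x))) exit
end do
end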
Matrix Inversion
An inverse of the coefficient matrix can be computed directly by one of the routines named LIN**. These
routines are provided for general matrix forms and some special matrix forms. When they do not exist, or
when it is desirable to compute a high accuracy inverse, the two-step technique of calling the factoring routine followed by the solver routine can be used. The inverse is the solution of the matrix system AX = I where
I denotes the n × n identity matrix, and the solution is X = A-1.
Singularity
The numerical and mathematical notions of singularity are not the same. A matrix is considered numerically
singular if it is sufficiently close to a mathematically singular matrix. If error messages are issued regarding
an exact singularity then specific error message level reset actions must be taken to handle the error condition. By default, the routines in this chapter stop. The solvers require that the coefficient matrix be
numerically nonsingular. There are some tests to determine if this condition is met. When the matrix is factored, using routines LFC**, the condition number is computed. If the condition number is large compared
to the working precision, a warning message is issued and the computations are continued. In this case, the
user needs to verify the usability of the output. If the matrix is determined to be mathematically singular, or
ill-conditioned, a least-squares routine or the singular value decomposition routine may be used for further
analysis.
Special Linear Systems
Toeplitz matrices have entries which are constant along each diagonal, for example:
    A = [ p0   p1   p2   p3
          p-1  p0   p1   p2
          p-2  p-1  p0   p1
          p-3  p-2  p-1  p0 ]
Real Toeplitz systems can be solved using LSLTO. Complex Toeplitz systems can be solved using LSLTC.
Circulant matrices have the property that each row is obtained by shifting the row above it one place to the
right. Entries that are shifted off at the right reenter at the left. For example:
    A = [ p1  p2  p3  p4
          p4  p1  p2  p3
          p3  p4  p1  p2
          p2  p3  p4  p1 ]
Complex circulant systems can be solved using LSLCC.
Iterative Solution of Linear Systems
The preconditioned conjugate gradient routines PCGRC and JCGRC can be used to solve symmetric positive
definite systems. The routines are particularly useful if the system is large and sparse. These routines use
reverse communication, so A can be in any storage scheme. For general linear systems, use GMRES.
QR Decomposition
The QR decomposition of a matrix A consists of finding an orthogonal matrix Q, a permutation matrix P, and
an upper trapezoidal matrix R with diagonal elements of nonincreasing magnitude, such that AP = QR. This
decomposition is determined by the routines LQRRR or LQRRV. It returns R and the information needed to
compute Q. To actually compute Q use LQERR. Figure 1.2 summarizes the relationships among the
subroutines.
The QR decomposition can be used to solve the linear system Ax = b. This is equivalent to Rx = QTPb. The
routine LQRSL can be used to find QTPb from the information computed by LQRRR. Then x can be computed
by solving a triangular system using LSLRT. If the system Ax = b is overdetermined, then this procedure
solves the least-squares problem, i.e., it finds an x for which
∥Ax − b∥ is a minimum.
If the matrix A is changed by a rank-1 update, A→A + αxyT, the QR decomposition of A can be
updated/down-dated using the routine LUPQR. In some applications a series of linear systems which differ
by rank-1 updates must be solved. Computing the QR decomposition once and then updating or down-dating it is usually faster than solving each system anew.
Figure 1.2 — Least-Squares Routine
LIN_SOL_GEN
Solves a general system of linear equations Ax = b. Using optional arguments, any of several related computations can be performed. These extra tasks include computing the LU factorization of A using partial
pivoting, representing the determinant of A, computing the inverse matrix A-1, and solving ATx = b or Ax = b
given the LU factorization of A.
Required Arguments
A — Array of size n × n containing the matrix. (Input [/Output])
If the packaged option lin_sol_gen_save_LU is used then the LU factorization of A is saved in A.
For solving efficiency, the diagonal reciprocals of the matrix U are saved in the diagonal entries of A.
B — Array of size n × nb containing the right-hand side matrix. (Input [/Output])
If the packaged option lin_sol_gen_save_LU is used then input B is used as work storage and is
not saved.
X — Array of size n × nb containing the solution matrix.(Output)
Optional Arguments
NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = size (A, 1)
NRHS = nb (Input)
Uses array b(1:n, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.
pivots = pivots(:) (Output [/Input])
Integer array of size n that contains the individual row interchanges. To construct the permuted order
so that no pivoting is required, define an integer array ip(n). Initialize ip(i) = i, i = 1, n and then execute the loop, after calling lin_sol_gen,
k=pivots(i)
interchange ip(i) and ip(k), i=1,n
The matrix defined by the array assignment that permutes the rows, A(1:n, 1:n) = A(ip(1:n), 1:n),
requires no pivoting for maintaining numerical stability. Now, the optional argument “iopt=” and
the packaged option number ?_lin_sol_gen_no_pivoting can be safely used for increased efficiency during the LU factorization of A.
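As a sketch, the construction just described might be coded as follows (ip, k, and itemp are local names introduced here for illustration only):

use lin_sol_gen_int
implicit none
integer, parameter :: n=4
integer i, k, itemp, ip(n), pivots(n)
real(kind(1e0)) a(n,n), b(n,1), x(n,1), ap(n,n)
call random_number(a); call random_number(b)
! Solve Ax = b and retrieve the row interchanges.
call lin_sol_gen(a, b, x, pivots=pivots)
! Build the permuted row order described above.
do i = 1, n
   ip(i) = i
end do
do i = 1, n
   k = pivots(i)
   itemp = ip(i); ip(i) = ip(k); ip(k) = itemp
end do
! This row permutation of A requires no pivoting for numerical stability.
ap = a(ip(1:n), 1:n)
end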
det = det(1:2) (Output)
Array of size 2 of the same type and kind as A for representing the determinant of the input matrix.
The determinant is represented by two numbers. The first is the base with the sign or complex angle of
the result. The second is the exponent. When det(2) is within exponent range, the value of this expression is given by abs(det(1))**det(2) * (det(1))/abs(det(1)). If the matrix is not singular,
abs(det(1)) = radix(det); otherwise, det(1) = 0, and det(2) = – huge(abs(det(1))).
ainv = ainv(:,:) (Output)
Array of the same type and kind as A(1:n, 1:n). It contains the inverse matrix, A-1, when the input
matrix is nonsingular.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for passing optional data to the
routine. The options are as follows:
Packaged Options for lin_sol_gen

Option Prefix = ?     Option Name                     Option Value
s_, d_, c_, z_        lin_sol_gen_set_small           1
s_, d_, c_, z_        lin_sol_gen_save_LU             2
s_, d_, c_, z_        lin_sol_gen_solve_A             3
s_, d_, c_, z_        lin_sol_gen_solve_ADJ           4
s_, d_, c_, z_        lin_sol_gen_no_pivoting         5
s_, d_, c_, z_        lin_sol_gen_scan_for_NaN        6
s_, d_, c_, z_        lin_sol_gen_no_sing_mess        7
s_, d_, c_, z_        lin_sol_gen_A_is_sparse         8
iopt(IO) = ?_options(?_lin_sol_gen_set_small, Small)
Replaces a diagonal term of the matrix U if it is smaller in magnitude than the value Small using the
same sign or complex direction as the diagonal. The system is declared singular. A solution is approximated based on this replacement if no overflow results.
Default: the smallest number that can be reciprocated safely
iopt(IO) = ?_options(?_lin_sol_gen_save_LU, ?_dummy)
Saves the LU factorization of A. Requires the optional argument “pivots=” if the routine will be used
later for solving systems with the same matrix. This is the only case where the input arrays A and b are
not saved. For solving efficiency, the diagonal reciprocals of the matrix U are saved in the diagonal
entries of A.
iopt(IO) = ?_options(?_lin_sol_gen_solve_A, ?_dummy)
Uses the LU factorization of A computed and saved to solve Ax = b.
iopt(IO) = ?_options(?_lin_sol_gen_solve_ADJ, ?_dummy)
Uses the LU factorization of A computed and saved to solve ATx = b.
iopt(IO) = ?_options(?_lin_sol_gen_no_pivoting, ?_dummy)
Does no row pivoting. The array pivots(:), if present, is output as pivots(i) = i, for i = 1, …, n.
iopt(IO) = ?_options(?_lin_sol_gen_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNan(b(i,j)) == .true.
See the isNaN() function, Chapter 10.
Default: Does not scan for NaNs.
iopt(IO) = ?_options(?_lin_sol_gen_no_sing_mess, ?_dummy)
Do not output an error message when the matrix A is singular.
iopt(IO) = ?_options(?_lin_sol_gen_A_is_sparse, ?_dummy)
Uses an indirect updating loop for the LU factorization that is efficient for sparse matrices where all
matrix entries are stored.
FORTRAN 90 Interface
Generic:
CALL LIN_SOL_GEN (A, B, X [, …])
Specific:
The specific interface names are S_LIN_SOL_GEN, D_LIN_SOL_GEN, C_LIN_SOL_GEN,
and Z_LIN_SOL_GEN.
Description
Routine LIN_SOL_GEN solves a system of linear algebraic equations with a nonsingular coefficient matrix A.
It first computes the LU factorization of A with partial pivoting such that LU = A. The matrix U is upper triangular, while the following is true:
    L^(-1)A ≡ L_n P_n L_(n-1) P_(n-1) ⋯ L_1 P_1 A ≡ U

The factors P_i and L_i are defined by the partial pivoting. Each P_i is an interchange of row i with row j ≥ i. Thus, P_i is defined by that value of j. Every

    L_i = I + m_i e_i^T

is an elementary elimination matrix. The vector m_i is zero in entries 1, …, i. This vector is stored as column i in the strictly lower-triangular part of the working array containing the decomposition information. The reciprocals of the diagonals of the matrix U are saved in the diagonal of the working array. The solution of the linear system Ax = b is found by solving two simpler systems,

    y = L^(-1)b   and   x = U^(-1)y
More mathematical details are found in Golub and Van Loan (1989, Chapter 3).
Fatal and Terminal Error Messages
See the messages.gls file for error messages for LIN_SOL_GEN. The messages are numbered 161–175; 181–195;
201–215; 221–235.
Examples
Example 1: Solving a Linear System of Equations
This example solves a linear system of equations. This is the simplest use of lin_sol_gen. The equations
are generated using a matrix of random numbers, and a solution is obtained corresponding to a random
right-hand side matrix. Also, see operator_ex01, supplied with the product examples, for this example
using the operator notation.
use lin_sol_gen_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 1 for LIN_SOL_GEN.
integer, parameter :: n=32
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)) err
real(kind(1e0)) A(n,n), b(n,n), x(n,n), res(n,n), y(n**2)
! Generate a random matrix.
call rand_gen(y)
A = reshape(y,(/n,n/))
! Generate random right-hand sides.
call rand_gen(y)
b = reshape(y,(/n,n/))
! Compute the solution matrix of Ax=b.
call lin_sol_gen(A, b, x)
! Check the results for small residuals.
res = b - matmul(A,x)
err = maxval(abs(res))/sum(abs(A)+abs(b))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 1 for LIN_SOL_GEN is correct.'
end if
end
Output
Example 1 for LIN_SOL_GEN is correct.
Example 2: Matrix Inversion and Determinant
This example computes the inverse and determinant of A, a random matrix. Tests are made on the conditions
    AA^(-1) = I   and   det(A^(-1)) = det(A)^(-1)
Also, see operator_ex02.
use lin_sol_gen_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_SOL_GEN.
integer i
integer, parameter :: n=32
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1e0)) err
real(kind(1e0)) A(n,n), b(n,0), inv(n,n), x(n,0), res(n,n), &
y(n**2), determinant(2), inv_determinant(2)
! Generate a random matrix.
call rand_gen(y)
A = reshape(y,(/n,n/))
! Compute the matrix inverse and its determinant.
call lin_sol_gen(A, b, x, nrhs=0, &
ainv=inv, det=determinant)
! Compute the determinant for the inverse matrix.
call lin_sol_gen(inv, b, x, nrhs=0, &
det=inv_determinant)
! Check residuals, A times inverse = Identity.
res = matmul(A,inv)
do i=1, n
res(i,i) = res(i,i) - one
end do
err = sum(abs(res)) / sum(abs(a))
if (err <= sqrt(epsilon(one))) then
if (determinant(1) == inv_determinant(1) .and. &
(abs(determinant(2)+inv_determinant(2)) &
<= abs(determinant(2))*sqrt(epsilon(one)))) then
write (*,*) 'Example 2 for LIN_SOL_GEN is correct.'
end if
end if
end
Output
Example 2 for LIN_SOL_GEN is correct.
Example 3: Solving a System with Iterative Refinement
This example computes a factorization of a random matrix using single-precision arithmetic. The double-precision solution is corrected using iterative refinement. The corrections are added to the developing solution
until they are no longer decreasing in size. The initialization of the derived type array
iopti(1:2) = s_options(0,0.0e0) leaves the integer part of the second element of iopti(:) at the
value zero. This stops the internal processing of options inside lin_sol_gen. It results in the LU factorization being saved after exit. The next time the routine is entered the integer entry of the second element of
iopt(:) results in a solve step only. Since the LU factorization is saved in arrays A(:,:) and ipivots(:),
at the final step, solve only steps can occur in subsequent entries to lin_sol_gen. Also, see
operator_ex03, Chapter 10.
use lin_sol_gen_int
use rand_gen_int
implicit none
! This is Example 3 for LIN_SOL_GEN.
integer, parameter :: n=32
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1d0)), parameter :: d_zero=0.0d0
integer ipivots(n)
real(kind(1e0)) a(n,n), b(n,1), x(n,1), w(n**2)
real(kind(1e0)) change_new, change_old
real(kind(1d0)) c(n,1), d(n,n), y(n,1)
type(s_options) :: iopti(2)=s_options(0,zero)
! Generate a random matrix.
call rand_gen(w)
a = reshape(w, (/n,n/))
! Generate a random right hand side.
call rand_gen(b(1:n,1))
! Save double precision copies of the matrix and right hand side.
d = a
c = b
! Start solution at zero.
y = d_zero
change_old = huge(one)
! Use packaged option to save the factorization.
iopti(1) = s_options(s_lin_sol_gen_save_LU,zero)
iterative_refinement: do
b = c - matmul(d,y)
call lin_sol_gen(a, b, x, &
pivots=ipivots, iopt=iopti)
y = x + y
change_new = sum(abs(x))
! Exit when changes are no longer decreasing.
if (change_new >= change_old) &
exit iterative_refinement
change_old = change_new
! Use option to re-enter code with factorization saved; solve only.
iopti(2) = s_options(s_lin_sol_gen_solve_A,zero)
end do iterative_refinement
write (*,*) 'Example 3 for LIN_SOL_GEN is correct.'
end
Output
Example 3 for LIN_SOL_GEN is correct.
Example 4: Evaluating the Matrix Exponential
This example computes the solution of the ordinary differential equation problem
    dy/dt = Ay

with initial values y(0) = y_0. For this example, the matrix A is real and constant with respect to t. The unique solution is given by the matrix exponential:

    y(t) = e^(At) y_0

This method of solution uses an eigenvalue-eigenvector decomposition of the matrix

    A = X D X^(-1)

to evaluate the solution with the equivalent formula

    y(t) = X e^(Dt) z_0

where

    z_0 = X^(-1) y_0
is computed using the complex arithmetic version of lin_sol_gen. The results for y(t) are real quantities,
but the evaluation uses intermediate complex-valued calculations. Note that the computation of the complex
matrix X and the diagonal matrix D is performed using the IMSL MATH/LIBRARY FORTRAN 77 interface
to routine EVCRG. This is an illustration of intermixing interfaces of FORTRAN 77 and Fortran 90 code. The
information is made available to the Fortran 90 compiler by using the FORTRAN 77 interface for EVCRG.
Also, see operator_ex04, supplied with the product examples, where the Fortran 90 function EIG() has
replaced the call to EVCRG.
use lin_sol_gen_int
use rand_gen_int
use Numerical_Libraries
implicit none
! This is Example 4 for LIN_SOL_GEN.
integer, parameter :: n=32, k=128
real(kind(1e0)), parameter :: one=1.0e0, t_max=1, delta_t=t_max/(k-1)
real(kind(1e0)) err, A(n,n), atemp(n,n), ytemp(n**2)
real(kind(1e0)) t(k), y(n,k), y_prime(n,k)
complex(kind(1e0)) EVAL(n), EVEC(n,n)
complex(kind(1e0)) x(n,n), z_0(n,1), y_0(n,1), d(n)
integer i
! Generate a random matrix in an F90 array.
call rand_gen(ytemp)
atemp = reshape(ytemp,(/n,n/))
! Assign data to an F77 array.
A = atemp
! Use IMSL Numerical Libraries F77 subroutine for the
! eigenvalue-eigenvector calculation.
CALL EVCRG(N, A, N, EVAL, EVEC, N)
! Generate a random initial value for the ODE system.
call rand_gen(ytemp(1:n))
y_0(1:n,1) = ytemp(1:n)
! Assign the eigenvalue-eigenvector data to F90 arrays.
d = EVAL; x = EVEC
! Solve complex data system that transforms the initial values, Xz_0=y_0.
call lin_sol_gen(x, y_0, z_0)
t = (/(i*delta_t,i=0,k-1)/)
! Compute y and y' at the values t(1:k).
y = matmul(x, exp(spread(d,2,k)*spread(t,1,n))* &
spread(z_0(1:n,1),2,k))
y_prime = matmul(x, spread(d,2,k)* &
exp(spread(d,2,k)*spread(t,1,n))* &
spread(z_0(1:n,1),2,k))
! Check results. Is y' - Ay = 0?
err = sum(abs(y_prime-matmul(atemp,y))) / &
(sum(abs(atemp))*sum(abs(y)))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_SOL_GEN is correct.'
end if
end
Output
Example 4 for LIN_SOL_GEN is correct.
LIN_SOL_SELF
Solves a system of linear equations Ax = b, where A is a self-adjoint matrix. Using optional arguments, any of
several related computations can be performed. These extra tasks include computing and saving the factorization of A using symmetric pivoting, representing the determinant of A, computing the inverse matrix A-1,
or computing the solution of Ax = b given the factorization of A. An optional argument is provided indicating that A is positive definite so that the Cholesky decomposition can be used.
Required Arguments
A — Array of size n × n containing the self-adjoint matrix. (Input [/Output])
If the packaged option lin_sol_self_save_factors is used then the factorization of A is saved in
A. For solving efficiency, the diagonal reciprocals of the matrix R are saved in the diagonal entries of A
when the Cholesky method is used.
B — Array of size n × nb containing the right-hand side matrix. (Input [/Output])
If the packaged option lin_sol_self_save_factors is used then input B is used as work storage
and is not saved.
X — Array of size n × nb containing the solution matrix. (Output)
Optional Arguments
NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = size(A, 1)
NRHS = nb (Input)
Uses the array b(1:n, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.
pivots = pivots(:) (Output [/Input])
Integer array of size n + 1 that contains the individual row interchanges in the first n locations.
Applied in order, these yield the permutation matrix P. Location n + 1 contains the number of the first
diagonal term no larger than Small, which is defined on the next page of this chapter.
det = det(1:2) (Output)
Array of size 2 of the same type and kind as A for representing the determinant of the input matrix.
The determinant is represented by two numbers. The first is the base with the sign or complex angle of
the result. The second is the exponent. When det(2) is within exponent range, the value of the determinant is given by the expression abs(det(1))**det(2) * (det(1))/abs(det(1)). If the matrix is not
singular, abs(det(1)) = radix(det); otherwise, det(1) = 0, and det(2) = -huge(abs(det(1))).
ainv = ainv(:,:) (Output)
Array of the same type and kind as A(1:n, 1:n). It contains the inverse matrix, A-1 when the input
matrix is nonsingular.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for passing optional data to the
routine. The options are as follows:
Packaged Options for lin_sol_self

Option Prefix = ?     Option Name                     Option Value
s_, d_, c_, z_        lin_sol_self_set_small          1
s_, d_, c_, z_        lin_sol_self_save_factors       2
s_, d_, c_, z_        lin_sol_self_no_pivoting        3
s_, d_, c_, z_        lin_sol_self_use_Cholesky       4
s_, d_, c_, z_        lin_sol_self_solve_A            5
s_, d_, c_, z_        lin_sol_self_scan_for_NaN       6
s_, d_, c_, z_        lin_sol_self_no_sing_mess       7
iopt(IO) = ?_options(?_lin_sol_self_set_small, Small)
When Aasen’s method is used, the tridiagonal system Tu = v is solved using LU factorization with partial pivoting. If a diagonal term of the matrix U is smaller in magnitude than the value Small, it is
replaced by Small. The system is declared singular. When the Cholesky method is used, the upper-triangular matrix R, (see Description), is obtained. If a diagonal term of the matrix R is smaller in
magnitude than the value Small, it is replaced by Small. A solution is approximated based on this
replacement in either case.
Default: the smallest number that can be reciprocated safely
iopt(IO) = ?_options(?_lin_sol_self_save_factors, ?_dummy)
Saves the factorization of A. Requires the optional argument “pivots=” if the routine will be used for
solving further systems with the same matrix. This is the only case where the input arrays A and b are
not saved. For solving efficiency, the diagonal reciprocals of the matrix R are saved in the diagonal
entries of A when the Cholesky method is used.
iopt(IO) = ?_options(?_lin_sol_self_no_pivoting, ?_dummy)
Does no row pivoting. The array pivots(:), if present, satisfies pivots(i) = i + 1 for i = 1, …, n – 1
when using Aasen’s method. When using the Cholesky method, pivots(i) = i for i = 1, …, n.
iopt(IO) = ?_options(?_lin_sol_self_use_Cholesky, ?_dummy)
The Cholesky decomposition PAPT = RTR is used instead of the Aasen method.
iopt(IO) = ?_options(?_lin_sol_self_solve_A, ?_dummy)
Uses the factorization of A computed and saved to solve Ax = b.
iopt(IO) = ?_options(?_lin_sol_self_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNan(b(i,j)) == .true.
See the isNaN() function, Chapter 10.
Default: Does not scan for NaNs
iopt(IO) = ?_options(?_lin_sol_self_no_sing_mess, ?_dummy)
Do not print an error message when the matrix A is singular.
FORTRAN 90 Interface
Generic:
CALL LIN_SOL_SELF (A, B, X [, …])
Specific:
The specific interface names are S_LIN_SOL_SELF, D_LIN_SOL_SELF,
C_LIN_SOL_SELF, and Z_LIN_SOL_SELF.
Description
Routine LIN_SOL_SELF solves a system of linear algebraic equations with a nonsingular coefficient matrix A. By default, the routine computes the factorization of A using Aasen's method. This decomposition has the form

    PAP^T = LTL^T

where P is a permutation matrix, L is a unit lower-triangular matrix, and T is a tridiagonal self-adjoint matrix. The solution of the linear system Ax = b is found by solving simpler systems,

    u = L^(-1)Pb
    Tv = u

and

    x = P^T L^(-T) v

More mathematical details for real matrices are found in Golub and Van Loan (1989, Chapter 4).
When the optional Cholesky algorithm is used with a positive definite, self-adjoint matrix, the factorization has the alternate form

    PAP^T = R^T R

where P is a permutation matrix and R is an upper-triangular matrix. The solution of the linear system Ax = b is computed by solving the systems

    u = R^(-T)Pb

and

    x = P^T R^(-1) u
The permutation is chosen so that the diagonal term is maximized at each step of the decomposition. The
individual interchanges are optionally available in the argument “pivots”.
Fatal and Terminal Error Messages
See the messages.gls file for error messages for LIN_SOL_SELF. These error messages are numbered 321–
336; 341–356; 361–376; 381–396.
Examples
Example 1: Solving a Linear Least-squares System
This example solves a linear least-squares system Cx ≅ d, where C is a real m × n matrix with m ≥ n. The least-squares solution is computed using the self-adjoint matrix

    A = C^T C

and the right-hand side

    b = C^T d
The n × n self-adjoint system Ax = b is solved for x. This solution method is not as satisfactory, in terms of
numerical accuracy, as solving the system Cx ≅ d directly by using the routine lin_sol_lsq. Also, see
operator_ex05, Chapter 10.
use lin_sol_self_int
use rand_gen_int
implicit none
! This is Example 1 for LIN_SOL_SELF.
integer, parameter :: m=64, n=32
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)) err
real(kind(1e0)), dimension(n,n) :: A, b, x, res, y(m*n),&
C(m,n), d(m,n)
! Generate two rectangular random matrices.
call rand_gen(y)
C = reshape(y,(/m,n/))
call rand_gen(y)
d = reshape(y,(/m,n/))
! Form the normal equations for the rectangular system.
A = matmul(transpose(C),C)
b = matmul(transpose(C),d)
! Compute the solution for Ax = b.
call lin_sol_self(A, b, x)
! Check the results for small residuals.
res = b - matmul(A,x)
err = maxval(abs(res))/sum(abs(A)+abs(b))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 1 for LIN_SOL_SELF is correct.'
end if
end
Output
Example 1 for LIN_SOL_SELF is correct.
Example 2: System Solving with Cholesky Method
This example solves the same form of the system as Example 1. The optional argument “iopt=” is used to
note that the Cholesky algorithm is used since the matrix A is positive definite and self-adjoint. In addition,
the sample covariance matrix

    Γ = σ² A^(-1)

is computed, where

    σ² = ∥d − Cx∥² / (m − n)

and the inverse matrix is returned as the “ainv=” optional argument. The scale factor σ² and Γ are computed after returning from the routine. Also, see operator_ex06, Chapter 10.
use lin_sol_self_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 2 for LIN_SOL_SELF.
integer, parameter :: m=64, n=32
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1e0)) err
real(kind(1e0)) a(n,n), b(n,1), c(m,n), d(m,1), cov(n,n), x(n,1), &
res(n,1), y(m*n)
type(s_options) :: iopti(1)=s_options(0,zero)
! Generate a random rectangular matrix and a random right hand side.
call rand_gen(y)
c = reshape(y,(/m,n/))
call rand_gen(d(1:n,1))
! Form the normal equations for the rectangular system.
a = matmul(transpose(c),c)
b = matmul(transpose(c),d)
! Use packaged option to use Cholesky decomposition.
iopti(1) = s_options(s_lin_sol_self_Use_Cholesky,zero)
! Compute the solution of Ax=b with optional inverse obtained.
call lin_sol_self(a, b, x, ainv=cov, &
iopt=iopti)
! Compute residuals, x - (inverse)*b, for consistency check.
res = x - matmul(cov,b)
! Scale the inverse to obtain the covariance matrix.
cov = (sum((d-matmul(c,x))**2)/(m-n)) * cov
! Check the results.
err = sum(abs(res))/sum(abs(cov))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 2 for LIN_SOL_SELF is correct.'
end if
end
Output
Example 2 for LIN_SOL_SELF is correct.
Example 3: Using Inverse Iteration for an Eigenvector
This example illustrates the use of the optional argument “iopt=” to reset the value of a Small diagonal term
encountered during the factorization. Eigenvalues of the self-adjoint matrix
$
&7 &
are computed using the routine lin_eig_self. An eigenvector, corresponding to one of these eigenvalues,
λ, is computed using inverse iteration. This solves the near singular system (A – λI)x = b for an eigenvector, x.
Following the computation of a normalized eigenvector
y = x/‖x‖
the consistency condition
λ = yᵀAy
is checked. Since a singular system is expected, suppress the fatal error message that normally prints when
the error post-processor routine error_post is called within the routine lin_sol_self. Also, see
operator_ex07, Chapter 10.
use lin_sol_self_int
use lin_eig_self_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 3 for LIN_SOL_SELF.
integer i, tries
integer, parameter :: m=8, n=4, k=2
integer ipivots(n+1)
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) err
real(kind(1d0)) a(n,n), b(n,1), c(m,n), x(n,1), y(m*n), &
e(n), atemp(n,n)
type(d_options) :: iopti(4)
! Generate a random rectangular matrix.
call rand_gen(y)
c = reshape(y,(/m,n/))
! Generate a random right hand side for use in the inverse
! iteration.
call rand_gen(y(1:n))
b = reshape(y,(/n,1/))
! Compute the positive definite matrix.
a = matmul(transpose(c),c)
! Obtain just the eigenvalues.
call lin_eig_self(a, e)
! Use packaged option to reset the value of a small diagonal.
iopti = d_options(0,zero)
iopti(1) = d_options(d_lin_sol_self_set_small,&
epsilon(one) * abs(e(1)))
! Use packaged option to save the factorization.
iopti(2) = d_options(d_lin_sol_self_save_factors,zero)
! Suppress error messages and stopping due to singularity
! of the matrix, which is expected.
iopti(3) = d_options(d_lin_sol_self_no_sing_mess,zero)
atemp = a
! Compute A-eigenvalue*I as the coefficient matrix.
do i=1, n
a(i,i) = a(i,i) - e(k)
end do
do tries=1, 2
call lin_sol_self(a, b, x, &
pivots=ipivots, iopt=iopti)
! When code is re-entered, the already computed factorization
! is used.
iopti(4) = d_options(d_lin_sol_self_solve_A,zero)
! Reset right-hand side nearly in the direction of the eigenvector.
b = x/sqrt(sum(x**2))
end do
! Normalize the eigenvector.
x = x/sqrt(sum(x**2))
! Check the results.
err = dot_product(x(1:n,1),matmul(atemp(1:n,1:n),x(1:n,1))) - &
e(k)
! If any result is not accurate, quit with no summary printing.
if (abs(err) <= sqrt(epsilon(one))*e(1)) then
write (*,*) 'Example 3 for LIN_SOL_SELF is correct.'
end if
end
Output
Example 3 for LIN_SOL_SELF is correct.
Example 4: Accurate Least-squares Solution with Iterative Refinement
This example illustrates the accurate solution of the self-adjoint linear system
[ I   A ] [ r ]   [ b ]
[ Aᵀ  0 ] [ x ] = [ 0 ]
computed using iterative refinement. This solution method is appropriate for least-squares problems when
an accurate solution is required. The solution and residuals are accumulated in double precision, while the
decomposition is computed in single precision. Also, see operator_ex08, supplied with the product
examples.
use lin_sol_self_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_SOL_SELF.
integer i
integer, parameter :: m=8, n=4
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1d0)), parameter :: d_zero=0.0d0
integer ipivots((n+m)+1)
real(kind(1e0)) a(m,n), b(m,1), w(m*n), f(n+m,n+m), &
g(n+m,1), h(n+m,1)
real(kind(1e0)) change_new, change_old
real(kind(1d0)) c(m,1), d(m,n), y(n+m,1)
type(s_options) :: iopti(2)=s_options(0,zero)
! Generate a random matrix.
call rand_gen(w)
a = reshape(w, (/m,n/))
! Generate a random right hand side.
call rand_gen(b(1:m,1))
! Save double precision copies of the matrix and right hand side.
d = a
c = b
! Fill in augmented system for accurately solving the least-squares
! problem.
f = zero
do i=1, m
f(i,i) = one
end do
f(1:m,m+1:) = a
f(m+1:,1:m) = transpose(a)
! Start solution at zero.
y = d_zero
change_old = huge(one)
! Use packaged option to save the factorization.
iopti(1) = s_options(s_lin_sol_self_save_factors,zero)
iterative_refinement: do
g(1:m,1) = c(1:m,1) - y(1:m,1) - matmul(d,y(m+1:m+n,1))
g(m+1:m+n,1) = - matmul(transpose(d),y(1:m,1))
call lin_sol_self(f, g, h, &
pivots=ipivots, iopt=iopti)
y = h + y
change_new = sum(abs(h))
! Exit when changes are no longer decreasing.
if (change_new >= change_old) &
exit iterative_refinement
change_old = change_new
! Use option to re-enter code with factorization saved; solve only.
iopti(2) = s_options(s_lin_sol_self_solve_A,zero)
end do iterative_refinement
write (*,*) 'Example 4 for LIN_SOL_SELF is correct.'
end
Output
Example 4 for LIN_SOL_SELF is correct.
LIN_SOL_LSQ
Solves a rectangular system of linear equations Ax ≅ b, in a least-squares sense. Using optional arguments,
any of several related computations can be performed. These extra tasks include computing and saving the
factorization of A using column and row pivoting, representing the determinant of A, computing the generalized inverse matrix A†, or computing the least-squares solution of
Ax ≅ b
or
ATy ≅ b,
given the factorization of A. An optional argument is provided for computing the following unscaled covariance matrix
C = (AᵀA)⁻¹
Least-squares solutions, where the unknowns are non-negative or have simple bounds, can be computed
with PARALLEL_NONNEGATIVE_LSQ and PARALLEL_BOUNDED_LSQ. These codes can be restricted to execute without MPI.
Required Arguments
A — Array of size m × n containing the matrix. (Input [/Output])
If the packaged option lin_sol_lsq_save_QR is used then the factorization of A is saved in A. For
efficiency, the diagonal reciprocals of the matrix R are saved in the diagonal entries of A.
B — Array of size m × nb containing the right-hand side matrix. When using the option to solve adjoint
systems ATx ≅ b, the size of b is n × nb. (Input [/Output])
If the packaged option lin_sol_lsq_save_QR is used then input B is used as work storage and is
not saved.
X — Array of size n × nb containing the solution matrix. When using the option to solve adjoint
systems ATx ≅ b, the size of x is m × nb. (Output)
Optional Arguments
MROWS = m (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: m = size(A, 1)
NCOLS = n (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: n = size(A, 2)
NRHS = nb (Input)
Uses the array b(1:, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.
pivots = pivots(:) (Output [/Input])
Integer array of size 2 * min(m, n) + 1 that contains the individual row followed by the column interchanges. The last array entry contains the approximate rank of A.
trans = trans(:) (Output [/Input])
Array of size 2 * min(m, n) that contains data for the construction of the orthogonal decomposition.
det = det(1:2) (Output)
Array of size 2 of the same type and kind as A for representing the products of the determinants of the
matrices Q, P, and R. The determinant is represented by two numbers. The first is the base with the
sign or complex angle of the result. The second is the exponent. When det(2) is within exponent
range, the value of this expression is given by abs (det(1))**det(2) * (det(1))/abs(det(1)). If the
matrix is not singular, abs(det(1)) = radix(det); otherwise, det(1) = 0, and det(2) = huge(abs(det(1))).
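For illustration only, the two-number representation can be converted to an ordinary value with the expression above whenever det(2) is small enough that the power does not overflow; the values assigned below are hypothetical.
implicit none
real(kind(1e0)) det(2), det_value
! Assume a routine returned det(1) = -2.0 (base with sign) and det(2) = 3.0 (exponent).
det(1) = -2.0e0
det(2) = 3.0e0
det_value = abs(det(1))**det(2) * (det(1)/abs(det(1)))
! det_value is now -8.0.
write (*,*) det_value
end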
ainv = ainv(:,:) (Output)
Array with size n × m of the same type and kind as A(1:m, 1:n). It contains the generalized inverse
matrix, A†.
cov = cov(:,:) (Output)
Array with size n × n of the same type and kind as A(1:m, 1:n). It contains the unscaled covariance
matrix, C = (ATA)-1.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for passing optional data to the
routine. The options are as follows:
Packaged Options for lin_sol_lsq
Option Prefix = ?    Option Name                    Option Value
s_, d_, c_, z_       lin_sol_lsq_set_small          1
s_, d_, c_, z_       lin_sol_lsq_save_QR            2
s_, d_, c_, z_       lin_sol_lsq_solve_A            3
s_, d_, c_, z_       lin_sol_lsq_solve_ADJ          4
s_, d_, c_, z_       lin_sol_lsq_no_row_pivoting    5
s_, d_, c_, z_       lin_sol_lsq_no_col_pivoting    6
s_, d_, c_, z_       lin_sol_lsq_scan_for_NaN       7
s_, d_, c_, z_       lin_sol_lsq_no_sing_mess       8
iopt(IO) = ?_options(?_lin_sol_lsq_set_small, Small)
Replaces a diagonal term of the matrix R with the value Small if it is smaller in magnitude than Small. A
solution is approximated based on this replacement in either case.
Default: the smallest number that can be reciprocated safely
iopt(IO) = ?_options(?_lin_sol_lsq_save_QR, ?_dummy)
Saves the factorization of A. Requires the optional arguments “pivots=” and “trans=” if the routine
is used for solving further systems with the same matrix. This is the only case where the input arrays A
and b are not saved. For efficiency, the diagonal reciprocals of the matrix R are saved in the diagonal
entries of A.
iopt(IO) = ?_options(?_lin_sol_lsq_solve_A, ?_dummy)
Uses the factorization of A computed and saved to solve Ax = b.
iopt(IO) = ?_options(?_lin_sol_lsq_solve_ADJ, ?_dummy)
Uses the factorization of A computed and saved to solve ATx = b.
iopt(IO) = ?_options(?_lin_sol_lsq_no_row_pivoting, ?_dummy)
Does no row pivoting. The array pivots(:), if present, satisfies pivots(i) = i for i = 1, …, min (m, n).
iopt(IO) = ?_options(?_lin_sol_lsq_no_col_pivoting, ?_dummy)
Does no column pivoting. The array pivots(:), if present, satisfies pivots(i + min (m, n)) = i for i = 1,
…, min (m, n).
iopt(IO) = ?_options(?_lin_sol_lsq_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNaN(b(i,j)) == .true.
See the isNaN() function, Chapter 10.
Default: Does not scan for NaNs
iopt(IO) = ?_options(?_lin_sol_lsq_no_sing_mess, ?_dummy)
Do not print an error message when A is singular or k < min(m, n).
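The fragment below is an illustrative sketch (assumed usage) of the save-and-reuse pattern provided by the save_QR, solve_A, and solve_ADJ options: the factorization of A is computed once and then reused to solve the adjoint system. The dimensions and data are arbitrary; the “pivots=” and “trans=” arguments are supplied because the factorization is saved.
use lin_sol_lsq_int
use rand_gen_int
implicit none
integer, parameter :: m=6, n=3
real(kind(1e0)), parameter :: zero=0e0
real(kind(1e0)) a(m,n), b(m,1), c(n,1), x(n,1), yadj(m,1), w(m*n)
integer ipivots(2*min(m,n)+1)
real(kind(1e0)) qr_trans(2*min(m,n))
type(s_options) :: iopti(1)
! Generate random data for A, b, and the adjoint right-hand side c.
call rand_gen(w)
a = reshape(w,(/m,n/))
call rand_gen(b(1:m,1))
call rand_gen(c(1:n,1))
! Factor A and solve Ax = b, saving the factorization in a.
iopti(1) = s_options(s_lin_sol_lsq_save_QR,zero)
call lin_sol_lsq(a, b, x, pivots=ipivots, trans=qr_trans, iopt=iopti)
! Reuse the saved factorization to solve the adjoint system (A^T)y = c.
iopti(1) = s_options(s_lin_sol_lsq_solve_ADJ,zero)
call lin_sol_lsq(a, c, yadj, pivots=ipivots, trans=qr_trans, iopt=iopti)
end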
FORTRAN 90 Interface
Generic:
CALL LIN_SOL_LSQ (A, B, X [, …])
Specific:
The specific interface names are S_LIN_SOL_LSQ, D_LIN_SOL_LSQ, C_LIN_SOL_LSQ,
and Z_LIN_SOL_LSQ.
Description
Routine LIN_SOL_LSQ solves a rectangular system of linear algebraic equations in a least-squares sense. It
computes the decomposition of A using an orthogonal factorization. This decomposition has the form
QAP = [ Rk×k  0 ]
      [  0    0 ]
where the matrices Q and P are products of elementary orthogonal and permutation matrices. The matrix R
is k × k, where k is the approximate rank of A. This value is determined by the value of the parameter Small.
See Golub and Van Loan (1989, Chapter 5.4) for further details. Note that the use of both row and column
pivoting is nonstandard, but the routine defaults to this choice for enhanced reliability.
Fatal and Terminal Error Messages
See the messages.gls file for error messages for LIN_SOL_LSQ. These error messages are numbered 241–256;
261–276; 281–296; 301–316.
Examples
Example 1: Solving a Linear Least-squares System
This example solves a linear least-squares system Cx ≅ d, where
C is a real m × n matrix with m > n. The least-squares problem is derived from polynomial data fitting to the function
y(x) = exp(x) + cos(πx/2)
using a discrete set of values in the interval −1 ≤ x ≤ 1. The polynomial is represented as the series
u(x) = Σ(i=0,…,N) ciTi(x)
where the Ti(x) are Chebyshev polynomials. It is natural for the problem matrix and solution to have a column or entry corresponding to the subscript zero, which is used in this code. Also, see operator_ex09,
supplied with the product examples.
use lin_sol_lsq_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 1 for LIN_SOL_LSQ.
integer i
integer, parameter :: m=128, n=8
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) A(m,0:n), c(0:n,1), pi_over_2, x(m), y(m,1), &
u(m), v(m), w(m), delta_x
! Generate a random grid of points.
call rand_gen(x)
! Transform points to the interval -1,1.
x = x*2 - one
! Compute the constant 'PI/2'.
pi_over_2 = atan(one)*2
! Generate known function data on the grid.
y(1:m,1) = exp(x) + cos(pi_over_2*x)
! Fill in the least-squares matrix for the Chebyshev polynomials.
A(:,0) = one; A(:,1) = x
do i=2, n
A(:,i) = 2*x*A(:,i-1) - A(:,i-2)
end do
! Solve for the series coefficients.
call lin_sol_lsq(A, y, c)
! Generate an equally spaced grid on the interval.
delta_x = 2/real(m-1,kind(one))
do i=1, m
x(i) = -one + (i-1)*delta_x
end do
! Evaluate residuals using backward recurrence formulas.
u = zero
v = zero
do i=n, 0, -1
w = 2*x*u - v + c(i,1)
v = u
u = w
end do
y(1:m,1) = exp(x) + cos(pi_over_2*x) - (u-x*v)
! Check that n+1 sign changes in the residual curve occur.
x = one
x = sign(x,y(1:m,1))
if (count(x(1:m-1) /= x(2:m)) >= n+1) then
write (*,*) 'Example 1 for LIN_SOL_LSQ is correct.'
end if
end
Output
Example 1 for LIN_SOL_LSQ is correct.
Example 2: System Solving with the Generalized Inverse
This example solves the same form of the system as Example 1. In this case, the grid of evaluation points is
equally spaced. The coefficients are computed using the “smoothing formulas” by rows of the generalized
inverse matrix, A†, computed using the optional argument “ainv=”. Thus, the coefficients are given by the
matrix-vector product c = (A†) y, where y is the vector of values of the function y(x) evaluated at the grid of
points. Also, see operator_ex10, supplied with the product examples.
use lin_sol_lsq_int
implicit none
! This is Example 2 for LIN_SOL_LSQ.
integer i
integer, parameter :: m=128, n=8
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) a(m,0:n), c(0:n,1), pi_over_2, x(m), y(m,1), &
u(m), v(m), w(m), delta_x, inv(0:n, m)
! Generate an array of equally spaced points on the interval -1,1.
delta_x = 2/real(m-1,kind(one))
do i=1, m
x(i) = -one + (i-1)*delta_x
end do
! Compute the constant 'PI/2'.
pi_over_2 = atan(one)*2
! Compute data values on the grid.
y(1:m,1) = exp(x) + cos(pi_over_2*x)
! Fill in the least-squares matrix for the Chebyshev polynomials.
a(:,0) = one
a(:,1) = x
do i=2, n
a(:,i) = 2*x*a(:,i-1) - a(:,i-2)
end do
! Compute the generalized inverse of the least-squares matrix.
call lin_sol_lsq(a, y, c, nrhs=0, ainv=inv)
! Compute the series coefficients using the generalized inverse
! as 'smoothing formulas.'
c(0:n,1) = matmul(inv(0:n,1:m),y(1:m,1))
! Evaluate residuals using backward recurrence formulas.
u = zero
v = zero
do i=n, 0, -1
w = 2*x*u - v + c(i,1)
v = u
u = w
end do
y(1:m,1) = exp(x) + cos(pi_over_2*x) - (u-x*v)
! Check that n+2 sign changes in the residual curve occur.
! (This test will fail when n is larger.)
x = one
x = sign(x,y(1:m,1))
if (count(x(1:m-1) /= x(2:m)) == n+2) then
write (*,*) 'Example 2 for LIN_SOL_LSQ is correct.'
end if
end
Output
Example 2 for LIN_SOL_LSQ is correct.
Example 3: Two-Dimensional Data Fitting
This example illustrates the use of radial-basis functions to least-squares fit arbitrarily spaced data points. Let
m data values {yi} be given at points in the unit square, {pi}. Each pi is a pair of real values. Then, n points {qj}
are chosen on the unit square. A series of radial-basis functions is used to represent the data,
f(p) = Σ(j=1,…,n) cj(‖p − qj‖² + δ²)^(1/2)
where δ² is a parameter. This example uses δ² = 1, but either larger or smaller values can give a better
approximation for user problems. The coefficients {cj} are obtained by solving the following m × n linear
least-squares problem:
f(pi) ≅ yi, i = 1, …, m
This example illustrates an effective use of Fortran 90 array operations to eliminate many details required to
build the matrix and right-hand side for the {cj}. For this example, the two sets of points {pi} and {qj} are chosen randomly. The values {yi} are computed from the following formula:
yi = exp(−‖pi‖²)
The residual function
r(p) = exp(−‖p‖²) − f(p)
is computed at an N × N square grid of equally spaced points on the unit square. The magnitude of r(p) may
be larger at certain points on this grid than the residuals at the given points, {pi}. Also, see operator_ex11,
supplied with the product examples.
use lin_sol_lsq_int
use rand_gen_int
implicit none
! This is Example 3 for LIN_SOL_LSQ.
integer i, j
integer, parameter :: m=128, n=32, k=2, n_eval=16
real(kind(1d0)), parameter :: one=1.0d0, delta_sqr=1.0d0
real(kind(1d0)) a(m,n), b(m,1), c(n,1), p(k,m), q(k,n), &
x(k*m), y(k*n), t(k,m,n), res(n_eval,n_eval), &
w(n_eval), delta
! Generate a random set of data points in k=2 space.
call rand_gen(x)
p = reshape(x,(/k,m/))
! Generate a random set of center points in k-space.
call rand_gen(y)
q = reshape(y,(/k,n/))
! Compute the coefficient matrix for the least-squares system.
t = spread(p,3,n)
do j=1, n
t(1:,:,j) = t(1:,:,j) - spread(q(1:,j),2,m)
end do
a = sqrt(sum(t**2,dim=1) + delta_sqr)
! Compute the right hand side of data values.
b(1:,1) = exp(-sum(p**2,dim=1))
! Compute the solution.
call lin_sol_lsq(a, b, c)
! Check the results.
if (sum(abs(matmul(transpose(a),b-matmul(a,c))))/sum(abs(a)) &
<= sqrt(epsilon(one))) then
write (*,*) 'Example 3 for LIN_SOL_LSQ is correct.'
end if
! Evaluate residuals, known function - approximation at a square
! grid of points. (This evaluation is only for k=2.)
delta = one/real(n_eval-1,kind(one))
do i=1, n_eval
w(i) = (i-1)*delta
end do
res = exp(-(spread(w,1,n_eval)**2 + spread(w,2,n_eval)**2))
do j=1, n
res = res - c(j,1)*sqrt((spread(w,1,n_eval) - q(1,j))**2 + &
(spread(w,2,n_eval) - q(2,j))**2 + delta_sqr)
end do
end
Output
Example 3 for LIN_SOL_LSQ is correct.
Example 4: Least-squares with an Equality Constraint
This example solves a least-squares system Ax ≅ b with the constraint that the solution values have a sum
equal to the value 1. To solve this system, one heavily weighted row vector and right-hand side component is
added to the system corresponding to this constraint. Note that the weight used is
ε^(-1/2)
where ε is the machine precision, but any larger value can be used.
row pivoting in this case is critical for obtaining an accurate solution to the constrained problem solved using
weighting. See Golub and Van Loan (1989, Chapter 12) for more information about this method. Also, see
operator_ex12, supplied with the product examples.
use lin_sol_lsq_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_SOL_LSQ.
integer, parameter :: m=64, n=32
real(kind(1e0)), parameter :: one=1.0e0
real(kind(1e0)) :: a(m+1,n), b(m+1,1), x(n,1), y(m*n)
! Generate a random matrix.
call rand_gen(y)
a(1:m,1:n) = reshape(y,(/m,n/))
! Generate a random right hand side.
call rand_gen(b(1:m,1))
! Heavily weight desired constraint.
! All variables sum to one.
a(m+1,1:n) = one/sqrt(epsilon(one))
b(m+1,1) = one/sqrt(epsilon(one))
call lin_sol_lsq(a, b, x)
if (abs(sum(x) - one)/sum(abs(x)) <= &
sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_SOL_LSQ is correct.'
end if
end
Output
Example 4 for LIN_SOL_LSQ is correct.
LIN_SOL_SVD
Solves a rectangular least-squares system of linear equations Ax ≅ b using singular value decomposition
A = USVᵀ
With optional arguments, any of several related computations can be performed. These extra tasks include
computing the rank of A, the orthogonal m × m and n × n matrices U and V, and the m × n diagonal matrix
of singular values, S.
Required Arguments
A — Array of size m × n containing the matrix. (Input [/Output])
If the packaged option lin_sol_svd_overwrite_input is used, this array is not saved on output.
B — Array of size m × nb containing the right-hand side matrix. (Input [/Output])
If the packaged option lin_sol_svd_overwrite_input is used, this array is not saved on output.
X — Array of size n × nb containing the solution matrix. (Output)
Optional Arguments
MROWS = m (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: m = size (A, 1)
NCOLS = n (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: n = size(A, 2)
NRHS = nb (Input)
Uses the array b(1:, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.
RANK = k (Output)
Number of singular values that are at least as large as the value Small. It will satisfy k <= min(m, n).
u = u(:,:) (Output)
Array of the same type and kind as A(1:m, 1:n). It contains the m × m orthogonal matrix U of the singular value decomposition.
s = s(:) (Output)
Array of the same precision as A(1:m, 1:n). This array is real even when the matrix data is complex. It
contains the m × n diagonal matrix S in a rank-1 array. The singular values are nonnegative and
ordered non-increasing.
v = v(:,:) (Output)
Array of the same type and kind as A(1:m, 1:n). It contains the n × n orthogonal matrix V.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing optional data to the
routine. The options are as follows:
Packaged Options for lin_sol_svd
Option Prefix = ?    Option Name                    Option Value
s_, d_, c_, z_       lin_sol_svd_set_small          1
s_, d_, c_, z_       lin_sol_svd_overwrite_input    2
s_, d_, c_, z_       lin_sol_svd_safe_reciprocal    3
s_, d_, c_, z_       lin_sol_svd_scan_for_NaN       4
iopt(IO) = ?_options(?_lin_sol_svd_set_small, Small)
Replaces with zero a diagonal term of the matrix S if it is smaller in magnitude than the value Small.
This determines the approximate rank of the matrix, which is returned as the “rank=” optional argument. A solution is approximated based on this replacement.
Default: the smallest number that can be safely reciprocated
iopt(IO) = ?_options(?_lin_sol_svd_overwrite_input, ?_dummy)
Does not save the input arrays A(:,:) and b(:,:).
iopt(IO) = ?_options(?_lin_sol_svd_safe_reciprocal, safe)
Replaces a denominator term with safe if it is smaller in magnitude than the value safe.
Default: the smallest number that can be safely reciprocated
iopt(IO) = ?_options(?_lin_sol_svd_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNaN(b(i,j)) == .true.
See the isNaN() function, Chapter 10.
Default: Does not scan for NaNs
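The fragment below is an illustrative sketch (assumed usage) of combining the set_small option with the “rank=” argument; the threshold 1.0d-8 and the dimensions are arbitrary.
use lin_sol_svd_int
use rand_gen_int
implicit none
integer, parameter :: m=8, n=4
real(kind(1d0)) a(m,n), b(m,1), x(n,1), y(m*n)
integer k
type(d_options) :: iopti(1)
call rand_gen(y)
a = reshape(y,(/m,n/))
call rand_gen(b(1:m,1))
! Singular values smaller than 1.0d-8 are treated as zero when the rank is determined.
iopti(1) = d_options(d_lin_sol_svd_set_small,1.0d-8)
call lin_sol_svd(a, b, x, rank=k, iopt=iopti)
write (*,*) 'Approximate rank of A:', k
end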
FORTRAN 90 Interface
Generic:
CALL LIN_SOL_SVD (A, B, X [, …])
Specific:
The specific interface names are S_LIN_SOL_SVD, D_LIN_SOL_SVD, C_LIN_SOL_SVD,
and Z_LIN_SOL_SVD.
Description
Routine LIN_SOL_SVD solves a rectangular system of linear algebraic equations in a least-squares sense. It
computes the factorization of A known as the singular value decomposition. This decomposition has the following form:
A = USVT
The matrices U and V are orthogonal. The matrix S is diagonal with the diagonal terms non-increasing. See
Golub and Van Loan (1989, Chapters 5.4 and 5.5) for further details.
Fatal, Terminal, and Warning Error Messages
See the messages.gls file for error messages for LIN_SOL_SVD. These error messages are numbered 401–412;
421–432; 441–452; 461–472.
Examples
Example 1: Least-squares Solution of a Rectangular System
The least-squares solution of a rectangular m × n system Ax ≅ b is obtained. The use of lin_sol_lsq is
more efficient in this case since the matrix is of full rank. This example anticipates a problem where the
matrix A is poorly conditioned or not of full rank; thus, lin_sol_svd is the appropriate routine. Also, see
operator_ex13, in Chapter 10.
use lin_sol_svd_int
use rand_gen_int
implicit none
! This is Example 1 for LIN_SOL_SVD.
integer, parameter :: m=128, n=32
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)) A(m,n), b(m,1), x(n,1), y(m*n), err
! Generate a random matrix and right-hand side.
call rand_gen(y)
A = reshape(y,(/m,n/))
call rand_gen(b(1:m,1))
! Compute the least-squares solution matrix of Ax=b.
call lin_sol_svd(A, b, x)
! Check that the residuals are orthogonal to the
! column vectors of A.
err = sum(abs(matmul(transpose(A),b-matmul(A,x))))/sum(abs(A))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 1 for LIN_SOL_SVD is correct.'
end if
end
Output
Example 1 for LIN_SOL_SVD is correct.
Example 2: Polar Decomposition of a Square Matrix
A polar decomposition of an n × n random matrix is obtained. This decomposition satisfies A = PQ, where P
is orthogonal and Q is self-adjoint and positive definite.
Given the singular value decomposition
A = USVᵀ
the polar decomposition follows from the matrix products
P = UVᵀ and Q = VSVᵀ
This example uses the optional arguments “u=”, “s=”, and “v=”, then array intrinsic functions to calculate P
and Q. Also, see operator_ex14, in Chapter 10.
use lin_sol_svd_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_SOL_SVD.
integer i
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) a(n,n), b(n,0), ident(n,n), p(n,n), q(n,n), &
s_d(n), u_d(n,n), v_d(n,n), x(n,0), y(n*n)
! Generate a random matrix.
call rand_gen(y)
a = reshape(y,(/n,n/))
! Compute the singular value decomposition.
call lin_sol_svd(a, b, x, nrhs=0, s=s_d, &
u=u_d, v=v_d)
! Compute the (left) orthogonal factor.
p = matmul(u_d,transpose(v_d))
! Compute the (right) self-adjoint factor.
q = matmul(v_d*spread(s_d,1,n),transpose(v_d))
ident=zero
do i=1, n
ident(i,i) = one
end do
! Check the results.
if (sum(abs(matmul(p,transpose(p)) - ident))/sum(abs(p)) &
<= sqrt(epsilon(one))) then
if (sum(abs(a - matmul(p,q)))/sum(abs(a)) &
<= sqrt(epsilon(one))) then
write (*,*) 'Example 2 for LIN_SOL_SVD is correct.'
end if
end if
end
Output
Example 2 for LIN_SOL_SVD is correct.
Example 3: Reduction of an Array of Black and White
An n × n array A contains entries that are either 0 or 1. The entries are chosen so that, as a two-dimensional
object with origin at the point (1, 1), the array appears as a black circle of radius n/4 centered at the point
(n/2, n/2).
A singular value decomposition
A = USVᵀ
is computed, where S is of low rank. Approximations using fewer of these nonzero singular values and vectors suffice to reconstruct A. Also, see operator_ex15, supplied with the product examples.
use lin_sol_svd_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 3 for LIN_SOL_SVD.
integer i, j, k
integer, parameter :: n=32
real(kind(1e0)), parameter :: half=0.5e0, one=1e0, zero=0e0
real(kind(1e0)) a(n,n), b(n,0), x(n,0), s(n), u(n,n), &
v(n,n), c(n,n)
! Fill in value one for points inside the circle.
a = zero
do i=1, n
do j=1, n
if ((i-n/2)**2 + (j-n/2)**2 <= (n/4)**2) a(i,j) = one
end do
end do
! Compute the singular value decomposition.
call lin_sol_svd(a, b, x, nrhs=0,&
s=s, u=u, v=v)
! How many terms, to the nearest integer, exactly
! match the circle?
c = zero; k = count(s > half)
do i=1, k
c = c + spread(u(1:n,i),2,n)*spread(v(1:n,i),1,n)*s(i)
if (count(int(c-a) /= 0) == 0) exit
end do
if (i < k) then
write (*,*) 'Example 3 for LIN_SOL_SVD is correct.'
end if
end
Output
Example 3 for LIN_SOL_SVD is correct.
Example 4: Laplace Transform Solution
This example illustrates the solution of a linear least-squares system where the matrix is poorly conditioned.
The problem comes from solving the integral equation:
∫[0,1] exp(−st) f(t) dt = s⁻¹(1 − exp(−s)) ≡ g(s)
The unknown function f(t) = 1 is computed. This problem is equivalent to the numerical inversion of the
Laplace Transform of the function g(s) using real values of t and s, solving for a function that is nonzero only
on the unit interval. The evaluation of the integral uses the following approximate integration rule:
∫[0,1] f(t) exp(−st) dt ≈ Σ(j=1,…,n) f(tj) ∫[tj, tj+1] exp(−st) dt
The points {tj} are chosen equally spaced by using the following:
tj = (j − 1)/n
The points {si} are computed so that the range of g(s) is uniformly sampled. This requires the solution of the m
equations
g(si) = gi = i/(m + 1)
for j = 1, …, n and i = 1, …, m. Fortran 90 array operations are used to solve for the collocation points {si} as
a single series of steps. Newton's method,
s ← s − h/h′
is applied to the array function
h(s) ≡ exp(−s) + sg − 1
where the following is true:
g = [g1, …, gm]ᵀ
Note the coefficient matrix for the solution values
f = [f(t1), …, f(tn)]ᵀ
whose entry at the intersection of row i and column j is equal to the value
∫[tj, tj+1] exp(−si t) dt
is explicitly integrated and evaluated as an array operation. The solution analysis of the resulting linear
least-squares system
Af ≅ g
is obtained by computing the singular value decomposition
A = USVᵀ
An approximate solution is computed with the transformed right-hand side
b = Uᵀg
followed by using as few of the largest singular values as possible to minimize the following squared error
residual:
Σ(j=1,…,n) (1 − fj)²
This determines an optimal value k to use in the approximate solution
f = Σ(j=1,…,k) vj(bj/sj)
Also, see operator_ex16, supplied with the product examples.
use lin_sol_svd_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 4 for LIN_SOL_SVD.
integer i, j, k
integer, parameter :: m=64, n=16
real(kind(1e0)), parameter :: one=1e0, zero=0.0e0
real(kind(1e0)) :: g(m), s(m), t(n+1), a(m,n), b(m,1), &
f(n,1), U_S(m,m), V_S(n,n), S_S(n), &
rms, oldrms
real(kind(1e0)) :: delta_g, delta_t
delta_g = one/real(m+1,kind(one))
! Compute which collocation equations to solve.
do i=1,m
g(i)=i*delta_g
end do
! Compute equally spaced quadrature points.
delta_t =one/real(n,kind(one))
do j=1,n+1
t(j)=(j-1)*delta_t
end do
! Compute collocation points.
s=m
solve_equations: do
s=s-(exp(-s)-(one-s*g))/(g-exp(-s))
if (sum(abs((one-exp(-s))/s - g)) <= &
epsilon(one)*sum(g)) &
exit solve_equations
end do solve_equations
! Evaluate the integrals over the quadrature points.
a = (exp(-spread(t(1:n),1,m)*spread(s,2,n)) &
- exp(-spread(t(2:n+1),1,m)*spread(s,2,n))) / &
spread(s,2,n)
b(1:,1)=g
! Compute the singular value decomposition.
call lin_sol_svd(a, b, f, nrhs=0, &
rank=k, u=U_S, v=V_S, s=S_S)
! Singular values that are larger than epsilon determine
! the rank=k.
k = count(S_S > epsilon(one))
oldrms = huge(one)
g = matmul(transpose(U_S), b(1:m,1))
! Find the minimum number of singular values that gives a good
! approximation to f(t) = 1.
do i=1,k
f(1:n,1) = matmul(V_S(1:,1:i), g(1:i)/S_S(1:i))
f = f - one
rms = sum(f**2)/n
if (rms > oldrms) exit
oldrms = rms
end do
write (*,"( ' Using this number of singular values, ', &
&i4 / ' the approximate R.M.S. error is ', 1pe12.4)") &
i-1, oldrms
if (sqrt(oldrms) <= delta_t**2) then
write (*,*) 'Example 4 for LIN_SOL_SVD is correct.'
end if
end
Output
Example 4 for LIN_SOL_SVD is correct.
LIN_SOL_TRI
Solves multiple systems of linear equations
Ajxj = yj, j = 1, …, k
Each matrix Aj is tridiagonal with the same dimension, n. The default solution method is based on LU factorization computed using cyclic reduction or, optionally, Gaussian elimination with partial pivoting.
Required Arguments
C — Array of size 2n × k containing the upper diagonals of the matrices Aj. Each upper diagonal is entered
in array locations c(1:n – 1, j). The data C(n, 1:k) are not used. (Input [/Output])
The input data is overwritten. See note below.
D — Array of size 2n × k containing the diagonals of the matrices Aj. Each diagonal is entered in array
locations D(1:n, j). (Input [/Output])
The input data is overwritten. See note below.
B — Array of size 2n × k containing the lower diagonals of the matrices Aj. Each lower diagonal is entered
in array locations B(2:n, j). The data B(1, 1:k) are not used. (Input [/Output])
The input data is overwritten. See note below.
Y — Array of size 2n × k containing the right-hand sides, yj. Each right-hand side is entered in array locations Y(1:n, j). The computed solution xj is returned in locations Y(1:n, j). (Input [/Output])
NOTE: The required arguments have the Input data overwritten. If these quantities are used later, they
must be saved in user-defined arrays. The routine uses each array's locations (n + 1:2 * n, 1:k) for scratch
storage and intermediate data in the LU factorization. The default values for problem dimensions are
n = (size (D, 1))/2 and k = size (D, 2).
Optional Arguments
NCOLS = n (Input)
Uses arrays C(1:n – 1, 1:k), D(1:n, 1:k), and B(2:n, 1:k) as the upper, main and lower diagonals for
the input tridiagonal matrices. The right-hand sides and solutions are in array Y(1:n, 1:k). Note that
each of these arrays is rank-2.
Default: n = (size(D, 1))/2
NPROB = k (Input)
The number of systems solved.
Default: k = size(D, 2)
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing optional data to the
routine. The options are as follows:
Packaged Options for LIN_SOL_TRI
Option Prefix = ?    Option Name                   Option Value
s_, d_, c_, z_       lin_sol_tri_set_small         1
s_, d_, c_, z_       lin_sol_tri_set_jolt          2
s_, d_, c_, z_       lin_sol_tri_scan_for_NaN      3
s_, d_, c_, z_       lin_sol_tri_factor_only       4
s_, d_, c_, z_       lin_sol_tri_solve_only        5
s_, d_, c_, z_       lin_sol_tri_use_Gauss_elim    6
iopt(IO) = ?_options(?_lin_sol_tri_set_small, Small)
Whenever a reciprocation is performed on a quantity smaller than Small, it is replaced by that value
plus 2 × jolt.
Default: 0.25 × epsilon()
iopt(IO) = ?_options(?_lin_sol_tri_set_jolt, jolt)
Default: epsilon(), machine precision
iopt(IO) = ?_options(?_lin_sol_tri_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(C(i,j)) .or.
isNaN(D(i,j)) .or.
isNaN(B(i,j)) .or.
isNaN(Y(i,j)) == .true.
See the isNaN() function, Chapter 10.
Default: Does not scan for NaNs.
iopt(IO) = ?_options(?_lin_sol_tri_factor_only, ?_dummy)
Obtain the LU factorization of the matrices Aj. Does not solve for a solution.
Default: Factor the matrices and solve the systems.
iopt(IO) = ?_options(?_lin_sol_tri_solve_only, ?_dummy)
Solve the systems Ajxj = yj using the previously computed LU factorization.
Default: Factor the matrices and solve the systems.
iopt(IO) = ?_options(?_lin_sol_tri_use_Gauss_elim, ?_dummy)
The accuracy, numerical stability or efficiency of the cyclic reduction algorithm may be inferior to the
use of LU factorization with partial pivoting.
Default: Use cyclic reduction to compute the factorization.
FORTRAN 90 Interface
Generic:
CALL LIN_SOL_TRI (C, D, B, Y [, …])
Specific:
The specific interface names are S_LIN_SOL_TRI, D_LIN_SOL_TRI, C_LIN_SOL_TRI,
and Z_LIN_SOL_TRI.
Description
Routine lin_sol_tri solves k systems of tridiagonal linear algebraic equations, each problem of dimension
n × n. No relation between k and n is required. See Kershaw, pages 86–88 in Rodrigue (1982) for further
details. To deal with poorly conditioned or singular systems, a specific regularizing term is added to each
reciprocated value. This technique keeps the factorization process efficient and avoids exceptions from overflow or division by zero. Each occurrence of an array reciprocal a-1 is replaced by the expression (a + t)-1,
where the array temporary t has the value 0 whenever the corresponding entry satisfies |a| > Small. Alternately, t has the value 2 × jolt. (Every small denominator gives rise to a finite “jolt”.) Since this tridiagonal
solver is used in the routines lin_svd and lin_eig_self for inverse iteration, regularization is required.
Users can reset the values of Small and jolt for their own needs. Using the default values for these parameters,
it is generally necessary to scale the tridiagonal matrix so that the maximum magnitude has value approximately one. This is normally not an issue when the systems are nonsingular.
The routine is designed to use cyclic reduction as the default method for computing the LU factorization.
Using an optional parameter, standard elimination and partial pivoting will be used to compute the factorization. Partial pivoting is numerically stable but is likely to be less efficient than cyclic reduction.
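The fragment below is a minimal illustrative sketch: a single tridiagonal system with constant diagonals -1, 2, -1 is set up and solved. The arrays are sized 2n × 1 because locations n + 1 through 2n are used for scratch storage, as noted in the argument descriptions.
use lin_sol_tri_int
implicit none
integer, parameter :: n=6
real(kind(1e0)), parameter :: one=1e0, zero=0e0
real(kind(1e0)) c(2*n,1), d(2*n,1), b(2*n,1), y(2*n,1)
c = zero; d = zero; b = zero; y = zero
c(1:n-1,1) = -one        ! upper diagonal
d(1:n,1) = 2*one         ! main diagonal
b(2:n,1) = -one          ! lower diagonal
y(1:n,1) = one           ! right-hand side
call lin_sol_tri(c, d, b, y)
! The solution has replaced the right-hand side in y(1:n,1).
write (*,*) 'Solution:', y(1:n,1)
end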
Fatal, Terminal, and Warning Error Messages
See the messages.gls file for error messages for LIN_SOL_TRI. These error messages are numbered 1081–1086;
1101–1106; 1121–1126; 1141–1146.
Examples
Example 1: Solution of Multiple Tridiagonal Systems
The upper, main and lower diagonals of n systems of size n × n are generated randomly. A scalar is added to
the main diagonal so that the systems are positive definite. A random vector xj is generated and right-hand
sides yj = Ajxj are computed. The routine is used to compute the solution, using the Aj and yj. The results
should compare closely with the xj used to generate the right-hand sides. Also, see operator_ex17, supplied with the product examples.
use lin_sol_tri_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 1 for LIN_SOL_TRI.
integer i
integer, parameter :: n=128
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) err
real(kind(1d0)), dimension(2*n,n) :: d, b, c, res(n,n), &
t(n), x, y
! Generate the upper, main, and lower diagonals of the
! n matrices A_i. For each system a random vector x is used
! to construct the right-hand side, Ax = y. The lower part
! of each array remains zero as a result.
c = zero; d=zero; b=zero; x=zero
do i = 1, n
call rand_gen (c(1:n,i))
call rand_gen (d(1:n,i))
call rand_gen (b(1:n,i))
call rand_gen (x(1:n,i))
end do
! Add scalars to the main diagonal of each system so that
! all systems are positive definite.
t = sum(c+d+b,DIM=1)
d(1:n,1:n) = d(1:n,1:n) + spread(t,DIM=1,NCOPIES=n)
! Set Ax = y. The vector x generates y. Note the use
! of EOSHIFT and array operations to compute the matrix
! product, n distinct ones as one array operation.
y(1:n,1:n)=d(1:n,1:n)*x(1:n,1:n) + &
c(1:n,1:n)*EOSHIFT(x(1:n,1:n),SHIFT=+1,DIM=1) + &
b(1:n,1:n)*EOSHIFT(x(1:n,1:n),SHIFT=-1,DIM=1)
! Compute the solution returned in y. (The input values of c,
! d, b, and y are overwritten by lin_sol_tri.) Check for any
! error messages.
call lin_sol_tri (c, d, b, y)
! Check the size of the residuals, y-x. They should be small,
! relative to the size of values in x.
res = x(1:n,1:n) - y(1:n,1:n)
err = sum(abs(res)) / sum(abs(x(1:n,1:n)))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 1 for LIN_SOL_TRI is correct.'
end if
end
Output
Example 1 for LIN_SOL_TRI is correct.
Example 2: Iterative Refinement and Use of Partial Pivoting
This program unit shows usage that typically gives acceptable accuracy for a large class of problems. Our
goal is to use the efficient cyclic reduction algorithm when possible, and to keep using it unless it fails. In
exceptional cases our program switches to the LU factorization with partial pivoting. This use of both factorization and solution methods enhances reliability and maintains efficiency on the average. Also, see
operator_ex18, supplied with the product examples.
use lin_sol_tri_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_SOL_TRI.
integer i, nopt
integer, parameter :: n=128
real(kind(1e0)), parameter :: s_one=1e0, s_zero=0e0
real(kind(1d0)), parameter :: d_one=1d0, d_zero=0d0
real(kind(1e0)), dimension(2*n,n) :: d, b, c, res(n,n), &
x, y
real(kind(1e0)) change_new, change_old, err
type(s_options) :: iopt(2) = s_options(0,s_zero)
real(kind(1d0)), dimension(n,n) :: d_save, b_save, c_save, &
x_save, y_save, x_sol
logical solve_only
c = s_zero; d=s_zero; b=s_zero; x=s_zero
! Generate the upper, main, and lower diagonals of the
! matrices A. A random vector x is used to construct the
! right-hand sides: y=A*x.
do i = 1, n
call rand_gen (c(1:n,i))
call rand_gen (d(1:n,i))
call rand_gen (b(1:n,i))
call rand_gen (x(1:n,i))
end do
! Save double precision copies of the diagonals and the
! right-hand side.
c_save = c(1:n,1:n); d_save = d(1:n,1:n)
b_save = b(1:n,1:n); x_save = x(1:n,1:n)
y_save(1:n,1:n) = d(1:n,1:n)*x_save + &
c(1:n,1:n)*EOSHIFT(x_save,SHIFT=+1,DIM=1) + &
b(1:n,1:n)*EOSHIFT(x_save,SHIFT=-1,DIM=1)
! Iterative refinement loop.
factorization_choice: do nopt=0, 1
! Set the logical to flag the first time through.
solve_only = .false.
x_sol = d_zero
change_old = huge(s_one)
iterative_refinement:
do
! This flag causes a copy of data to be moved to work arrays
! and a factorization and solve step to be performed.
if (.not. solve_only) then
c(1:n,1:n)=c_save; d(1:n,1:n)=d_save
b(1:n,1:n)=b_save
end if
! Compute current residuals, y - A*x, using current x.
y(1:n,1:n) = -y_save + &
d_save*x_sol + &
c_save*EOSHIFT(x_sol,SHIFT=+1,DIM=1) + &
b_save*EOSHIFT(x_sol,SHIFT=-1,DIM=1)
call lin_sol_tri (c, d, b, y, iopt=iopt)
x_sol = x_sol + y(1:n,1:n)
change_new = sum(abs(y(1:n,1:n)))
! If size of change is not decreasing, stop the iteration.
if (change_new >= change_old) exit iterative_refinement
change_old = change_new
iopt(nopt+1) = s_options(s_lin_sol_tri_solve_only,s_zero)
solve_only = .true.
end do iterative_refinement
! Use Gaussian Elimination if Cyclic Reduction did not get an
! accurate solution.
! It is an exceptional event when Gaussian Elimination is required.
if (sum(abs(x_sol - x_save)) / sum(abs(x_save)) &
<= sqrt(epsilon(d_one))) exit factorization_choice
iopt = s_options(0,s_zero)
iopt(nopt+1) = s_options(s_lin_sol_tri_use_Gauss_elim,s_zero)
end do factorization_choice
! Check on accuracy of solution.
res = x(1:n,1:n)- x_save
err = sum(abs(res)) / sum(abs(x_save))
if (err <= sqrt(epsilon(d_one))) then
write (*,*) 'Example 2 for LIN_SOL_TRI is correct.'
end if
end
Output
Example 2 for LIN_SOL_TRI is correct.
Example 3: Eigenvectors of Tridiagonal Matrices
The eigenvalues λ1, …, λn of a tridiagonal real, self-adjoint matrix are computed. Note that the computation
is performed using the IMSL MATH/LIBRARY FORTRAN 77 interface to routine EVASB. The user may
write this interface based on documentation of the arguments (IMSL 2003, p. 480), or use the module Numerical_Libraries as we have done here. The eigenvectors corresponding to k < n of the eigenvalues are required.
These vectors are computed using inverse iteration for all the eigenvalues at one step. See Golub and Van
Loan (1989, Chapter 7). The eigenvectors are then orthogonalized. Also, see operator_ex19, supplied with
the product examples.
use lin_sol_tri_int
use rand_gen_int
use Numerical_Libraries
implicit none
! This is Example 3 for LIN_SOL_TRI.
integer i, j, nopt
integer, parameter :: n=128, k=n/4, ncoda=1, lda=2
real(kind(1e0)), parameter :: s_one=1e0, s_zero=0e0
real(kind(1e0)) A(lda,n), EVAL(k)
type(s_options) :: iopt(2)=s_options(0,s_zero)
real(kind(1e0)) d(n), b(n), d_t(2*n,k), c_t(2*n,k), perf_ratio, &
b_t(2*n,k), y_t(2*n,k), eval_t(k), res(n,k), temp
logical small
! This flag is used to get the k largest eigenvalues.
small = .false.
! Generate the main diagonal and the co-diagonal of the
! tridiagonal matrix.
call rand_gen (b)
call rand_gen (d)
A(1,1:)=b; A(2,1:)=d
! Use Numerical Libraries routine for the calculation of k
! largest eigenvalues.
CALL EVASB (N, K, A, LDA, NCODA, SMALL, EVAL)
EVAL_T = EVAL
! Use DNFL tridiagonal solver for inverse iteration
! calculation of eigenvectors.
factorization_choice: do nopt=0,1
! Create k tridiagonal problems, one for each inverse
! iteration system.
b_t(1:n,1:k) = spread(b,DIM=2,NCOPIES=k)
c_t(1:n,1:k) = EOSHIFT(b_t(1:n,1:k),SHIFT=1,DIM=1)
d_t(1:n,1:k) = spread(d,DIM=2,NCOPIES=k) - &
spread(EVAL_T,DIM=1,NCOPIES=n)
! Start the right-hand side at random values, scaled downward
! to account for the expected 'blowup' in the solution.
do i=1, k
call rand_gen (y_t(1:n,i))
end do
! Do two iterations for the eigenvectors.
do i=1, 2
y_t(1:n,1:k) = y_t(1:n,1:k)*epsilon(s_one)
call lin_sol_tri(c_t, d_t, b_t, y_t, &
iopt=iopt)
iopt(nopt+1) = s_options(s_lin_sol_tri_solve_only,s_zero)
end do
! Orthogonalize the eigenvectors. (This is the most
! intensive part of the computing.)
do j=1,k-1 ! Forward sweep of HMGS orthogonalization.
temp=s_one/sqrt(sum(y_t(1:n,j)**2))
y_t(1:n,j)=y_t(1:n,j)*temp
y_t(1:n,j+1:k)=y_t(1:n,j+1:k)+ &
spread(-matmul(y_t(1:n,j),y_t(1:n,j+1:k)), &
DIM=1,NCOPIES=n)* spread(y_t(1:n,j),DIM=2,NCOPIES=k-j)
end do
temp=s_one/sqrt(sum(y_t(1:n,k)**2))
y_t(1:n,k)=y_t(1:n,k)*temp
do j=k-1,1,-1 ! Backward sweep of HMGS.
y_t(1:n,j+1:k)=y_t(1:n,j+1:k)+ &
spread(-matmul(y_t(1:n,j),y_t(1:n,j+1:k)), &
DIM=1,NCOPIES=n)* spread(y_t(1:n,j),DIM=2,NCOPIES=k-j)
end do
! See if the performance ratio is smaller than the value one.
! If it is not the code will re-solve the systems using Gaussian
! Elimination. This is an exceptional event. It is a necessary
! complication for achieving reliable results.
res(1:n,1:k) = spread(d,DIM=2,NCOPIES=k)*y_t(1:n,1:k) + &
spread(b,DIM=2,NCOPIES=k)* &
EOSHIFT(y_t(1:n,1:k),SHIFT=-1,DIM=1) + &
EOSHIFT(spread(b,DIM=2,NCOPIES=k)*y_t(1:n,1:k),SHIFT=1) &
-y_t(1:n,1:k)*spread(EVAL_T(1:k),DIM=1,NCOPIES=n)
! If the factorization method is Cyclic Reduction and perf_ratio is
! larger than one, re-solve using Gaussian Elimination. If the
! method is already Gaussian Elimination, the loop exits
! and perf_ratio is checked at the end.
perf_ratio = sum(abs(res(1:n,1:k))) / &
sum(abs(EVAL_T(1:k))) / &
epsilon(s_one) / (5*n)
if (perf_ratio <= s_one) exit factorization_choice
iopt(nopt+1) = s_options(s_lin_sol_tri_use_Gauss_elim,s_zero)
end do factorization_choice
if (perf_ratio <= s_one) then
write (*,*) 'Example 3 for LIN_SOL_TRI is correct.'
end if
end
Output
Example 3 for LIN_SOL_TRI is correct.
Example 4: Tridiagonal Matrix Solving within Diffusion Equations
The normalized partial differential equation
ut ≡ ∂u/∂t = ∂²u/∂x² ≡ uxx
is solved for values of 0 ≤ x ≤ π and t > 0. A boundary value problem consists of choosing the value
u(0, t) = u0
such that the equation
u(x1, t1) = u1
is satisfied. Arbitrary values
x1 = π/2, u1 = 1/2, and t1 = 1
are used for illustration of the solution process. The one-parameter equation to be satisfied is
u(x1, t1) − u1 = 0
The variables are changed to
v(x, t) = u(x, t) − u0
so that v(0, t) = 0. The function v(x, t) satisfies the differential equation. The one-parameter equation solved is
therefore
v(x1, t1) − (u1 − u0) = 0
To solve this equation for u0, use the standard technique of the variational equation,
w ≡ ∂v/∂u0
Thus
∂w/∂t = ∂²w/∂x²
Since the initial data for v(x, t) is
v(x, 0) = −u0
the variational equation initial condition is
w(x, 0) = −1
This model problem illustrates the method of lines and Galerkin principle implemented with the differential-algebraic solver, D2SPG (IMSL 2003, pp. 889–911). We use the integrator in “reverse communication” mode
for evaluating the required functions, derivatives, and solving linear algebraic equations. See Example 4 of
routine DASPG for a problem that uses reverse communication. Next see Example 4 of routine IVPAG for the
development of the piecewise-linear Galerkin discretization method to solve the differential equation. This
present example extends parts of both previous examples and illustrates Fortran 90 constructs. It further
illustrates how a user can deal with a defect of an integrator that normally functions using only dense linear
algebra factorization methods for solving the corrector equations. See the comments in Brenan et al. (1989,
esp. p. 137). Also, see operator_ex20, supplied with the product examples.
use lin_sol_tri_int
use rand_gen_int
use Numerical_Libraries
implicit none
! This is Example 4 for LIN_SOL_TRI.
integer, parameter :: n=1000, ichap=5, iget=1, iput=2, &
inum=6, irnum=7
real(kind(1e0)), parameter :: zero=0e0, one = 1e0
integer i, ido, in(50), inr(20), iopt(6), ival(7), &
iwk(35+n)
real(kind(1e0)) hx, pi_value, t, u_0, u_1, atol, rtol, sval(2), &
tend, wk(41+11*n), y(n), ypr(n), a_diag(n), &
a_off(n), r_diag(n), r_off(n), t_y(n), t_ypr(n), &
t_g(n), t_diag(2*n,1), t_upper(2*n,1), &
t_lower(2*n,1), t_sol(2*n,1)
type(s_options) :: iopti(2)=s_options(0,zero)
character(2) :: pi(1) = 'pi'
! Define initial data.
t = 0.0e0
u_0 = 1
u_1 = 0.5
tend = one
! Initial values for the variational equation.
y = -one; ypr= zero
pi_value = const(pi)
hx = pi_value/(n+1)
a_diag = 2*hx/3
a_off = hx/6
r_diag = -2/hx
r_off = 1/hx
! Get integer option numbers.
iopt(1) = inum
call iumag ('math', ichap, iget, 1, iopt, in)
! Get floating point option numbers.
iopt(1) = irnum
call iumag ('math', ichap, iget, 1, iopt, inr)
! Set for reverse communication evaluation of the DAE.
iopt(1) = in(26)
ival(1) = 0
! Set for use of explicit partial derivatives.
iopt(2) = in(5)
ival(2) = 1
! Set for reverse communication evaluation of partials.
iopt(3) = in(29)
ival(3) = 0
! Set for reverse communication solution of linear equations.
iopt(4) = in(31)
ival(4) = 0
! Storage for the partial derivative array are not allocated or
! required in the integrator.
iopt(5) = in(34)
ival(5) = 1
! Set the sizes of iwk, wk for internal checking.
iopt(6) = in(35)
ival(6) = 35 + n
ival(7) = 41 + 11*n
! Set integer options:
call iumag ('math', ichap, iput, 6, iopt, ival)
! Reset tolerances for integrator:
atol = 1e-3; rtol= 1e-3
sval(1) = atol; sval(2) = rtol
iopt(1) = inr(5)
! Set floating point options:
call sumag ('math', ichap, iput, 1, iopt, sval)
! Integrate ODE/DAE. Use dummy external names for g(y,y')
! and partials.
ido = 1
Integration_Loop: do
call d2spg (n, t, tend, ido, y, ypr, dgspg, djspg, iwk, wk)
! Find where g(y,y') goes. (It only goes in one place here, but can
! vary where divided differences are used for partial derivatives.)
iopt(1) = in(27)
call iumag ('math', ichap, iget, 1, iopt, ival)
! Direct user response:
select case(ido)
case(1,4)
! This should not occur.
write (*,*) ' Unexpected return with ido = ', ido
stop
case(3)
! Reset options to defaults. (This is good housekeeping but not
! required for this problem.)
in = -in
call iumag ('math', ichap, iput, 50, in, ival)
inr = -inr
call sumag ('math', ichap, iput, 20, inr, sval)
exit Integration_Loop
case(5)
! Evaluate partials of g(y,y').
t_y = y; t_ypr = ypr
t_g = r_diag*t_y + r_off*EOSHIFT(t_y,SHIFT=+1) &
+ EOSHIFT(r_off*t_y,SHIFT=-1) &
- (a_diag*t_ypr + a_off*EOSHIFT(t_ypr,SHIFT=+1) &
+ EOSHIFT(a_off*t_ypr,SHIFT=-1))
! Move data from the assumed size to assumed shape arrays.
do i=1, n
wk(ival(1)+i-1) = t_g(i)
end do
cycle Integration_Loop
case(6)
! Evaluate partials of g(y,y').
! Get value of c_j for partials.
iopt(1) = inr(9)
call sumag ('math', ichap, iget, 1, iopt, sval)
! Subtract c_j from diagonals to compute (partials for y')*c_j.
! The linear system is tridiagonal.
t_diag(1:n,1) = r_diag - sval(1)*a_diag
t_upper(1:n,1) = r_off - sval(1)*a_off
t_lower = EOSHIFT(t_upper,SHIFT=+1,DIM=1)
cycle Integration_Loop
case(7)
! Compute the factorization.
iopti(1) = s_options(s_lin_sol_tri_factor_only,zero)
call lin_sol_tri (t_upper, t_diag, t_lower, &
t_sol, iopt=iopti)
cycle Integration_Loop
case(8)
! Solve the system.
iopti(1) = s_options(s_lin_sol_tri_solve_only,zero)
! Move data from the assumed size to assumed shape arrays.
t_sol(1:n,1)=wk(ival(1):ival(1)+n-1)
call lin_sol_tri (t_upper, t_diag, t_lower, &
t_sol, iopt=iopti)
! Move data from the assumed shape to assumed size arrays.
wk(ival(1):ival(1)+n-1)=t_sol(1:n,1)
cycle Integration_Loop
case(2)
! Correct initial value to reach u_1 at t=tend.
u_0 = u_0 - (u_0*y(n/2) - (u_1-u_0)) / (y(n/2) + 1)
! Finish up internally in the integrator.
ido = 3
cycle Integration_Loop
end select
end do Integration_Loop
write (*,*) 'The equation u_t = u_xx, with u(0,t) = ', u_0
write (*,*) 'reaches the value ',u_1, ' at time = ', tend, '.'
write (*,*) 'Example 4 for LIN_SOL_TRI is correct.'
end
Output
Example 4 for LIN_SOL_TRI is correct.
LIN_SVD
Computes the singular value decomposition (SVD) of a rectangular matrix, A. This gives the decomposition
A = USVᵀ
where V is an n × n orthogonal matrix, U is an m × m orthogonal matrix, and S is a real, rectangular diagonal
matrix.
Required Arguments
A — Array of size m × n containing the matrix. (Input [/Output])
If the packaged option lin_svd_overwrite_input is used, this array is not saved on output.
S — Array of size min(m, n) containing the real singular values. These nonnegative values are in non-increasing order. (Output)
U — Array of size m × m containing the singular vectors, U. (Output)
V — Array of size n × n containing the singular vectors, V. (Output)
Optional Arguments
MROWS = m (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: m = size(A, 1)
NCOLS = n (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: n = size(A, 2)
RANK = k (Output)
Number of singular values that exceed the value Small. RANK will satisfy k <= min(m, n).
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing optional data to the
routine. The options are as follows:
Packaged Options for LIN_SVD
Option Prefix = ?    Option Name                Option Value
s_, d_, c_, z_       lin_svd_set_small          1
s_, d_, c_, z_       lin_svd_overwrite_input    2
s_, d_, c_, z_       lin_svd_scan_for_NaN       3
s_, d_, c_, z_       lin_svd_use_qr             4
s_, d_, c_, z_       lin_svd_skip_orth          5
s_, d_, c_, z_       lin_svd_use_gauss_elim     6
s_, d_, c_, z_       lin_svd_set_perf_ratio     7
iopt(IO) = ?_options(?_lin_svd_set_small, Small)
If a singular value is smaller than Small, it is defined as zero for the purpose of computing
the rank of A.
Default: the smallest number that can be reciprocated safely
iopt(IO) = ?_options(?_lin_svd_overwrite_input, ?_dummy)
Does not save the input array A(:, :).
iopt(IO) = ?_options(?_lin_svd_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) == .true.
See the isNaN() function, Chapter 10.
Default: The array is not scanned for NaNs.
iopt(IO)= ?_options(?_lin_svd_use_qr, ?_dummy)
Uses a rational QR algorithm to compute eigenvalues. Accumulate the singular vectors using this
algorithm.
Default: singular vectors computed using inverse iteration
iopt(IO) = ?_options(?_lin_svd_skip_Orth, ?_dummy)
If the eigenvalues are computed using inverse iteration, skips the final orthogonalization of the vectors. This method results in a more efficient computation. However, the singular vectors, while a
complete set, may not be orthogonal.
Default: singular vectors are orthogonalized if obtained using inverse iteration
iopt(IO) = ?_options(?_lin_svd_use_gauss_elim, ?_dummy)
If the eigenvalues are computed using inverse iteration, uses standard elimination with partial pivoting to solve the inverse iteration problems.
Default: singular vectors computed using cyclic reduction
iopt(IO) = ?_options(?_lin_svd_set_perf_ratio, perf_ratio)
Uses residuals for approximate normalized singular vectors if they have a performance index no
larger than perf_ratio. Otherwise an alternate approach is taken and the singular vectors are computed
again: Standard elimination is used instead of cyclic reduction, or the standard QR algorithm is used
as a backup procedure to inverse iteration. Larger values of perf_ratio are less likely to cause these
exceptions.
Default: perf_ratio = 4
FORTRAN 90 Interface
Generic:
CALL LIN_SVD (A, S, U, V [, …])
Specific:
The specific interface names are S_LIN_SVD, D_LIN_SVD, C_LIN_SVD, and Z_LIN_SVD.
Description
Routine lin_svd is an implementation of the QR algorithm for computing the SVD of rectangular matrices.
An orthogonal reduction of the input matrix to upper bidiagonal form is performed. Then, the SVD of a real
bidiagonal matrix is calculated. The orthogonal decomposition AV = US results from products of intermediate matrix factors. See Golub and Van Loan (1989, Chapter 8) for details.
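As an illustrative sketch (assumed usage), a rank-k approximation of A can be assembled from the leading k singular triples returned by lin_svd, using Fortran 90 array operations in the style of Example 3 of LIN_SOL_SVD; the choice k = 2 and the dimensions below are arbitrary.
use lin_svd_int
use rand_gen_int
implicit none
integer, parameter :: m=8, n=4, k=2
real(kind(1d0)) a(m,n), s(min(m,n)), u(m,m), v(n,n), y(m*n), approx(m,n)
integer j
call rand_gen(y)
a = reshape(y,(/m,n/))
call lin_svd(a, s, u, v)
! Sum the first k outer products s(j) * u(:,j) * v(:,j)^T.
approx = 0d0
do j=1, k
approx = approx + s(j)*spread(u(1:m,j),2,n)*spread(v(1:n,j),1,m)
end do
write (*,*) 'Error of the rank-k approximation:', sum(abs(a-approx))
end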
Fatal, Terminal, and Warning Error Messages
See the messages.gls file for error messages for LIN_SVD. These error messages are numbered 1001–1010;
1021–1030; 1041–1050; 1061–1070.
Examples
Example 1: Computing the SVD
The SVD of a square, random matrix A is computed. The residuals R = AV – US are small with respect to
working precision. Also, see operator_ex21, supplied with the product examples.
use lin_svd_int
use rand_gen_int
implicit none
! This is Example 1 for LIN_SVD.
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)) err
real(kind(1d0)), dimension(n,n) :: A, U, V, S(n), y(n*n)
! Generate a random n by n matrix.
call rand_gen(y)
A = reshape(y,(/n,n/))
! Compute the singular value decomposition.
call lin_svd(A, S, U, V)
! Check for small residuals of the expression A*V - U*S.
err = sum(abs(matmul(A,V) - U*spread(S,dim=1,ncopies=n))) &
/ sum(abs(S))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 1 for LIN_SVD is correct.'
end if
end
Output
Example 1 for LIN_SVD is correct.
Example 2: Linear Least Squares with a Quadratic Constraint
An m × n matrix equation Ax ≅ b, m > n, is approximated in a least-squares sense. The matrix b is size m × k.
Each of the k solution vectors of the matrix x is constrained to have Euclidean length of value αj > 0. The
value of αj is chosen so that the constrained solution is 0.25 the length of the nonregularized or standard
least-squares equation. See Golub and Van Loan (1989, Chapter 12) for more details. In the Example 2 code,
Newton’s method is used to solve for each regularizing parameter of the k systems. The solution is then computed and its length is checked. Also, see operator_ex22, supplied with the product examples.
use lin_svd_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_SVD.
integer, parameter :: m=64, n=32, k=4
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) a(m,n), s(n), u(m,m), v(n,n), y(m*max(n,k)), &
b(m,k), x(n,k), g(m,k), alpha(k), lamda(k), &
delta_lamda(k), t_g(n,k), s_sq(n), phi(n,k), &
phi_dot(n,k), rand(k), err
! Generate a random matrix for both A and B.
call rand_gen(y)
a = reshape(y,(/m,n/))
call rand_gen(y)
b = reshape(y,(/m,k/))
! Compute the singular value decomposition.
call lin_svd(a, s, u, v)
! Choose alpha so that the lengths of the regularized solutions
! are 0.25 times lengths of the non-regularized solutions.
g = matmul(transpose(u),b)
x = matmul(v,spread(one/s,dim=2,ncopies=k)*g(1:n,1:k))
alpha = 0.25*sqrt(sum(x**2,dim=1))
t_g = g(1:n,1:k)*spread(s,dim=2,ncopies=k)
s_sq = s**2; lamda = zero
solve_for_lamda: do
x=one/(spread(s_sq,dim=2,ncopies=k)+ &
spread(lamda,dim=1,ncopies=n))
phi = (t_g*x)**2; phi_dot = -2*phi*x
delta_lamda = (sum(phi,dim=1)-alpha**2)/sum(phi_dot,dim=1)
! Make Newton method correction to solve the secular equations for
! lamda.
lamda = lamda - delta_lamda
if (sum(abs(delta_lamda)) <= &
sqrt(epsilon(one))*sum(lamda)) &
exit solve_for_lamda
! This is intended to fix up negative solution approximations.
call rand_gen(rand)
where (lamda < 0) lamda = s(1) * rand
end do solve_for_lamda
! Compute solutions and check lengths.
x = matmul(v,t_g/(spread(s_sq,dim=2,ncopies=k)+ &
spread(lamda,dim=1,ncopies=n)))
err = sum(abs(sum(x**2,dim=1) - alpha**2))/sum(abs(alpha**2))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 2 for LIN_SVD is correct.'
end if
end
Output
Example 2 for LIN_SVD is correct.
Example 3: Generalized Singular Value Decomposition
The n × n matrices A and B are expanded in a Generalized Singular Value Decomposition (GSVD). Two
n × n orthogonal matrices, U and V, and a nonsingular matrix X are computed such that
AX = U diag(c1, …, cn)
and
BX = V diag(s1, …, sn)
The values si and ci are normalized so that si² + ci² = 1.
The ci are nonincreasing, and the si are nondecreasing. See Golub and Van Loan (1989, Chapter 8) for more
details. Our method is based on computing three SVDs as opposed to the QR decomposition and two SVDs
outlined in Golub and Van Loan. As a bonus, an SVD of the matrix X is obtained, and you can use this information to answer further questions about its conditioning. This form of the decomposition assumes that the
matrix D = [A; B] (A stacked on top of B)
has all its singular values strictly positive. For alternate problems, where some singular values of D are zero,
the GSVD becomes
UᵀA = diag(c1, …, cn)W
and
VᵀB = diag(s1, …, sn)W
The matrix W has the same singular values as the matrix D. Also, see operator_ex23, supplied with the
product examples.
use lin_svd_int
use rand_gen_int
implicit none
! This is Example 3 for LIN_SVD.
integer, parameter :: n=32
integer i
real(kind(1d0)), parameter :: one=1.0d0
real(kind(1d0)) a(n,n), b(n,n), d(2*n,n), x(n,n), u_d(2*n,2*n), &
v_d(n,n), v_c(n,n), u_c(n,n), v_s(n,n), u_s(n,n), &
y(n*n), s_d(n), c(n), s(n), sc_c(n), sc_s(n), &
err1, err2
! Generate random square matrices for both A and B.
call rand_gen(y)
a = reshape(y,(/n,n/))
call rand_gen(y)
b = reshape(y,(/n,n/))
! Construct D; A is on the top; B is on the bottom.
d(1:n,1:n) = a
d(n+1:2*n,1:n) = b
! Compute the singular value decompositions used for the GSVD.
call lin_svd(d, s_d, u_d, v_d)
call lin_svd(u_d(1:n,1:n), c, u_c, v_c)
call lin_svd(u_d(n+1:,1:n), s, u_s, v_s)
! Rearrange c(:) so it is non-increasing. Move singular
! vectors accordingly. (The use of temporary objects sc_c and
! x is required.)
sc_c = c(n:1:-1); c = sc_c
x = u_c(1:n,n:1:-1); u_c = x
x = v_c(1:n,n:1:-1); v_c = x
! The columns of v_c and v_s have the same span. They are
! equivalent by taking the signs of the largest magnitude values
! positive.
do i=1, n
sc_c(i) = sign(one,v_c(sum(maxloc(abs(v_c(1:n,i)))),i))
sc_s(i) = sign(one,v_s(sum(maxloc(abs(v_s(1:n,i)))),i))
end do
v_c = v_c*spread(sc_c,dim=1,ncopies=n)
u_c = u_c*spread(sc_c,dim=1,ncopies=n)
v_s = v_s*spread(sc_s,dim=1,ncopies=n)
u_s = u_s*spread(sc_s,dim=1,ncopies=n)
! In this form of the GSVD, the matrix X can be unstable if D
! is ill-conditioned.
x = matmul(v_d*spread(one/s_d,dim=1,ncopies=n),v_c)
! Check residuals for GSVD, A*X = u_c*diag(c_1, ..., c_n), and
! B*X = u_s*diag(s_1, ..., s_n).
err1 = sum(abs(matmul(a,x) - u_c*spread(c,dim=1,ncopies=n))) &
/ sum(s_d)
err2 = sum(abs(matmul(b,x) - u_s*spread(s,dim=1,ncopies=n))) &
/ sum(s_d)
if (err1 <= sqrt(epsilon(one)) .and. &
err2 <= sqrt(epsilon(one))) then
write (*,*) 'Example 3 for LIN_SVD is correct.'
end if
end
Example 4: Ridge Regression as Cross-Validation with Weighting
This example illustrates a particular choice for the ridge regression problem: The least-squares problem Ax ≅ b
is modified by the addition of a regularizing term to become
min_x { ‖Ax − b‖² + λ‖x‖² }
The solution to this problem, with row k deleted, is denoted by xk(λ). Using nonnegative weights
(w1, …, wm), the cross-validation squared error C(λ) is given by:
mC(λ) = Σ_{k=1}^{m} w_k ( a_kᵀ x_k(λ) − b_k )²

With the SVD A = USVᵀ and the product g = Uᵀb, this quantity can be written as

mC(λ) = Σ_{k=1}^{m} w_k [ ( b_k − Σ_{j=1}^{n} u_{kj} g_j s_j² / (s_j² + λ) ) / ( 1 − Σ_{j=1}^{n} u_{kj}² s_j² / (s_j² + λ) ) ]²

This expression is minimized. See Golub and Van Loan (1989, Chapter 12) for more details. In the Example 4
code, mC(λ) is evaluated at p = 10 grid points, uniform on a log-scale with respect to λ over the range
0.1 s_n ≤ λ ≤ 10 s_1. Array
operations and intrinsics are used to evaluate the function and then to choose an approximate minimum. Following the computation of the optimum λ, the regularized solutions are computed. Also, see
operator_ex24, supplied with the product examples.
use lin_svd_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_SVD.
integer i
integer, parameter :: m=32, n=16, p=10, k=4
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)) log_lamda, log_lamda_t, delta_log_lamda
real(kind(1d0)) a(m,n), b(m,k), w(m,k), g(m,k), t(n), s(n), &
s_sq(n), u(m,m), v(n,n), y(m*max(n,k)), &
c_lamda(p,k), lamda(k), x(n,k), res(n,k)
! Generate random rectangular matrices for A and right-hand
! sides, b.
call rand_gen(y)
a = reshape(y,(/m,n/))
call rand_gen(y)
b = reshape(y,(/m,k/))
! Generate random weights for each of the right-hand sides.
call rand_gen(y)
w = reshape(y,(/m,k/))
! Compute the singular value decomposition.
call lin_svd(a, s, u, v)
g = matmul(transpose(u),b)
s_sq = s**2
log_lamda = log(10.*s(1)); log_lamda_t=log_lamda
delta_log_lamda = (log_lamda - log(0.1*s(n))) / (p-1)
! Choose lamda to minimize the "cross-validation" weighted
! square error. First evaluate the error at a grid of points,
! uniform in log_scale.
cross_validation_error: do i=1, p
t = s_sq/(s_sq+exp(log_lamda))
c_lamda(i,:) = sum(w*((b-matmul(u(1:m,1:n),g(1:n,1:k)* &
spread(t,DIM=2,NCOPIES=k)))/ &
(one-matmul(u(1:m,1:n)**2, &
spread(t,DIM=2,NCOPIES=k))))**2,DIM=1)
log_lamda = log_lamda - delta_log_lamda
end do cross_validation_error
! Compute the grid value and lamda corresponding to the minimum.
do i=1, k
lamda(i) = exp(log_lamda_t - delta_log_lamda* &
(sum(minloc(c_lamda(1:p,i)))-1))
end do
! Compute the solution using the optimum "cross-validation"
! parameter.
x = matmul(v,g(1:n,1:k)*spread(s,DIM=2,NCOPIES=k)/ &
(spread(s_sq,DIM=2,NCOPIES=k)+ &
spread(lamda,DIM=1,NCOPIES=n)))
! Check the residuals, using normal equations.
res = matmul(transpose(a),b-matmul(a,x)) - &
spread(lamda,DIM=1,NCOPIES=n)*x
if (sum(abs(res))/sum(s_sq) <= &
sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_SVD is correct.'
end if
end
Output
Example 4 for LIN_SVD is correct.
Parallel Constrained Least-Squares Solvers
Solving Constrained Least-Squares Systems
The routine PARALLEL_NONNEGATIVE_LSQ is used to solve dense least-squares systems. These are represented by Ax ≅ b, where A is an m × n coefficient data matrix, b is a given right-hand side m-vector, and x is
the solution n-vector being computed. Further, there is a constraint requirement, x ≥ 0. The routine
PARALLEL_BOUNDED_LSQ is used when the problem has lower and upper bounds for the solution,
α ≤ x ≤ β. By making the bounds large, individual constraints can be eliminated. There are no restrictions
on the relative sizes of m and n. When n is large, these codes can substantially reduce computer time and
storage requirements, compared with using a routine for solving a constrained system and a single processor.
The user provides the matrix partitioned by blocks of columns:

A = [A1 | A2 | … | Ak]

An individual block of the partitioned matrix, say Ap, is located entirely on the processor with rank
MP_RANK = p − 1, where MP_RANK is packaged in the module MPI_SETUP_INT. This module, and the
function MP_SETUP(), define the Fortran Library MPI communicator, MP_LIBRARY_WORLD. See Chapter 10,
section Dense Matrix Parallelism Using MPI.
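The examples that follow build this column partition with a loop of the form sketched below; the sketch assumes MP_SETUP() has been called, N is the total number of columns, and DN is the nominal number of columns per processor, less one.
!     Sketch only: spread N columns nearly evenly over the processors.
      ALLOCATE(IPART(2,max(1,MP_NPROCS)))
      DN=N/max(1,MP_NPROCS)-1
      IPART(1,1)=1
      DO L=2,MP_NPROCS
         IPART(2,L-1)=IPART(1,L-1)+DN
         IPART(1,L)=IPART(2,L-1)+1
      END DO
      IPART(2,MP_NPROCS)=N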
PARALLEL_NONNEGATIVE_LSQ
For a detailed description of MPI Requirements see Dense Matrix Parallelism Using MPI in Chapter 10 of this
manual.
Solves a linear, non-negative constrained least-squares system.
Usage Notes
CALL PARALLEL_NONNEGATIVE_LSQ (A, B, X, RNORM, W, INDEX, IPART, IOPT = IOPT)
Required Arguments
A(1:M,:)— (Input/Output) Columns of the matrix with limits given by entries in the array
IPART(1:2,1:max(1,MP_NPROCS)). On output Ak is replaced by the product QAk, where Q is an
orthogonal matrix. The value SIZE(A,1) defines the value of M. Each processor starts and exits with
its piece of the partitioned matrix.
B(1:M) — (Input/Output) Assumed-size array of length M containing the right-hand side vector, b. On
output b is replaced by the product Qb, where Q is the orthogonal matrix applied to A. All processors
in the communicator start and exit with the same vector.
X(1:N) — (Output) Assumed-size array of length N containing the solution, x≥0. The value SIZE(X)
defines the value of N. All processors exit with the same vector.
RNORM — (Output) Scalar that contains the Euclidean or least-squares length of the residual vector,
‖Ax − b‖. All processors exit with the same value.
W(1:N) — (Output) Assumed-size array of length N containing the dual vector, w = Aᵀ(b − Ax) ≤ 0. All
processors exit with the same vector.
INDEX(1:N) — (Output) Assumed-size array of length N containing the NSETP indices of columns in the
positive solution, and the remainder that are at their constraint. The number of positive components
in the solution x is given by the Fortran intrinsic function value, NSETP=COUNT(X > 0). All processors exit with the same array.
IPART(1:2,1:max(1,MP_NPROCS)) — (Input) Assumed-size array containing the partitioning describing
the matrix A. The value MP_NPROCS is the number of processors in the communicator, except when
MPI has been finalized with a call to the routine MP_SETUP(‘Final’). This causes MP_NPROCS to be
assigned 0. Normally users will give the partitioning to processor of rank = MP_RANK by setting
IPART(1,MP_RANK+1)= first column index, and IPART(2,MP_RANK+1)= last column index. The
number of columns per node is typically based on their relative computing power. To avoid a node
with rank MP_RANK doing any work except communication, set IPART(1,MP_RANK+1) = 0 and
IPART(2,MP_RANK+1)= -1. In this exceptional case there is no reference to the array A(:,:) at that
node.
Optional Argument
IOPT(:)— (Input) Assumed-size array of derived type S_OPTIONS or D_OPTIONS. This argument is used
to change internal parameters of the algorithm. Normally users will not be concerned about this argument, so they would not include it in the argument list for the routine.
Packaged Options for PARALLEL_NONNEGATIVE_LSQ

Option Name                       Option Value
PNLSQ_SET_TOLERANCE                    1
PNLSQ_SET_MAX_ITERATIONS               2
PNLSQ_SET_MIN_RESIDUAL                 3
IOPT(IO)=?_OPTIONS(PNLSQ_SET_TOLERANCE, TOLERANCE) Replaces the default rank tolerance for using a column, from EPSILON(TOLERANCE) to TOLERANCE. Increasing the value of
TOLERANCE will cause fewer columns to be moved from their constraints, and may cause the minimum residual RNORM to increase.
IOPT(IO)=?_OPTIONS(PNLSQ_SET_MIN_RESIDUAL, RESID) Replaces the default target for the
minimum residual vector length from 0 to RESID. Increasing the value of RESID can result in fewer
iterations and thus increased efficiency. The descent in the optimization will stop at the first point
where the minimum residual RNORM is smaller than RESID. Using this option may result in the dual
vector not satisfying its optimality conditions, as noted above.
IOPT(IO)= PNLSQ_SET_MAX_ITERATIONS
IOPT(IO+1)= NEW_MAX_ITERATIONS Replaces the default maximum number of iterations from
3*N to NEW_MAX_ITERATIONS. Note that this option requires two entries in the derived type array.
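As a hedged sketch (not one of the product examples), the lines below show how IOPT might be filled before the call in Example 1 below; the tolerance 1.0D-12 and the factor 6 are arbitrary values, and a declaration such as TYPE(D_OPTIONS) IOPT(3) is assumed to be added to the specification part of that program.
!     Sketch only: tighten the rank tolerance and raise the iteration limit.
      IOPT(1) = D_OPTIONS(PNLSQ_SET_TOLERANCE, 1.0D-12)
!     The iteration-limit option occupies two entries of IOPT.
      IOPT(2) = PNLSQ_SET_MAX_ITERATIONS
      IOPT(3) = 6*N
      CALL PARALLEL_NONNEGATIVE_LSQ &
        (A, B, X, RNORM, W, INDEX, IPART, IOPT=IOPT)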
FORTRAN 90 Interface
Generic:
CALL PARALLEL_NONNEGATIVE_LSQ (A, B, X, RNORM, W, INDEX, IPART [, …])
Specific:
The specific interface names are S_PARALLEL_NONNEGATIVE_LSQ and
D_PARALLEL_NONNEGATIVE_LSQ.
Description
Subroutine PARALLEL_NONNEGATIVE_LSQ solves the linear least-squares system Ax ≅ b, x ≥ 0, using the
algorithm NNLS found in Lawson and Hanson, (1995), pages 160-161. The code now updates the dual vector
w of Step 2, page 161. The remaining new steps involve exchange of required data, using MPI.
Examples
Example 1: Distributed Linear Inequality Constraint Solver
The program PNLSQ_EX1 illustrates the computation of the minimum Euclidean length solution of an
m′ × n′ system of linear inequality constraints, Gy ≥ h. The solution algorithm is based on Algorithm LDP,
page 165-166, loc. cit. The rows of E = [G : h] are partitioned and assigned random values. When the minimum Euclidean length solution to the inequalities has been calculated, the residuals r = Gy − h ≥ 0 are
computed, with the dual variables to the NNLS problem indicating the entries of r that are precisely zero.
The fact that matrix products involving both E and Eᵀ are needed to compute the constrained solution y and
the residuals r, implies that message passing is required. This occurs after the NNLS solution is computed.
      PROGRAM PNLSQ_EX1
!     Use Parallel_nonnegative_LSQ to solve an inequality
!     constraint problem, Gy >= h.  This algorithm uses
!     Algorithm LDP of Solving Least Squares Problems,
!     page 165.  The constraints are allocated to the
!     processors, by rows, in columns of the array A(:,:).
USE PNLSQ_INT
USE MPI_SETUP_INT
USE RAND_INT
USE SHOW_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: MP=500, NP=400, M=NP+1, N=MP
REAL(KIND(1D0)), PARAMETER :: ZERO=0D0, ONE=1D0
REAL(KIND(1D0)), ALLOCATABLE :: &
A(:,:), B(:), X(:), Y(:), W(:), ASAVE(:,:)
REAL(KIND(1D0)) RNORM
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER K, L, DN, J, JSHIFT, IERROR
LOGICAL :: PRINT=.false.
! Setup for MPI:
MP_NPROCS=MP_SETUP()
DN=N/max(1,max(1,MP_NPROCS))-1
ALLOCATE(IPART(2,max(1,MP_NPROCS)))
! Spread constraint rows evenly to the processors.
IPART(1,1)=1
DO L=2,MP_NPROCS
IPART(2,L-1)=IPART(1,L-1)+DN
IPART(1,L)=IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS)=N
! Define the constraint data using random values.
K=max(0,IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(A(M,K), ASAVE(M,K), X(N), W(N), &
B(M), Y(M), INDEX(N))
! The use of ASAVE can be removed by regenerating
! the data for A(:,:) after the return from
! Parallel_nonnegative_LSQ.
A=rand(A); ASAVE=A
IF(MP_RANK == 0 .and. PRINT) &
CALL SHOW(IPART, &
"Partition of the constraints to be solved")
! Set the right-hand side to be one in the last component, zero elsewhere.
B=ZERO;B(M)=ONE
! Solve the dual problem.
CALL Parallel_nonnegative_LSQ &
(A, B, X, RNORM, W, INDEX, IPART)
! Each processor multiplies its block times the part of
! the dual corresponding to that part of the partition.
Y=ZERO
DO J=IPART(1,MP_RANK+1),IPART(2,MP_RANK+1)
JSHIFT=J-IPART(1,MP_RANK+1)+1
Y=Y+ASAVE(:,JSHIFT)*X(J)
END DO
! Accumulate the pieces from all the processors. Put sum into B(:)
! on rank 0 processor.
B=Y
IF(MP_NPROCS > 1) &
CALL MPI_REDUCE(Y, B, M, MPI_DOUBLE_PRECISION,&
MPI_SUM, 0, MP_LIBRARY_WORLD, IERROR)
IF(MP_RANK == 0) THEN
! Compute constrained solution at the root.
! The constraints will have no solution if B(M) = ONE.
! All of these example problems have solutions.
B(M)=B(M)-ONE;B=-B/B(M)
END IF
! Send the inequality constraint solution to all nodes.
IF(MP_NPROCS > 1) &
CALL MPI_BCAST(B, M, MPI_DOUBLE_PRECISION, &
0, MP_LIBRARY_WORLD, IERROR)
! For large problems this printing needs to be removed.
IF(MP_RANK == 0 .and. PRINT) &
CALL SHOW(B(1:NP), &
"Minimal length solution of the constraints")
! Compute residuals of the individual constraints.
! If only the solution is desired, the program ends here.
X=ZERO
DO J=IPART(1,MP_RANK+1),IPART(2,MP_RANK+1)
JSHIFT=J-IPART(1,MP_RANK+1)+1
X(J)=dot_product(B,ASAVE(:,JSHIFT))
END DO
!     This cleans up residuals that are about rounding
!     error unit (times) the size of the constraint
!     equation and right-hand side.  They are replaced
!     by exact zero.
WHERE(W == ZERO) X=ZERO; W=X
! Each group of residuals is disjoint, per processor.
! We add all the pieces together for the total set of
! constraints.
IF(MP_NPROCS > 1) &
CALL MPI_REDUCE(X, W, N, MPI_DOUBLE_PRECISION,&
MPI_SUM, 0, MP_LIBRARY_WORLD, IERROR)
IF(MP_RANK == 0 .and. PRINT) &
CALL SHOW(W, "Residuals for the constraints")
! See to any errors and shut down MPI.
MP_NPROCS=MP_SETUP('Final')
IF(MP_RANK == 0) THEN
IF(COUNT(W < ZERO) == 0) WRITE(*,*)&
" Example 1 for PARALLEL_NONNEGATIVE_LSQ is correct."
END IF
END
Output
Example 1 for PARALLEL_NONNEGATIVE_LSQ is correct.
Example 2: Distributed Non-negative Least-Squares
The program PNLSQ_EX2 illustrates the computation of the solution to a system of linear least-squares equations with simple constraints: aiᵀx ≅ bi, i = 1, …, m, subject to x ≥ 0. In this example we write the row vectors
[aiᵀ : bi] on a file. This illustrates reading the data by rows and arranging the data by columns, as required by
PARALLEL_NONNEGATIVE_LSQ. After reading the data, the right-hand side vector is broadcast to the group
before computing a solution, x. The block-size is chosen so that each participating processor receives the same
number of columns, except any remaining columns sent to the processor with largest rank. This processor
contains the right-hand side before the broadcast.
This example illustrates connecting a BLACS ‘context’ handle and the Fortran Library MPI communicator,
MP_LIBRARY_WORLD, described in Chapter 10.
      PROGRAM PNLSQ_EX2
!     Use Parallel_Nonnegative_LSQ to solve a least-squares
!     problem, A x = b, with x >= 0.  This algorithm uses a
!     distributed version of NNLS, found in the book
!     Solving Least Squares Problems, page 165.  The data is
!     read from a file, by rows, and sent to the processors,
!     as array columns.
USE PNLSQ_INT
USE SCALAPACK_IO_INT
USE BLACS_INT
USE MPI_SETUP_INT
USE RAND_INT
USE ERROR_OPTION_PACKET
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: M=128, N=32, NP=N+1, NIN=10
real(kind(1d0)), ALLOCATABLE, DIMENSION(:) :: &
d_A(:,:), A(:,:), B, C, W, X, Y
real(kind(1d0)) RNORM, ERROR
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER I, J, K, L, DN, JSHIFT, IERROR, &
CONTXT, NPROW, MYROW, MYCOL, DESC_A(9)
TYPE(d_OPTIONS) IOPT(1)
! Routines with the "BLACS_" prefix are from the
! BLACS library.
CALL BLACS_PINFO(MP_RANK, MP_NPROCS)
! Make initialization for BLACS.
CALL BLACS_GET(0,0, CONTXT)
! Define processor grid to be 1 by MP_NPROCS.
NPROW=1
CALL BLACS_GRIDINIT(CONTXT, 'N/A', NPROW, MP_NPROCS)
! Get this processor's role in the process grid.
CALL BLACS_GRIDINFO(CONTXT, NPROW, MP_NPROCS, &
MYROW, MYCOL)
! Connect BLACS context with communicator MP_LIBRARY_WORLD.
CALL BLACS_GET(CONTXT, 10, MP_LIBRARY_WORLD)
! Setup for MPI:
MP_NPROCS=MP_SETUP()
DN=max(1,NP/MP_NPROCS)
ALLOCATE(IPART(2,MP_NPROCS))
! Spread columns evenly to the processors. Any odd
! number of columns are in the processor with highest
! rank.
IPART(1,:)=1; IPART(2,:)=0
DO L=2,MP_NPROCS
IPART(2,L-1)=IPART(1,L-1)+DN
IPART(1,L)=IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS)=NP
IPART(2,:)=min(NP,IPART(2,:))
! Note which processor (L-1) receives the right-hand side.
DO L=1,MP_NPROCS
IF(IPART(1,L) <= NP .and. NP <= IPART(2,L)) EXIT
END DO
K=max(0,IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(d_A(M,K), W(N), X(N), Y(N),&
B(M), C(M), INDEX(N))
IF(MP_RANK == 0 ) THEN
ALLOCATE(A(M,N))
! Define the matrix data using random values.
A=rand(A); B=rand(B)
! Write the rows of data to an external file.
OPEN(UNIT=NIN, FILE='Atest.dat', STATUS='UNKNOWN')
DO I=1,M
WRITE(NIN,*) (A(I,J),J=1,N), B(I)
END DO
CLOSE(NIN)
ELSE
! No resources are used where this array is not saved.
ALLOCATE(A(M,0))
END IF
!     Define the matrix descriptor.  This includes the
!     right-hand side as an additional column.  The row
!     block size, on each processor, is arbitrary, but is
!     chosen here to match the column block size.
DESC_A=(/1, CONTXT, M, NP, DN+1, DN+1, 0, 0, M/)
! Read the data by rows.
IOPT(1)=ScaLAPACK_READ_BY_ROWS
CALL ScaLAPACK_READ ("Atest.dat", DESC_A, &
d_A, IOPT=IOPT)
! Broadcast the right-hand side to all processors.
JSHIFT=NP-IPART(1,L)+1
IF(K > 0) B=d_A(:,JSHIFT)
IF(MP_NPROCS > 1) &
CALL MPI_BCAST(B, M, MPI_DOUBLE_PRECISION , L-1, &
MP_LIBRARY_WORLD, IERROR)
! Adjust the partition of columns to ignore the
! last column, which is the right-hand side. It is
! now moved to B(:).
IPART(2,:)=min(N,IPART(2,:))
! Solve the constrained distributed problem.
C=B
CALL Parallel_Nonnegative_LSQ &
(d_A, B, X, RNORM, W, INDEX, IPART)
! Solve the problem on one processor, with data saved
! for a cross-check.
IPART(2,:)=0; IPART(2,1)=N; MP_NPROCS=1
! Since all processors execute this code, all arrays
! must be allocated in the main program.
CALL Parallel_Nonnegative_LSQ &
(A, C, Y, RNORM, W, INDEX, IPART)
! See to any errors.
CALL e1pop("Mp_Setup")
! Check the differences in the two solutions. Unique solutions
! may differ in the last bits, due to rounding.
IF(MP_RANK == 0) THEN
ERROR=SUM(ABS(X-Y))/SUM(Y)
IF(ERROR <= sqrt(EPSILON(ERROR))) write(*,*) &
' Example 2 for PARALLEL_NONNEGATIVE_LSQ is correct.'
OPEN(UNIT=NIN, FILE='Atest.dat', STATUS='OLD')
CLOSE(NIN, STATUS='Delete')
END IF
! Exit from using this process grid.
CALL BLACS_GRIDEXIT( CONTXT )
CALL BLACS_EXIT(0)
END
Output
Example 2 for PARALLEL_NONNEGATIVE_LSQ is correct.
PARALLEL_BOUNDED_LSQ
For a detailed description of MPI Requirements see Dense Matrix Parallelism Using MPI in Chapter 10 of this
manual.
Solves a linear least-squares system with bounds on the unknowns.
Usage Notes
CALL PARALLEL_BOUNDED_LSQ (A, B, BND, X, RNORM, W, INDEX, IPART, NSETP, NSETZ,
IOPT=IOPT)
Required Arguments
A(1:M,:)— (Input/Output) Columns of the matrix with limits given by entries in the array
IPART(1:2,1:max(1,MP_NPROCS)). On output Ak is replaced by the product QAk, where Q is an
orthogonal matrix. The value SIZE(A,1) defines the value of M. Each processor starts and exits with
its piece of the partitioned matrix.
B(1:M) — (Input/Output) Assumed-size array of length M containing the right-hand side vector, b. On
output b is replaced by the product Q(b − Ag), where Q is the orthogonal matrix applied to A and g
is a set of active bounds for the solution. All processors in the communicator start and exit with the
same vector.
BND(1:2,1:N) — (Input) Assumed-size array containing the bounds for x. The lower bound αj is in
BND(1,J), and the upper bound βj is in BND(2,J).
X(1:N) — (Output) Assumed-size array of length N containing the solution, α ≤ x ≤ β. The value
SIZE(X) defines the value of N. All processors exit with the same vector.
RNORM — (Output) Scalar that contains the Euclidean or least-squares length of the residual vector,
‖Ax − b‖. All processors exit with the same value.
W(1:N) — (Output) Assumed-size array of length N containing the dual vector, w = Aᵀ(b − Ax). At a
solution exactly one of the following is true for each j, 1 ≤ j ≤ n:
∙αj = xj = βj, and wj arbitrary
∙αj = xj, and wj ≤ 0
∙xj = βj, and wj ≥ 0
∙αj < xj < βj, and wj = 0
All processors exit with the same vector.
INDEX(1:N) — (Output) Assumed-size array of length N containing the NSETP indices of columns in the
solution interior to bounds, and the remainder that are at a constraint. All processors exit with the
same array.
IPART(1:2,1:max(1,MP_NPROCS)) — (Input) Assumed-size array containing the partitioning describing
the matrix A. The value MP_NPROCS is the number of processors in the communicator, except when
MPI has been finalized with a call to the routine MP_SETUP(‘Final’). This causes MP_NPROCS to be
assigned 0. Normally users will give the partitioning to processor of rank = MP_RANK by setting
IPART(1,MP_RANK+1)= first column index, and IPART(2,MP_RANK+1)= last column index. The
number of columns per node is typically based on their relative computing power. To avoid a node
with rank MP_RANK doing any work except communication, set IPART(1,MP_RANK+1) = 0 and
IPART(2,MP_RANK+1)= -1. In this exceptional case there is no reference to the array A(:,:) at that
node.
NSETP— (Output) An INTEGER indicating the number of solution components not at constraints. The
column indices are output in the array INDEX(:).
NSETZ— (Output) An INTEGER indicating the solution components held at fixed values. The column
indices are output in the array INDEX(:).
Optional Argument
IOPT(:)— (Input) Assumed-size array of derived type S_OPTIONS or D_OPTIONS. This argument is used
to change internal parameters of the algorithm. Normally users will not be concerned about this argument, so they would not include it in the argument list for the routine.
Packaged Options for PARALLEL_BOUNDED_LSQ

Option Name                       Option Value
PBLSQ_SET_TOLERANCE                    1
PBLSQ_SET_MAX_ITERATIONS               2
PBLSQ_SET_MIN_RESIDUAL                 3
IOPT(IO)=?_OPTIONS(PBLSQ_SET_TOLERANCE, TOLERANCE) Replaces the default rank tolerance for using a column, from EPSILON(TOLERANCE) to TOLERANCE. Increasing the value of
TOLERANCE will cause fewer columns to be increased from their constraints, and may cause the minimum residual RNORM to increase.
IOPT(IO)=?_OPTIONS(PBLSQ_SET_MIN_RESIDUAL, RESID) Replaces the default target for the
minimum residual vector length from 0 to RESID. Increasing the value of RESID can result in fewer
iterations and thus increased efficiency. The descent in the optimization will stop at the first point
where the minimum residual RNORM is smaller than RESID. Using this option may result in the dual
vector not satisfying its optimality conditions, as noted above.
IOPT(IO)= PBLSQ_SET_MAX_ITERATIONS
IOPT(IO+1)= NEW_MAX_ITERATIONS Replaces the default maximum number of iterations from
3*N to NEW_MAX_ITERATIONS. Note that this option requires two entries in the derived type array.
FORTRAN 90 Interface
Generic:
CALL PARALLEL_BOUNDED_LSQ (A, B, BND, X, RNORM, W, INDEX, IPART, NSETP, NSETZ [, …])
Specific:
The specific interface names are S_PARALLEL_BOUNDED_LSQ and
D_PARALLEL_BOUNDED_LSQ.
Description
Subroutine PARALLEL_BOUNDED_LSQ solves the least-squares linear system Ax ≅ b, α ≤ x ≤ β, using the
algorithm BVLS found in Lawson and Hanson, (1995), pages 279-283. The new steps involve updating the
dual vector and exchange of required data, using MPI. The optional changes to default tolerances, minimum
residual, and the number of iterations are new features.
Examples
Example 1: Distributed Equality and Inequality Constraint Solver
The program PBLSQ_EX1 illustrates the computation of the minimum Euclidean length solution of an
m′ × n′ system of linear inequality constraints, Gy ≥ h. Additionally the first f > 0 of the constraints are
equalities. The solution algorithm is based on Algorithm LDP, page 165-166, loc. cit. By allowing the dual
variables to be free, the constraints become equalities. The rows of E = [G : h] are partitioned and assigned
random values. When the minimum Euclidean length solution to the inequalities has been calculated, the
residuals r = Gy − h ≥ 0 are computed, with the dual variables to the BVLS problem indicating the entries
of r that are exactly zero.
      PROGRAM PBLSQ_EX1
!     Use Parallel_bounded_LSQ to solve an inequality
!     constraint problem, Gy >= h.  Force F of the constraints
!     to be equalities.  This algorithm uses LDP of
!     Solving Least Squares Problems, page 165.
!     Forcing equality constraints by freeing the dual is
!     new here.  The constraints are allocated to the
!     processors, by rows, in columns of the array A(:,:).
USE PBLSQ_INT
USE MPI_SETUP_INT
USE RAND_INT
USE SHOW_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: MP=500, NP=400, M=NP+1, &
N=MP, F=NP/10
REAL(KIND(1D0)), PARAMETER :: ZERO=0D0, ONE=1D0
REAL(KIND(1D0)), ALLOCATABLE :: &
A(:,:), B(:), BND(:,:), X(:), Y(:), &
W(:), ASAVE(:,:)
REAL(KIND(1D0)) RNORM
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER K, L, DN, J, JSHIFT, IERROR, NSETP, NSETZ
LOGICAL :: PRINT=.false.
! Setup for MPI:
MP_NPROCS=MP_SETUP()
DN=N/max(1,max(1,MP_NPROCS))-1
ALLOCATE(IPART(2,max(1,MP_NPROCS)))
! Spread constraint rows evenly to the processors.
IPART(1,1)=1
DO L=2,MP_NPROCS
IPART(2,L-1)=IPART(1,L-1)+DN
IPART(1,L)=IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS)=N
! Define the constraints using random data.
K=max(0,IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(A(M,K), ASAVE(M,K), BND(2,N), &
X(N), W(N), B(M), Y(M), INDEX(N))
! The use of ASAVE can be replaced by regenerating the
! data for A(:,:) after the return from
! Parallel_bounded_LSQ
A=rand(A); ASAVE=A
IF(MP_RANK == 0 .and. PRINT) &
call show(IPART,&
"Partition of the constraints to be solved")
! Set the right-hand side to be one in the last
! component, zero elsewhere.
B=ZERO;B(M)=ONE
! Solve the dual problem. Letting the dual variable
! have no constraint forces an equality constraint
! for the primal problem.
BND(1,1:F)=-HUGE(ONE); BND(1,F+1:)=ZERO
BND(2,:)=HUGE(ONE)
CALL Parallel_bounded_LSQ &
(A, B, BND, X, RNORM, W, INDEX, IPART, &
NSETP, NSETZ)
! Each processor multiplies its block times the part
! of the dual corresponding to that partition.
Y=ZERO
DO J=IPART(1,MP_RANK+1),IPART(2,MP_RANK+1)
JSHIFT=J-IPART(1,MP_RANK+1)+1
Y=Y+ASAVE(:,JSHIFT)*X(J)
END DO
! Accumulate the pieces from all the processors.
! Put sum into B(:) on rank 0 processor.
B=Y
IF(MP_NPROCS > 1) &
CALL MPI_REDUCE(Y, B, M, MPI_DOUBLE_PRECISION,&
MPI_SUM, 0, MP_LIBRARY_WORLD, IERROR)
IF(MP_RANK == 0) THEN
! Compute constraint solution at the root.
! The constraints will have no solution if B(M) = ONE.
! All of these example problems have solutions.
B(M)=B(M)-ONE;B=-B/B(M)
END IF
! Send the inequality constraint or primal solution to all nodes.
IF(MP_NPROCS > 1) &
CALL MPI_BCAST(B, M, MPI_DOUBLE_PRECISION, 0, &
MP_LIBRARY_WORLD, IERROR)
! For large problems this printing may need to be removed.
IF(MP_RANK == 0 .and. PRINT) &
call show(B(1:NP), &
"Minimal length solution of the constraints")
! Compute residuals of the individual constraints.
X=ZERO
DO J=IPART(1,MP_RANK+1),IPART(2,MP_RANK+1)
JSHIFT=J-IPART(1,MP_RANK+1)+1
X(J)=dot_product(B,ASAVE(:,JSHIFT))
END DO
! This cleans up residuals that are about rounding error
! unit (times) the size of the constraint equation and
! right-hand side. They are replaced by exact zero.
WHERE(W == ZERO) X=ZERO
W=X
! Each group of residuals is disjoint, per processor.
! We add all the pieces together for the total set of
! constraints.
IF(MP_NPROCS > 1) &
CALL MPI_REDUCE(X, W, N, MPI_DOUBLE_PRECISION, &
MPI_SUM, 0, MP_LIBRARY_WORLD, IERROR)
IF(MP_RANK == 0 .and. PRINT) &
call show(W, "Residuals for the constraints")
! See to any errors and shut down MPI.
MP_NPROCS=MP_SETUP('Final')
IF(MP_RANK == 0) THEN
IF(COUNT(W < ZERO) == 0 .and.&
COUNT(W == ZERO) >= F) WRITE(*,*)&
" Example 1 for PARALLEL_BOUNDED_LSQ is correct."
END IF
END
Output
Example 1 for PARALLEL_BOUNDED_LSQ is correct.
Example 2: Distributed Newton-Raphson Method with Step Control
The program PBLSQ_EX2 illustrates the computation of the solution of a non-linear system of equations. We
use a constrained Newton-Raphson method.
This algorithm works with the problem chosen for illustration. The step-size control used here, employing
only simple bounds, may not work on other non-linear systems of equations. Therefore we do not recommend
the simple non-linear solving technique illustrated here for an arbitrary problem. The test case is Brown’s
Almost Linear Problem, Moré, et al. (1982). The components are given by:
f_i(x) = x_i + Σ_{j=1}^{n} x_j − (n + 1),   i = 1, …, n − 1

f_n(x) = x_1 x_2 ⋯ x_n − 1

The functions are zero at the point x = (δ, …, δ, δ^(1−n))ᵀ, where δ > 0 is a particular root of the polynomial
equation nδ^n − (n + 1)δ^(n−1) + 1 = 0. To avoid convergence to the local minimum x = (0, …, 0, n + 1)ᵀ, we
start at the standard point x = (1/2, …, 1/2, 1/2)ᵀ and develop the Newton method using the linear terms
f(x − y) ≈ f(x) − J(x)y ≅ 0, where J(x) is the Jacobian matrix. The update is constrained so that the first
n − 1 components satisfy x_j − y_j ≥ 1/2, or y_j ≤ x_j − 1/2. The last component is bounded from both sides,
0 < x_n − y_n ≤ 1/2, or x_n > y_n ≥ x_n − 1/2. These bounds avoid the local minimum and allow us to replace
the last equation by

Σ_{j=1}^{n} ln(x_j) = 0,

which is better scaled than the original. The positive lower bound for x_n − y_n is replaced by the strict
bound, EPSILON(1D0), the arithmetic precision, which restricts the relative accuracy of x_n. The input for
routine PARALLEL_BOUNDED_LSQ expects each processor to obtain that part of J(x) it owns. Those columns
of the Jacobian matrix correspond to the partition given in the array IPART(:,:). Here the columns of the
matrix are evaluated, in parallel, on the nodes where they are required.
      PROGRAM PBLSQ_EX2
!     Use Parallel_bounded_LSQ to solve a non-linear system
!     of equations.  The example is an ACM-TOMS test problem,
!     except for the larger size.  It is "Brown's Almost Linear
!     Function."
USE ERROR_OPTION_PACKET
USE PBLSQ_INT
USE MPI_SETUP_INT
USE SHOW_INT
USE Numerical_Libraries, ONLY : N1RTY
IMPLICIT NONE
INTEGER, PARAMETER :: N=200, MAXIT=5
REAL(KIND(1D0)), PARAMETER :: ZERO=0D0, ONE=1D0,&
HALF=5D-1, TWO=2D0
REAL(KIND(1D0)), ALLOCATABLE :: &
A(:,:), B(:), BND(:,:), X(:), Y(:), W(:)
REAL(KIND(1D0)) RNORM
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER K, L, DN, J, JSHIFT, IERROR, NSETP, &
NSETZ, ITER
LOGICAL :: PRINT=.false.
TYPE(D_OPTIONS) IOPT(3)
! Setup for MPI:
MP_NPROCS=MP_SETUP()
DN=N/max(1,max(1,MP_NPROCS))-1
ALLOCATE(IPART(2,max(1,MP_NPROCS)))
! Spread Jacobian matrix columns evenly to the processors.
IPART(1,1)=1
DO L=2,MP_NPROCS
IPART(2,L-1)=IPART(1,L-1)+DN
IPART(1,L)=IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS)=N
K=max(0,IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(A(N,K), BND(2,N), &
X(N), W(N), B(N), Y(N), INDEX(N))
! This is Newton's method on "Brown's almost
! linear function."
X=HALF
ITER=0
! Turn off messages and stopping for FATAL class errors.
CALL ERSET (4, 0, 0)
NEWTON_METHOD: DO
! Set bounds for the values after the step is taken.
! All variables are positive and bounded below by HALF,
! except for variable N, which has an upper bound of HALF.
BND(1,1:N-1)=-HUGE(ONE)
BND(2,1:N-1)=X(1:N-1)-HALF
BND(1,N)=X(N)-HALF
BND(2,N)=X(N)-EPSILON(ONE)
! Compute the residual function.
B(1:N-1)=SUM(X)+X(1:N-1)-(N+1)
B(N)=LOG(PRODUCT(X))
if(mp_rank == 0 .and. PRINT) THEN
CALL SHOW(B, &
"Developing non-linear function residual")
END IF
IF (MAXVAL(ABS(B(1:N-1))) <= SQRT(EPSILON(ONE)))&
EXIT NEWTON_METHOD
! Compute the derivatives local to each processor.
A(1:N-1,:)=ONE
DO J=1,N-1
IF(J < IPART(1,MP_RANK+1)) CYCLE
IF(J > IPART(2,MP_RANK+1)) CYCLE
JSHIFT=J-IPART(1,MP_RANK+1)+1
A(J,JSHIFT)=TWO
END DO
A(N,:)=ONE/X(IPART(1,MP_RANK+1):IPART(2,MP_RANK+1))
! Reset the linear independence tolerance.
IOPT(1)=D_OPTIONS(PBLSQ_SET_TOLERANCE,&
sqrt(EPSILON(ONE)))
IOPT(2)=PBLSQ_SET_MAX_ITERATIONS
! If N iterations was not enough on a previous iteration, reset to 2*N.
IF(N1RTY(1) == 0) THEN
IOPT(3)=N
ELSE
IOPT(3)=2*N
CALL E1POP('MP_SETUP')
CALL E1PSH('MP_SETUP')
END IF
CALL parallel_bounded_LSQ &
(A, B, BND, Y, RNORM, W, INDEX, IPART, NSETP, &
NSETZ,IOPT=IOPT)
! The array Y(:) contains the constrained Newton step.
! Update the variables.
X=X-Y
IF(mp_rank == 0 .and. PRINT) THEN
CALL show(BND, "Bounds for the moves")
CALL SHOW(X, "Developing Solution")
CALL SHOW((/RNORM/), &
"Linear problem residual norm")
END IF
! This is a safety measure for not taking too many steps.
ITER=ITER+1
IF(ITER > MAXIT) EXIT NEWTON_METHOD
END DO NEWTON_METHOD
IF(MP_RANK == 0) THEN
IF(ITER <= MAXIT) WRITE(*,*)&
" Example 2 for PARALLEL_BOUNDED_LSQ is correct."
END IF
! See to any errors and shut down MPI.
MP_NPROCS=MP_SETUP('Final')
END
LSARG
Solves a real general system of linear equations with iterative refinement.
Required Arguments
A — N by N matrix containing the coefficients of the linear system. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AᵀX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSARG (A, B, X [, …])
Specific:
The specific interface names are S_LSARG and D_LSARG.
FORTRAN 77 Interface
Single:
CALL LSARG (N, A, LDA, B, IPATH, X)
Double:
The double precision name is DLSARG.
ScaLAPACK Interface
Generic:
CALL LSARG (A0, B0, X0 [, …])
Specific:
The specific interface names are S_LSARG and D_LSARG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSARG solves a system of linear algebraic equations having a real general coefficient matrix. It first
uses routine LFCRG to compute an LU factorization of the coefficient matrix and to estimate the condition
number of the matrix. The solution of the linear system is then found using the iterative refinement routine
LFIRG. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon
which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
LSARG fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the iterative
refinement algorithm fails to converge. These errors occur only if A is singular or very close to a singular
matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSARG solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2ARG/DL2ARG. The reference is:
CALL L2ARG (N, A, LDA, B, IPATH, X, FACT, IPVT, WK)
The additional arguments are as follows:
FACT — Work vector of length N² containing the LU factorization of A on output.
IPVT — Integer work vector of length N containing the pivoting information for the LU factorization of A on output.
WK — Work vector of length N.
2.      Informational errors

        Type   Code   Description
        3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
        4      2      The input matrix is singular.
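The following minimal sketch (not one of the product examples) shows a call to the workspace version with the data of the example below; the arrays FACT, IPVT and WK are sized as described above, and L2ARG is invoked through its FORTRAN 77 interface.
!     Sketch only: LSARG's workspace version with caller-supplied arrays.
      USE WRRRN_INT
      IMPLICIT NONE
      INTEGER    LDA, N, IPATH
      PARAMETER  (LDA=3, N=3, IPATH=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), B(N), X(N), FACT(N*N), WK(N)
      A(1,:) = (/ 33.0,  16.0,  72.0/)
      A(2,:) = (/-24.0, -10.0, -57.0/)
      A(3,:) = (/ 18.0, -11.0,   7.0/)
      B = (/129.0, -96.0, 8.5/)
!     Factor A, iteratively refine, and solve; the LU factors stay in FACT.
      CALL L2ARG (N, A, LDA, B, IPATH, X, FACT, IPVT, WK)
      CALL WRRRN ('X', X, 1, N, 1)
      END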
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the coefficients of the linear system. (Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains the
right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A system of three linear equations is solved. The coefficient matrix has real general form and the right-hand-side vector b has three elements.
USE LSARG_INT
USE WRRRN_INT
IMPLICIT NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,N), B(N), X(N)
!                                 Set values for A and B
      A(1,:) = (/ 33.0,  16.0,  72.0/)
      A(2,:) = (/-24.0, -10.0, -57.0/)
      A(3,:) = (/ 18.0, -11.0,   7.0/)
!
      B = (/129.0, -96.0, 8.5/)
!                                 Solve the system of equations
      CALL LSARG (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
END
Output
          X
    1       2       3
1.000   1.500   1.000
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The coefficient
matrix has real general form and the right-hand-side vector b has three elements. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap arrays to and from the
processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the
descriptors for the local arrays.
USE MPI_SETUP_INT
USE LSARG_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER            N, DESCA(9), DESCX(9)
      INTEGER            INFO, MXLDA, MXCOL
      REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(N,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/ 33.0,  16.0,  72.0/)
         A(2,:) = (/-24.0, -10.0, -57.0/)
         A(3,:) = (/ 18.0, -11.0,   7.0/)
!
         B = (/129.0, -96.0, 8.5/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context id, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 AND MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                 Solve the system of equations
      CALL LSARG (A0, B0, X0)
!                                 Unmap the results from the distributed
!                                 arrays back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                 Print results.
!                                 Only Rank=0 has the solution, X.
      IF (MP_RANK .EQ. 0) CALL WRRRN ('X', X, 1, N, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
END
Output
          X
    1       2       3
1.000   1.500   1.000
LSLRG
Solves a real general system of linear equations without iterative refinement.
Required Arguments
A — N by N matrix containing the coefficients of the linear system. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved. IPATH = 2 means the system AᵀX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLRG (A, B, X [, …])
Specific:
The specific interface names are S_LSLRG and D_LSLRG.
FORTRAN 77 Interface
Single:
CALL LSLRG (N, A, LDA, B, IPATH, X)
Double:
The double precision name is DLSLRG.
ScaLAPACK Interface
Generic:
CALL LSLRG (A0, B0, X0 [, …])
Specific:
The specific interface names are S_LSLRG and D_LSLRG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSLRG solves a system of linear algebraic equations having a real general coefficient matrix. The
underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual. LSLRG first uses the routine LFCRG to
compute an LU factorization of the coefficient matrix based on Gauss elimination with partial pivoting.
Experiments were analyzed to determine efficient implementations on several different computers. For some
supercomputers, particularly those with efficient vendor-supplied BLAS, versions that call Level 1, 2 and 3
BLAS are used. The remaining computers use a factorization method provided to us by Dr. Leonard J. Harding of the University of Michigan. Harding’s work involves “loop unrolling and jamming” techniques that
achieve excellent performance on many computers. Using an option, LSLRG will estimate the condition number of the matrix. The solution of the linear system is then found using LFSRG.
The routine LSLRG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
occurs only if A is close to a singular matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that small changes in A can cause large changes in the solution x. If the coefficient
matrix is ill-conditioned or poorly scaled, it is recommended that either LIN_SOL_SVD or LSARG be used.
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2LRG/DL2LRG. The reference is:
CALL L2LRG (N, A, LDA, B, IPATH, X, FACT, IPVT, WK)
The additional arguments are as follows:
FACT — N × N work array containing the LU factorization of A on output. If A is not needed, A and
FACT can share the same storage locations. See Item 3 below to avoid memory bank conflicts.
IPVT — Integer work vector of length N containing the pivoting information for the LU factorization of A on output.
WK — Work vector of length N.
2.      Informational errors

        Type   Code   Description
        3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
        4      2      The input matrix is singular.

3.      Integer Options with Chapter 11, Options Manager

16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2LRG the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2);
respectively, in LSLRG. Additional memory allocation for FACT and option value restoration are
done automatically in LSLRG. Users directly calling L2LRG can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSLRG or L2LRG.
Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSLRG temporarily replaces IVAL(2) by IVAL(1). The routine L2CRG computes the condition
number if IVAL(2) = 2. Otherwise L2CRG skips this computation. LSLRG restores the option.
Default values for the option are
IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the coefficients of the linear system. (Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example 1
A system of three linear equations is solved. The coefficient matrix has real general form and the right-hand-side vector b has three elements.
USE LSLRG_INT
USE WRRRN_INT
IMPLICIT NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,N), B(N), X(N)
!                                 Set values for A and B
      A(1,:) = (/ 33.0,  16.0,  72.0/)
      A(2,:) = (/-24.0, -10.0, -57.0/)
      A(3,:) = (/ 18.0, -11.0,   7.0/)
!
      B = (/129.0, -96.0, 8.5/)
!                                 Solve the system of equations
      CALL LSLRG (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
END
Output
          X
    1       2       3
1.000   1.500   1.000
Example 2
A system of N = 16 linear equations is solved using the routine L2LRG. The option manager is used to eliminate memory bank conflict inefficiencies that may occur when the matrix dimension is a multiple of 16. The
leading dimension of FACT=A is increased from N to N+IVAL(3)=17, since N=16=IVAL(4). The data used
for the test is a nonsymmetric Hadamard matrix and a right-hand side generated by a known solution, xj = j,
j = 1, …, N.
USE L2LRG_INT
USE IUMAG_INT
USE WRRRN_INT
USE SGEMV_INT
IMPLICIT NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=17, N=16)
!                                 SPECIFICATIONS FOR PARAMETERS
      INTEGER    ICHP, IPATH, IPUT, KBANK
      REAL       ONE, ZERO
      PARAMETER  (ICHP=1, IPATH=1, IPUT=2, KBANK=16, ONE=1.0E0, &
                 ZERO=0.0E0)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    I, IPVT(N), J, K, NN
      REAL       A(LDA,N), B(N), WK(N), X(N)
!                                 SPECIFICATIONS FOR SAVE VARIABLES
      INTEGER    IOPT(1), IVAL(4)
      SAVE       IVAL
!                                 Data for option values.
      DATA IVAL/1, 16, 1, 16/
!                                 Set values for A and B:
      A(1,1) = ONE
      NN     = 1
!                                 Generate Hadamard matrix.
      DO 20 K=1, 4
         DO 10 J=1, NN
            DO 10 I=1, NN
               A(NN+I,J)    = -A(I,J)
               A(I,NN+J)    = A(I,J)
               A(NN+I,NN+J) = A(I,J)
   10    CONTINUE
         NN = NN + NN
   20 CONTINUE
!                                 Generate right-hand-side.
      DO 30 J=1, N
         X(J) = J
   30 CONTINUE
!                                 Set B = A*X.
      CALL SGEMV ('N', N, N, ONE, A, LDA, X, 1, ZERO, B, 1)
!                                 Clear solution array.
      X = ZERO
!                                 Set option to avoid memory
!                                 bank conflicts.
      IOPT(1) = KBANK
      CALL IUMAG ('MATH', ICHP, IPUT, 1, IOPT, IVAL)
!                                 Solve A*X = B.
      CALL L2LRG (N, A, LDA, B, IPATH, X, A, IPVT, WK)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
END
Output
                                     X
    1       2       3       4       5       6       7       8       9      10
 1.00    2.00    3.00    4.00    5.00    6.00    7.00    8.00    9.00   10.00

   11      12      13      14      15      16
11.00   12.00   13.00   14.00   15.00   16.00
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The coefficient
matrix has real general form and the right-hand-side vector b has three elements. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays to
and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LSLRG_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER            N, DESCA(9), DESCX(9)
      INTEGER            INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(N,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/ 33.0,  16.0,  72.0/)
         A(2,:) = (/-24.0, -10.0, -57.0/)
         A(3,:) = (/ 18.0, -11.0,   7.0/)
!
         B = (/129.0, -96.0, 8.5/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context id, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                 Solve the system of equations
      CALL LSLRG (A0, B0, X0)
!                                 Unmap the results from the distributed
!                                 arrays back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                 Print results
!                                 Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0)CALL WRRRN ('X', X, 1, N, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
          X
    1       2       3
1.000   1.500   1.000
LFCRG
Computes the LU factorization of a real general matrix and estimates its L1 condition number.
Required Arguments
A — N by N matrix to be factored. (Input)
FACT — N by N matrix containing the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT — Vector of length N containing the pivoting information for the LU factorization. (Output)
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCRG (A, FACT, IPVT, RCOND [, …])
Specific:
The specific interface names are S_LFCRG and D_LFCRG.
FORTRAN 77 Interface
Single:
CALL LFCRG (N, A, LDA, FACT, LDFACT, IPVT, RCOND)
Double:
The double precision name is DLFCRG.
ScaLAPACK Interface
Generic:
CALL LFCRG (A0, FACT0, IPVT0, RCOND [, …])
Specific:
The specific interface names are S_LFCRG and D_LFCRG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFCRG performs an LU factorization of a real general coefficient matrix. It also estimates the condition number of the matrix. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code
depending upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. The LU factorization is done using scaled partial pivoting. Scaled partial pivoting differs from partial pivoting in that the
pivoting strategy is the same as if each row were scaled to have the same ∞-norm. Otherwise, partial pivoting is used.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥₁∥A⁻¹∥₁. Since it is expensive to compute
∥A⁻¹∥₁, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK
and is described in a paper by Cline et al. (1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCRG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This can occur
only if A either is singular or is very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFIRG, LFSRG and LFDRG. To solve
systems of equations with multiple right-hand-side vectors, use LFCRG followed by either LFIRG or LFSRG
called once for each right-hand side. The routine LFDRG can be called to compute the determinant of the coefficient matrix after LFCRG has performed the factorization.
Let F be the matrix FACT and let p be the vector IPVT. The triangular matrix U is stored in the upper triangle
of F. The strict lower triangle of F contains the information needed to reconstruct L⁻¹ using

L⁻¹ = L_{N-1}P_{N-1} … L_1P_1

where P_k is the identity matrix with rows k and p_k interchanged and L_k is the identity with F_{ik} for
i = k + 1, …, N inserted below the diagonal. The strict lower half of F can also be thought of as containing the
negative of the multipliers. LFCRG is based on the LINPACK routine SGECO; see Dongarra et al. (1979).
SGECO uses unscaled partial pivoting.
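The factor-once, solve-many pattern described above can be written directly from the generic interfaces on this page. The sketch below is an informal illustration rather than one of the documented examples; the matrix and right-hand sides are arbitrary sample data.

      USE LFCRG_INT
      USE LFSRG_INT
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 3, NRHS = 2
      INTEGER    IPVT(N), J
      REAL       A(N,N), FACT(N,N), RCOND, B(N,NRHS), X(N,NRHS)
!                                 Arbitrary sample data
      A = RESHAPE((/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/), (/N, N/))
      B(:,1) = (/ 1.0,  4.0, -1.0/)
      B(:,2) = (/10.0, 14.0,  9.0/)
!                                 Factor once and estimate the condition number
      CALL LFCRG (A, FACT, IPVT, RCOND)
!                                 Solve once per right-hand side, reusing FACT
      DO J=1, NRHS
         CALL LFSRG (FACT, IPVT, B(:,J), X(:,J))
      END DO
      PRINT *, 'RCOND estimate: ', RCOND
      END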
Comments
1.    Workspace may be explicitly provided, if desired, by use of L2CRG/DL2CRG. The reference is:

      CALL L2CRG (N, A, LDA, FACT, LDFACT, IPVT, RCOND, WK)

      The additional argument is:

      WK — Work vector of length N.

2.    Informational errors

      Type   Code   Description
      3      1      The input matrix is algorithmically singular.
      4      2      The input matrix is singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the matrix to be factored. (Input)
FACT0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix FACT.
FACT contains the LU factorization of the matrix A. (Output)
IPVT0 — Local vector of length MXLDA containing the local portions of the distributed vector IPVT. IPVT
contains the pivoting information for the LU factorization. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example 1
The inverse of a 3 × 3 matrix is computed. LFCRG is called to factor the matrix and to check for singularity or
ill-conditioning. LFIRG is called to determine the columns of the inverse.
      USE LFCRG_INT
      USE UMACH_INT
      USE LFIRG_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), J, NOUT
      REAL       A(LDA,N), AINV(LDA,N), FACT(LDFACT,N), RCOND, &
                 RES(N), RJ(N)
!                                 Set values for A
      A(1,:) = (/ 1.0, 3.0, 3.0/)
      A(2,:) = (/ 1.0, 3.0, 4.0/)
      A(3,:) = (/ 1.0, 4.0, 3.0/)
!
      CALL LFCRG (A, FACT, IPVT, RCOND)
!                                 Print the reciprocal condition number
!                                 and the L1 condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIRG
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
         CALL LFIRG (A, FACT, IPVT, RJ, AINV(:,J), RES)
         RJ(J) = 0.0
   10 CONTINUE
!                                 Print results
      CALL WRRRN ('AINV', AINV)
!
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output

RCOND < .02
L1 Condition number < 100.0

             AINV
        1       2       3
1   7.000  -3.000  -3.000
2  -1.000   0.000   1.000
3  -1.000   1.000   0.000
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. LFCRG is called to factor the
matrix and to check for singularity or ill-conditioning. LFIRG is called to determine the columns of the
inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used
to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFCRG_INT
      USE UMACH_INT
      USE LFIRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                  INFO, MXCOL, MXLDA, NOUT
      INTEGER, ALLOCATABLE ::  IPVT0(:)
      REAL, ALLOCATABLE ::     A(:,:), AINV(:,:), X0(:), RJ(:)
      REAL, ALLOCATABLE ::     A0(:,:), FACT0(:,:), RES0(:), RJ0(:)
      REAL                     RCOND
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/ 1.0, 3.0, 3.0/)
         A(2,:) = (/ 1.0, 3.0, 4.0/)
         A(3,:) = (/ 1.0, 4.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context id, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA), RES0(MXLDA), IPVT0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Call the factorization routine
      CALL LFCRG (A0, FACT0, IPVT0, RCOND)
!                                 Print the reciprocal condition number
!                                 and the L1 condition number
      IF (MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
      ENDIF
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0
         CALL SCALAPACK_MAP(RJ, DESCL, RJ0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIRG
!                                 reference computes the J-th column of
!                                 the inverse of A
         CALL LFIRG (A0, FACT0, IPVT0, RJ0, X0, RES0)
         RJ(J) = 0.0
         CALL SCALAPACK_UNMAP(X0, DESCL, AINV(:,J))
   10 CONTINUE
!                                 Print results
!                                 Only Rank=0 has the solution, AINV.
      IF (MP_RANK .EQ. 0) CALL WRRRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, IPVT0, FACT0, RES0, RJ, RJ0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output

RCOND < .02
L1 Condition number < 100.0

             AINV
        1       2       3
1   7.000  -3.000  -3.000
2  -1.000   0.000   1.000
3  -1.000   1.000   0.000
LFTRG
Computes the LU factorization of a real general matrix.
Required Arguments
A — N by N matrix to be factored. (Input)
FACT — N by N matrix containing the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT — Vector of length N containing the pivoting information for the LU factorization. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTRG (A, FACT, IPVT [, …])
Specific:
The specific interface names are S_LFTRG and D_LFTRG.
FORTRAN 77 Interface
Single:
CALL LFTRG (N, A, LDA, FACT, LDFACT, IPVT)
Double:
The double precision name is DLFTRG.
ScaLAPACK Interface
Generic:
CALL LFTRG (A0, FACT0, IPVT0 [, …])
Specific:
The specific interface names are S_LFTRG and D_LFTRG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFTRG performs an LU factorization of a real general coefficient matrix. The underlying code is
based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are
used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in
the Introduction section of this manual. The LU factorization is done using scaled partial pivoting. Scaled
partial pivoting differs from partial pivoting in that the pivoting strategy is the same as if each row were
scaled to have the same ∞-norm. Otherwise, partial pivoting is used.
The routine LFTRG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
can occur only if A is singular or very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFIRG, LFSRG and LFDRG. To solve
systems of equations with multiple right-hand-side vectors, use LFTRG followed by either LFIRG or LFSRG
called once for each right-hand side. The routine LFDRG can be called to compute the determinant of the coefficient matrix after LFTRG has performed the factorization. Let F be the matrix FACT and let p be the vector
IPVT. The triangular matrix U is stored in the upper triangle of F. The strict lower triangle of F contains the
information needed to reconstruct L⁻¹ using

L⁻¹ = L_{N-1}P_{N-1} … L_1P_1

where P_k is the identity matrix with rows k and p_k interchanged and L_k is the identity with F_{ik} for
i = k + 1, …, N inserted below the diagonal. The strict lower half of F can also be thought of as containing the negative
of the multipliers.
Routine LFTRG is based on the LINPACK routine SGEFA. See Dongarra et al. (1979). The routine SGEFA uses
partial pivoting.
Comments
1.    Workspace may be explicitly provided, if desired, by use of L2TRG/DL2TRG. The reference is:

      CALL L2TRG (N, A, LDA, FACT, LDFACT, IPVT, WK)

      The additional argument is:

      WK — Work vector of length N used for scaling.

2.    Informational error

      Type   Code   Description
      4      2      The input matrix is singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the matrix to be factored. (Input)
FACT0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix FACT.
FACT contains the LU factorization of the matrix A. (Output)
IPVT0 — Local vector of length MXLDA containing the local portions of the distributed vector IPVT. IPVT
contains the pivoting information for the LU factorization. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example 1
A linear system with multiple right-hand sides is solved. Routine LFTRG is called to factor the coefficient
matrix. The routine LFSRG is called to compute the two solutions for the two right-hand sides. In this case,
the coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be better to
call LFCRG to perform the factorization, and LFIRG to compute the solutions.
      USE LFTRG_INT
      USE LFSRG_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), J
      REAL       A(LDA,LDA), B(N,2), FACT(LDFACT,LDFACT), X(N,2)
!
!                                 Set values for A and B
!
!                                 A = (  1.0   3.0   3.0)
!                                     (  1.0   3.0   4.0)
!                                     (  1.0   4.0   3.0)
!
!                                 B = (  1.0  10.0)
!                                     (  4.0  14.0)
!                                     ( -1.0   9.0)
!
      DATA A/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/
      DATA B/1.0, 4.0, -1.0, 10.0, 14.0, 9.0/
!
      CALL LFTRG (A, FACT, IPVT)
!                                 Solve for the two right-hand sides
      DO 10 J=1, 2
         CALL LFSRG (FACT, IPVT, B(:,J), X(:,J))
   10 CONTINUE
!                                 Print results
      CALL WRRRN ('X', X)
      END
Output

           X
        1       2
1  -2.000   1.000
2  -2.000  -1.000
3   3.000   4.000
ScaLAPACK Example
A linear system with multiple right-hand sides is solved. Routine LFTRG is called to factor the coefficient
matrix. The routine LFSRG is called to compute the two solutions for the two right-hand sides. In this case,
the coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be better to
call LFCRG to perform the factorization, and LFIRG to compute the solutions. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays to
and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFTRG_INT
      USE LFSRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                  INFO, MXCOL, MXLDA
      INTEGER, ALLOCATABLE ::  IPVT0(:)
      REAL, ALLOCATABLE ::     A(:,:), B(:,:), X(:,:), X0(:)
      REAL, ALLOCATABLE ::     A0(:,:), FACT0(:,:), B0(:)
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N,2), X(N,2))
!                                 Set values for A and B
         A(1,:) = (/ 1.0, 3.0, 3.0/)
         A(2,:) = (/ 1.0, 3.0, 4.0/)
         A(3,:) = (/ 1.0, 4.0, 3.0/)
!
         B(1,:) = (/ 1.0, 10.0/)
         B(2,:) = (/ 4.0, 14.0/)
         B(3,:) = (/-1.0,  9.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context id, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), B0(MXLDA), &
               IPVT0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Call the factorization routine
      CALL LFTRG (A0, FACT0, IPVT0)
!                                 Set up the columns of the B
!                                 matrix one at a time in X0
      DO 10 J=1, 2
         CALL SCALAPACK_MAP(B(:,J), DESCL, B0)
!                                 Solve for the J-th column of X
         CALL LFSRG (FACT0, IPVT0, B0, X0)
         CALL SCALAPACK_UNMAP(X0, DESCL, X(:,J))
   10 CONTINUE
!                                 Print results.
!                                 Only Rank=0 has the solution, X.
      IF (MP_RANK .EQ. 0) CALL WRRRN ('X', X)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, IPVT0, FACT0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output

           X
        1       2
1  -2.000   1.000
2  -2.000  -1.000
3   3.000   4.000
LFSRG
Solves a real general system of linear equations given the LU factorization of the coefficient matrix.
Required Arguments
FACT — N by N matrix containing the LU factorization of the coefficient matrix A as output from routine
LFCRG or LFTRG. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
subroutine LFCRG or LFTRG. (Input).
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT, 2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT, 1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AᵀX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFSRG (FACT, IPVT, B, X [, …])
Specific:
The specific interface names are S_LFSRG and D_LFSRG.
FORTRAN 77 Interface
Single:
CALL LFSRG (N, FACT, LDFACT, IPVT, B, IPATH, X)
Double:
The double precision name is DLFSRG.
ScaLAPACK Interface
Generic:
CALL LFSRG (FACT0, IPVT0, B0, X0 [, …])
Specific:
The specific interface names are S_LFSRG and D_LFSRG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFSRG computes the solution of a system of linear algebraic equations having a real general coefficient matrix. To compute the solution, the coefficient matrix must first undergo an LU factorization. This may
be done by calling either LFCRG or LFTRG. The solution to Ax = b is found by solving the triangular systems
Ly = b and Ux = y. The forward elimination step consists of solving the system Ly = b by applying the same
permutations and elimination operations to b that were applied to the columns of A in the factorization routine. The backward substitution step consists of solving the triangular system Ux = y for x.
LFSRG and LFIRG both solve a linear system given its LU factorization. LFIRG generally takes more time
and produces a more accurate answer than LFSRG. Each iteration of the iterative refinement algorithm used
by LFIRG calls LFSRG. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code
depending upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
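As an informal illustration (it is not one of the documented examples), the sketch below uses the FORTRAN 77 interfaces listed above to factor a small sample matrix with LFTRG and then solve both AX = B and AᵀX = B from the same factorization by changing IPATH; the data values are arbitrary.

      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N)
      REAL       A(LDA,LDA), FACT(LDFACT,LDFACT), B(N), X(N), XT(N)
      DATA A/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/
      DATA B/1.0, 4.0, -1.0/
!                                 Factor A once
      CALL LFTRG (N, A, LDA, FACT, LDFACT, IPVT)
!                                 Solve A*X = B        (IPATH = 1)
      CALL LFSRG (N, FACT, LDFACT, IPVT, B, 1, X)
!                                 Solve trans(A)*X = B (IPATH = 2)
      CALL LFSRG (N, FACT, LDFACT, IPVT, B, 2, XT)
      WRITE (*,*) X
      WRITE (*,*) XT
      END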
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
FACT0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix FACT as
output from routine LFCRG. FACT contains the LU factorization of the matrix A. (Input)
IPVT0 — Local vector of length MXLDA containing the local portions of the distributed vector IPVT. IPVT
contains the pivoting information for the LU factorization as output from subroutine LFCRG or
LFTRG/DLFTRG. (Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse is computed for a real general 3 × 3 matrix. The input matrix is assumed to be well-conditioned,
hence, LFTRG is used rather than LFCRG.
      USE LFSRG_INT
      USE LFTRG_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    I, IPVT(N), J
      REAL       A(LDA,LDA), AINV(LDA,LDA), FACT(LDFACT,LDFACT), RJ(N)
!
!                                 Set values for A
      A(1,:) = (/ 1.0, 3.0, 3.0/)
      A(2,:) = (/ 1.0, 3.0, 4.0/)
      A(3,:) = (/ 1.0, 4.0, 3.0/)
!
      CALL LFTRG (A, FACT, IPVT)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSRG
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
         CALL LFSRG (FACT, IPVT, RJ, AINV(:,J))
         RJ(J) = 0.0
   10 CONTINUE
!                                 Print results
      CALL WRRRN ('AINV', AINV)
      END
Output

             AINV
        1       2       3
1   7.000  -3.000  -3.000
2  -1.000   0.000   1.000
3  -1.000   1.000   0.000
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. The input matrix is assumed to
be well-conditioned, hence, LFTRG is used rather than LFCRG. LFSRG is called to determine the columns of
the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”)
used to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFTRG_INT
      USE UMACH_INT
      USE LFSRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                  INFO, MXCOL, MXLDA
      INTEGER, ALLOCATABLE ::  IPVT0(:)
      REAL, ALLOCATABLE ::     A(:,:), AINV(:,:), X0(:), RJ(:)
      REAL, ALLOCATABLE ::     A0(:,:), FACT0(:,:), RJ0(:)
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/ 1.0, 3.0, 3.0/)
         A(2,:) = (/ 1.0, 3.0, 4.0/)
         A(3,:) = (/ 1.0, 4.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context id, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA), IPVT0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Call the factorization routine
      CALL LFTRG (A0, FACT0, IPVT0)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0
         CALL SCALAPACK_MAP(RJ, DESCL, RJ0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSRG
!                                 reference computes the J-th column of
!                                 the inverse of A
         CALL LFSRG (FACT0, IPVT0, RJ0, X0)
         RJ(J) = 0.0
         CALL SCALAPACK_UNMAP(X0, DESCL, AINV(:,J))
   10 CONTINUE
!                                 Print results
!                                 Only Rank=0 has the solution, AINV.
      IF (MP_RANK .EQ. 0) CALL WRRRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, IPVT0, FACT0, RJ, RJ0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output

             AINV
        1       2       3
1   7.000  -3.000  -3.000
2  -1.000   0.000   1.000
3  -1.000   1.000   0.000
LFIRG
Uses iterative refinement to improve the solution of a real general system of linear equations.
Required Arguments
A — N by N matrix containing the coefficient matrix of the linear system. (Input)
FACT — N by N matrix containing the LU factorization of the coefficient matrix A as output from routine
LFCRG/DLFCRG or LFTRG/DLFTRG. (Input).
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
routine LFCRG/DLFCRG or LFTRG/DLFTRG. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input).
X — Vector of length N containing the solution to the linear system. (Output)
RES — Vector of length N containing the final correction at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AᵀX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFIRG (A, FACT, IPVT, B, X, RES [, …])
Specific:
The specific interface names are S_LFIRG and D_LFIRG.
FORTRAN 77 Interface
Single:
CALL LFIRG (N, A, LDA, FACT, LDFACT, IPVT, B, IPATH, X, RES)
Double:
The double precision name is DLFIRG.
ScaLAPACK Interface
Generic:
CALL LFIRG (A0, FACT0, IPVT0, B0, X0, RES0 [, …])
Specific:
The specific interface names are S_LFIRG and D_LFIRG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFIRG computes the solution of a system of linear algebraic equations having a real general coefficient matrix. Iterative refinement is performed on the solution vector to improve the accuracy. Usually
almost all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see “Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK” in the Introduction section of this manual.
To compute the solution, the coefficient matrix must first undergo an LU factorization. This may be done by
calling either LFCRG or LFTRG.
Iterative refinement fails only if the matrix is very ill-conditioned.
Routines LFIRG and LFSRG both solve a linear system given its LU factorization. LFIRG generally takes
more time and produces a more accurate answer than LFSRG. Each iteration of the iterative refinement algorithm used by LFIRG calls LFSRG.
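Because LFIRG returns the final correction in RES, its size gives an informal indication of how much iterative refinement adjusted the computed solution. The sketch below (not one of the documented examples) simply prints the largest component of RES after one call; treating that value as an accuracy indicator is a heuristic, not a documented criterion, and the data values are arbitrary.

      USE LFCRG_INT
      USE LFIRG_INT
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 3
      INTEGER    IPVT(N)
      REAL       A(N,N), FACT(N,N), RCOND, B(N), X(N), RES(N)
!                                 Arbitrary sample data
      A = RESHAPE((/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/), (/N, N/))
      B = (/-0.5, -1.0, 1.5/)
!                                 Factor, then solve with iterative refinement
      CALL LFCRG (A, FACT, IPVT, RCOND)
      CALL LFIRG (A, FACT, IPVT, B, X, RES)
!                                 Inspect the final correction returned in RES
      PRINT *, 'Solution:           ', X
      PRINT *, 'Largest correction: ', MAXVAL(ABS(RES))
      END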
Comments
Informational error

Type   Code   Description
3      2      The input matrix is too ill-conditioned for iterative refinement to be effective.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the coefficient matrix of the linear system. (Input)
FACT0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix FACT as
output from routine LFCRG or LFTRG. FACT contains the LU factorization of the matrix A. (Input)
IPVT0 — Local vector of length MXLDA containing the local portions of the distributed vector IPVT. IPVT
contains the pivoting information for the LU factorization as output from subroutine LFCRG or LFTRG.
(Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
RES0 — Local vector of length MXLDA containing the local portions of the distributed vector RES. RES
contains the final correction at the improved solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving the system each of the first two times by adding 0.5 to the second element.
      USE LFIRG_INT
      USE LFCRG_INT
      USE UMACH_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), B(N), FACT(LDFACT,LDFACT), RCOND, RES(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  1.0   3.0   3.0)
!                                     (  1.0   3.0   4.0)
!                                     (  1.0   4.0   3.0)
!
!                                 B = ( -0.5  -1.0   1.5)
!
      DATA A/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/
      DATA B/-0.5, -1.0, 1.5/
!
      CALL LFCRG (A, FACT, IPVT, RCOND)
!                                 Print the reciprocal condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Solve the three systems
      DO 10 J=1, 3
         CALL LFIRG (A, FACT, IPVT, B, X, RES)
!                                 Print results
         CALL WRRRN ('X', X, 1, N, 1)
!                                 Perturb B by adding 0.5 to B(2)
         B(2) = B(2) + 0.5
   10 CONTINUE
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output

RCOND < 0.02
L1 Condition number < 100.0

               X
     1        2        3
-5.000    2.000   -0.500

               X
     1        2        3
-6.500    2.000    0.000

               X
     1        2        3
-8.000    2.000    0.500
ScaLAPACK Example
The same set of linear systems is solved successively as a distributed example. The right-hand side vector is
perturbed after solving the system each of the first two times by adding 0.5 to the second element.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map
and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFIRG_INT
      USE UMACH_INT
      USE LFCRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                  INFO, MXCOL, MXLDA, NOUT
      INTEGER, ALLOCATABLE ::  IPVT0(:)
      REAL, ALLOCATABLE ::     A(:,:), B(:), X(:), X0(:), AINV(:,:)
      REAL, ALLOCATABLE ::     A0(:,:), FACT0(:,:), RES0(:), B0(:)
      REAL                     RCOND
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/ 1.0, 3.0, 3.0/)
         A(2,:) = (/ 1.0, 3.0, 4.0/)
         A(3,:) = (/ 1.0, 4.0, 3.0/)
!
         B(:) = (/-0.5, -1.0, 1.5/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context id, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), &
               B0(MXLDA), RES0(MXLDA), IPVT0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Call the factorization routine
      CALL LFCRG (A0, FACT0, IPVT0, RCOND)
!                                 Print the reciprocal condition number
!                                 and the L1 condition number
      IF (MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
      ENDIF
!                                 Solve the three systems
!                                 one at a time in X
      DO 10 J=1, 3
         CALL SCALAPACK_MAP(B, DESCL, B0)
         CALL LFIRG (A0, FACT0, IPVT0, B0, X0, RES0)
         CALL SCALAPACK_UNMAP(X0, DESCL, X)
!                                 Print results
!                                 Only Rank=0 has the solution, X.
         IF (MP_RANK .EQ. 0) CALL WRRRN ('X', X, 1, N, 1)
         IF (MP_RANK .EQ. 0) B(2) = B(2) + 0.5
   10 CONTINUE
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV, B)
      DEALLOCATE(A0, B0, IPVT0, FACT0, RES0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output

RCOND < 0.02
L1 Condition number < 100.0

               X
     1        2        3
-5.000    2.000   -0.500

               X
     1        2        3
-6.500    2.000    0.000

               X
     1        2        3
-8.000    2.000    0.500
LFDRG
Computes the determinant of a real general matrix given the LU factorization of the matrix.
Required Arguments
FACT — N by N matrix containing the LU factorization of the matrix A as output from routine
LFTRG/DLFTRG or LFCRG/DLFCRG. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization as output from
routine LFTRG/DLFTRG or LFCRG/DLFCRG. (Input).
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10^DET2.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDRG (FACT, IPVT, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDRG and D_LFDRG.
FORTRAN 77 Interface
Single:
CALL LFDRG (N, FACT, LDFACT, IPVT, DET1, DET2)
Double:
The double precision name is DLFDRG.
Description
Routine LFDRG computes the determinant of a real general coefficient matrix. To compute the determinant,
the coefficient matrix must first undergo an LU factorization. This may be done by calling either LFCRG or
LFTRG. The formula det A = det L det U is used to compute the determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,

det U = ∏_{i=1}^{N} U_{ii}

(The matrix U is stored in the upper triangle of FACT.) Since L is the product of triangular matrices with unit
diagonals and of permutation matrices, det L = (−1)^k, where k is the number of pivoting interchanges.
Routine LFDRG is based on the LINPACK routine SGEDI; see Dongarra et al. (1979).
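The sketch below (separate from the documented example that follows) shows the LFTRG/LFDRG call sequence and one way to combine DET1 and DET2. Forming DET1 * 10.0**DET2 directly is safe only when the exponent is small enough not to overflow, which is why the routine returns the determinant in two parts; the assembly step here is an illustrative choice, not a requirement.

      USE LFTRG_INT
      USE LFDRG_INT
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 3
      INTEGER    IPVT(N)
      REAL       A(N,N), FACT(N,N), DET1, DET2
!                                 Arbitrary sample data
      A = RESHAPE((/33.0, -24.0, 18.0, 16.0, -10.0, -11.0, &
                    72.0, -57.0, 7.0/), (/N, N/))
!                                 Factor, then form the determinant
      CALL LFTRG (A, FACT, IPVT)
      CALL LFDRG (FACT, IPVT, DET1, DET2)
!                                 det(A) = DET1 * 10**DET2; assembling it is
!                                 safe here because the exponent is small
      PRINT *, 'det(A) = ', DET1 * 10.0**DET2
      END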
Example
The determinant is computed for a real general 3 × 3 matrix.
      USE LFDRG_INT
      USE LFTRG_INT
      USE UMACH_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), DET1, DET2, FACT(LDFACT,LDFACT)
!
!                                 Set values for A
!
!                                 A = ( 33.0  16.0  72.0)
!                                     (-24.0 -10.0 -57.0)
!                                     ( 18.0 -11.0   7.0)
!
      DATA A/33.0, -24.0, 18.0, 16.0, -10.0, -11.0, 72.0, -57.0, 7.0/
!
      CALL LFTRG (A, FACT, IPVT)
!                                 Compute the determinant
      CALL LFDRG (FACT, IPVT, DET1, DET2)
!                                 Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is ', F6.3, ' * 10**', F2.0)
      END
Output
The determinant of A is -4.761 * 10**3.
LINRG
Computes the inverse of a real general matrix.
Required Arguments
A — N by N matrix containing the matrix to be inverted. (Input)
AINV — N by N matrix containing the inverse of A. (Output)
If A is not needed, A and AINV can share the same storage locations.
Optional Arguments
N — Order of the matrix A. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDAINV — Leading dimension of AINV exactly as specified in the dimension statement of the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
CALL LINRG (A, AINV [, …])
Specific:
The specific interface names are S_LINRG and D_LINRG.
FORTRAN 77 Interface
Single:
CALL LINRG (N, A, LDA, AINV, LDAINV)
Double:
The double precision name is DLINRG.
ScaLAPACK Interface
Generic:
CALL LINRG (A0, AINV0 [, …])
Specific:
The specific interface names are S_LINRG and D_LINRG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LINRG computes the inverse of a real general matrix. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking.
For a detailed explanation see “Using ScaLAPACK, LAPACK, LINPACK, and EISPACK” in the Introduction section of this manual. LINRG first uses the routine LFCRG to compute an LU factorization of the coefficient
matrix and to estimate the condition number of the matrix. Routine LFCRG computes U and the information
needed to compute L⁻¹. LINRT is then used to compute U⁻¹. Finally, A⁻¹ is computed using A⁻¹ = U⁻¹L⁻¹.
The routine LINRG fails if U, the upper triangular part of the factorization, has a zero diagonal element or if
the iterative refinement algorithm fails to converge. This error occurs only if A is singular or very close to a
singular matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in A⁻¹.
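As an informal check that is not part of the manual's examples, the inverse returned by LINRG can be multiplied back against A with the intrinsic MATMUL; for a well-conditioned matrix the residual A*AINV − I should be on the order of the machine precision.

      USE LINRG_INT
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 3
      INTEGER    I
      REAL       A(N,N), AINV(N,N), RESID(N,N)
!                                 Arbitrary sample data
      A = RESHAPE((/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/), (/N, N/))
!                                 Invert, then form the residual A*AINV - I
      CALL LINRG (A, AINV)
      RESID = MATMUL(A, AINV)
      DO I=1, N
         RESID(I,I) = RESID(I,I) - 1.0
      END DO
      PRINT *, 'Largest residual: ', MAXVAL(ABS(RESID))
      END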
Comments
1.    Workspace may be explicitly provided, if desired, by use of L2NRG/DL2NRG. The reference is:

      CALL L2NRG (N, A, LDA, AINV, LDAINV, WK, IWK)

      The additional arguments are as follows:

      WK — Work vector of length N + N(N − 1)/2.

      IWK — Integer work vector of length N.

2.    Informational errors

      Type   Code   Description
      3      1      The input matrix is too ill-conditioned. The inverse might not be accurate.
      4      2      The input matrix is singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the matrix to be inverted. (Input)
AINV0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix AINV.
AINV contains the inverse of the matrix A. (Output)
If A is not needed, A and AINV can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse is computed for a real general 3 × 3 matrix.
      USE LINRG_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDAINV=3)
      INTEGER    I, J, NOUT
      REAL       A(LDA,LDA), AINV(LDAINV,LDAINV)
!
!                                 Set values for A
!
!                                 A = (  1.0   3.0   3.0)
!                                     (  1.0   3.0   4.0)
!                                     (  1.0   4.0   3.0)
!
      DATA A/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/
!
      CALL LINRG (A, AINV)
!                                 Print results
      CALL WRRRN ('AINV', AINV)
      END
Output

             AINV
        1       2       3
1   7.000  -3.000  -3.000
2  -1.000   0.000   1.000
3  -1.000   1.000   0.000
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays to
and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LINRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER              LDA, LDAINV, N, DESCA(9)
      INTEGER              INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE :: A(:,:), AINV(:,:)
      REAL, ALLOCATABLE :: A0(:,:), AINV0(:,:)
      PARAMETER (LDA=3, LDAINV=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDAINV,N))
!                                 Set values for A
         A(1,:) = (/ 1.0, 3.0, 3.0/)
         A(2,:) = (/ 1.0, 3.0, 4.0/)
         A(3,:) = (/ 1.0, 4.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), AINV0(MXLDA,MXCOL))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Get the inverse
      CALL LINRG (A0, AINV0)
!                                 Unmap the results from the distributed
!                                 arrays back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(AINV0, DESCA, AINV)
!                                 Print results
!                                 Only Rank=0 has the solution, AINV.
      IF (MP_RANK .EQ. 0) CALL WRRRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, AINV0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output

             AINV
        1       2       3
1   7.000  -3.000  -3.000
2  -1.000   0.000   1.000
3  -1.000   1.000   0.000
LSACG
Solves a complex general system of linear equations with iterative refinement.
Required Arguments
A — Complex N by N matrix containing the coefficients of the linear system. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AᴴX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSACG (A, B, X [, …])
Specific:
The specific interface names are S_LSACG and D_LSACG.
FORTRAN 77 Interface
Single:
CALL LSACG (N, A, LDA, B, IPATH, X)
Double:
The double precision name is DLSACG.
ScaLAPACK Interface
Generic:
CALL LSACG (A0, B0, X0 [, …])
Specific:
The specific interface names are S_LSACG and D_LSACG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSACG solves a system of linear algebraic equations with a complex general coefficient matrix. The
underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual. LSACG first uses the routine LFCCG to
compute an LU factorization of the coefficient matrix and to estimate the condition number of the matrix. The
solution of the linear system is then found using the iterative refinement routine LFICG.
LSACG fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the iterative
refinement algorithm fails to converge. These errors occur only if A is singular or very close to a singular
matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSACG solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
Comments
1.    Workspace may be explicitly provided, if desired, by use of L2ACG/DL2ACG. The reference is:

      CALL L2ACG (N, A, LDA, B, IPATH, X, FACT, IPVT, WK)

      The additional arguments are as follows:

      FACT — Complex work vector of length N² containing the LU factorization of A on output.

      IPVT — Integer work vector of length N containing the pivoting information for the LU factorization of A on output.

      WK — Complex work vector of length N.

2.    Informational errors

      Type   Code   Description
      3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
      4      2      The input matrix is singular.

3.    Integer Options with Chapter 11, Options Manager

      16    This option uses four values to solve memory bank conflict (access inefficiency) problems. In
            routine L2ACG the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
            IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
            respectively, in LSACG. Additional memory allocation for FACT and option value restoration are
            done automatically in LSACG. Users directly calling L2ACG can allocate additional space for
            FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
            There is no requirement that users change existing applications that use LSACG or L2ACG.
            Default values for the option are IVAL(*) = 1, 16, 0, 1.

      17    This option has two values that determine if the L1 condition number is to be computed. Routine
            LSACG temporarily replaces IVAL(2) by IVAL(1). The routine L2CCG computes the condition
            number if IVAL(2) = 2. Otherwise L2CCG skips this computation. LSACG restores the option.
            Default values for the option are IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the coefficients of the linear system. (Input)
B0 — Complex local vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Complex local vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example 1
A system of three linear equations is solved. The coefficient matrix has complex general form and the right-hand-side vector b has three elements.
      USE LSACG_INT
      USE WRCRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  3.0-2.0i   2.0+4.0i   0.0-3.0i)
!                                     (  1.0+1.0i   2.0-6.0i   1.0+2.0i)
!                                     (  4.0+0.0i  -5.0+1.0i   3.0-2.0i)
!
!                                 B = ( 10.0+5.0i   6.0-7.0i  -1.0+2.0i)
!
      DATA A/(3.0,-2.0), (1.0,1.0), (4.0,0.0), (2.0,4.0), (2.0,-6.0), &
             (-5.0,1.0), (0.0,-3.0), (1.0,2.0), (3.0,-2.0)/
      DATA B/(10.0,5.0), (6.0,-7.0), (-1.0,2.0)/
!                                 Solve AX = B
!                                 (IPATH = 1)
      CALL LSACG (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END
Output

                       X
              1                2                3
( 1.000,-1.000)  ( 2.000, 1.000)  ( 0.000, 3.000)
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The coefficient
matrix has complex general form and the right-hand-side vector b has three elements. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays to
and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSACG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  LDA, N, DESCA(9), DESCX(9)
      INTEGER                  INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::  A(:,:), B(:), X(:)
      COMPLEX, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/ (3.0, -2.0),  (2.0,  4.0),  (0.0, -3.0)/)
         A(2,:) = (/ (1.0,  1.0),  (2.0, -6.0),  (1.0,  2.0)/)
         A(3,:) = (/ (4.0,  0.0),  (-5.0, 1.0),  (3.0, -2.0)/)
!
         B = (/(10.0, 5.0), (6.0, -7.0), (-1.0, 2.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                 Solve the system of equations
      CALL LSACG (A0, B0, X0)
!                                 Unmap the results from the distributed
!                                 arrays back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                 Print results
!                                 Only Rank=0 has the solution, X.
      IF (MP_RANK .EQ. 0) CALL WRCRN ('X', X, 1, N, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output

                       X
              1                2                3
( 1.000,-1.000)  ( 2.000, 1.000)  ( 0.000, 3.000)
LSLCG
Solves a complex general system of linear equations without iterative refinement.
Required Arguments
A — Complex N by N matrix containing the coefficients of the linear system. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AᴴX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLCG (A, B, X [, …])
Specific:
The specific interface names are S_LSLCG and D_LSLCG.
FORTRAN 77 Interface
Single:
CALL LSLCG (N, A, LDA, B, IPATH, X)
Double:
The double precision name is DLSLCG.
ScaLAPACK Interface
Generic:
CALL LSLCG (A0, B0, X0 [, …])
Specific:
The specific interface names are S_LSLCG and D_LSLCG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSLCG solves a system of linear algebraic equations with a complex general coefficient matrix. The
underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual. LSLCG first uses the routine LFCCG to
compute an LU factorization of the coefficient matrix and to estimate the condition number of the matrix. The
solution of the linear system is then found using LFSCG.
LSLCG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This occurs only
if A either is a singular matrix or is very close to a singular matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that LSACG be used.
Comments
1.    Workspace may be explicitly provided, if desired, by use of L2LCG/DL2LCG. The reference is:

      CALL L2LCG (N, A, LDA, B, IPATH, X, FACT, IPVT, WK)

      The additional arguments are as follows:

      FACT — N × N work array containing the LU factorization of A on output. If A is not needed, A and
      FACT can share the same storage locations.

      IPVT — Integer work vector of length N containing the pivoting information for the LU factorization of A on output.

      WK — Complex work vector of length N.

2.    Informational errors

      Type   Code   Description
      3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
      4      2      The input matrix is singular.

3.    Integer Options with Chapter 11, Options Manager

      16    This option uses four values to solve memory bank conflict (access inefficiency) problems. In
            routine L2LCG the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
            IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
            respectively, in LSLCG. Additional memory allocation for FACT and option value restoration are
            done automatically in LSLCG. Users directly calling L2LCG can allocate additional space for
            FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
            There is no requirement that users change existing applications that use LSLCG or L2LCG.
            Default values for the option are IVAL(*) = 1, 16, 0, 1.

      17    This option has two values that determine if the L1 condition number is to be computed. Routine
            LSLCG temporarily replaces IVAL(2) by IVAL(1). The routine L2CCG computes the condition
            number if IVAL(2) = 2. Otherwise L2CCG skips this computation. LSLCG restores the option.
            Default values for the option are IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the coefficients of the linear system. (Input)
B0 — Complex local vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Complex local vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example 1
A system of three linear equations is solved. The coefficient matrix has complex general form and the right-hand-side vector b has three elements.
      USE LSLCG_INT
      USE WRCRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  3.0-2.0i   2.0+4.0i   0.0-3.0i)
!                                     (  1.0+1.0i   2.0-6.0i   1.0+2.0i)
!                                     (  4.0+0.0i  -5.0+1.0i   3.0-2.0i)
!
!                                 B = ( 10.0+5.0i   6.0-7.0i  -1.0+2.0i)
!
      DATA A/(3.0,-2.0), (1.0,1.0), (4.0,0.0), (2.0,4.0), (2.0,-6.0), &
             (-5.0,1.0), (0.0,-3.0), (1.0,2.0), (3.0,-2.0)/
      DATA B/(10.0,5.0), (6.0,-7.0), (-1.0,2.0)/
!                                 Solve AX = B
!                                 (IPATH = 1)
      CALL LSLCG (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END
Output

                       X
              1                2                3
( 1.000,-1.000)  ( 2.000, 1.000)  ( 0.000, 3.000)
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The coefficient
matrix has complex general form and the right-hand-side vector b has three elements. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays to
and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSLCG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  LDA, N, DESCA(9), DESCX(9)
      INTEGER                  INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::  A(:,:), B(:), X(:)
      COMPLEX, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/ (3.0, -2.0),  (2.0,  4.0),  (0.0, -3.0)/)
         A(2,:) = (/ (1.0,  1.0),  (2.0, -6.0),  (1.0,  2.0)/)
         A(3,:) = (/ (4.0,  0.0),  (-5.0, 1.0),  (3.0, -2.0)/)
!
         B = (/(10.0, 5.0), (6.0, -7.0), (-1.0, 2.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                 Solve the system of equations
      CALL LSLCG (A0, B0, X0)
!                                 Unmap the results from the distributed
!                                 arrays back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                 Print results.
!                                 Only Rank=0 has the solution, X.
      IF (MP_RANK .EQ. 0) CALL WRCRN ('X', X, 1, N, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output

                       X
              1                2                3
( 1.000,-1.000)  ( 2.000, 1.000)  ( 0.000, 3.000)
LFCCG
Computes the LU factorization of a complex general matrix and estimates its L1 condition number.
Required Arguments
A — Complex N by N matrix to be factored. (Input)
FACT — Complex N × N matrix containing the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT — Vector of length N containing the pivoting information for the LU factorization. (Output)
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCCG (A, FACT, IPVT, RCOND [, …])
Specific:
The specific interface names are S_LFCCG and D_LFCCG.
FORTRAN 77 Interface
Single:
CALL LFCCG (N, A, LDA, FACT, LDFACT, IPVT, RCOND)
Double:
The double precision name is DLFCCG.
ScaLAPACK Interface
Generic:
CALL LFCCG (A0, FACT0, IPVT0, RCOND [, …])
Specific:
The specific interface names are S_LFCCG and D_LFCCG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFCCG performs an LU factorization of a complex general coefficient matrix. It also estimates the
condition number of the matrix. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK
code depending upon which supporting libraries are used during linking. For a detailed explanation see
Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. The LU factorization is done using scaled partial pivoting. Scaled partial pivoting differs from partial pivoting in that
the pivoting strategy is the same as if each row were scaled to have the same ∞-norm.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥1∥A⁻¹∥1. Since it is expensive to compute
∥A⁻¹∥1, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK
and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCCG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This can occur
only if A either is singular or is very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFICG, LFSCG and LFDCG. To solve
systems of equations with multiple right-hand-side vectors, use LFCCG followed by either LFICG or LFSCG
called once for each right-hand side. The routine LFDCG can be called to compute the determinant of the coefficient matrix after LFCCG has performed the factorization.
Let F be the matrix FACT and let p be the vector IPVT. The triangular matrix U is stored in the upper triangle
of F. The strict lower triangle of F contains the information needed to reconstruct L using
L⁻¹ = LN-1PN-1 … L1P1
where Pk is the identity matrix with rows k and pk interchanged and Lk is the identity with Fik for
i = k + 1, …, N inserted below the diagonal. The strict lower half of F can also be thought of as containing the
negative of the multipliers.
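The factor-once, solve-many pattern described above can be written directly from the interfaces documented here. The following sketch is illustrative only and is not one of the routine's official examples; the matrix and right-hand sides are arbitrary placeholder values.

      USE LFCCG_INT
      USE LFSCG_INT
      USE UMACH_INT
!                              Hedged sketch: factor once with LFCCG,
!                              then reuse the factors with LFSCG for
!                              several right-hand sides.
      INTEGER, PARAMETER :: N=3, NRHS=2
      INTEGER    IPVT(N), J, NOUT
      REAL       RCOND
      COMPLEX    A(N,N), FACT(N,N), B(N,NRHS), X(N,NRHS)
!                              Arbitrary placeholder data
      A = (1.0,0.0)
      DO J=1, N
         A(J,J) = CMPLX(REAL(J+1), -1.0)
      END DO
      B = (1.0,1.0)
!                              Factor A and estimate its condition
      CALL LFCCG (A, FACT, IPVT, RCOND)
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'RCOND = ', RCOND
!                              One forward/back substitution per
!                              right-hand side
      DO J=1, NRHS
         CALL LFSCG (FACT, IPVT, B(:,J), X(:,J))
      END DO
      END

Calling LFSCG once per right-hand side avoids refactoring A, which dominates the cost for larger N.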
Comments
1. Workspace may be explicitly provided, if desired, by use of L2CCG/DL2CCG. The reference is:
       CALL L2CCG (N, A, LDA, FACT, LDFACT, IPVT, RCOND, WK)
   The additional argument is:
   WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
    3      1     The input matrix is algorithmically singular.
    4      2     The input matrix is singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the matrix to be factored. (Input)
FACT0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix
FACT. FACT contains the LU factorization of the matrix A. (Output)
IPVT0 — Local vector of length MXLDA containing the local portions of the distributed vector IPVT. IPVT
contains the pivoting information for the LU factorization. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example 1
The inverse of a 3 × 3 matrix is computed. LFCCG is called to factor the matrix and to check for singularity or
ill-conditioning. LFICG is called to determine the columns of the inverse.
      USE IMSL_LIBRARIES
!                              Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       RCOND, THIRD
      COMPLEX    A(LDA,N), AINV(LDA,N), RJ(N), FACT(LDFACT,N), RES(N)
!                              Declare functions
      COMPLEX    CMPLX
!                              Set values for A
!
!                              A = (  1.0+1.0i   2.0+3.0i   3.0+3.0i )
!                                  (  2.0+1.0i   5.0+3.0i   7.0+4.0i )
!                                  ( -2.0+1.0i  -4.0+4.0i  -5.0+3.0i )
!
      DATA A/(1.0,1.0), (2.0,1.0), (-2.0,1.0), (2.0,3.0), (5.0,3.0),&
          (-4.0,4.0), (3.0,3.0), (7.0,4.0), (-5.0,3.0)/
!
!                              Scale A by dividing by three
      THIRD = 1.0/3.0
      DO 10  I=1, N
         CALL CSSCAL (N, THIRD, A(:,I), 1)
   10 CONTINUE
!                              Factor A
      CALL LFCCG (A, FACT, IPVT, RCOND)
!                              Print the L1 condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                              Set up the columns of the identity
!                              matrix one at a time in RJ
      CALL CSET (N, (0.0,0.0), RJ, 1)
      DO 20  J=1, N
         RJ(J) = CMPLX(1.0,0.0)
!                              RJ is the J-th column of the identity
!                              matrix so the following LFICG
!                              reference places the J-th column of
!                              the inverse of A in the J-th column
!                              of AINV
         CALL LFICG (A, FACT, IPVT, RJ, AINV(:,J), RES)
         RJ(J) = CMPLX(0.0,0.0)
   20 CONTINUE
!                              Print results
      CALL WRCRN ('AINV', AINV)
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
 RCOND < .02
 L1 Condition number < 100.0

                               AINV
                  1                 2                 3
 1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
 2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
 3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. LFCCG is called to factor the
matrix and to check for singularity or ill-conditioning. LFICG is called to determine the columns of the
inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used
to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFCCG_INT
      USE UMACH_INT
      USE LFICG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                              Declare variables
      INTEGER                J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                INFO, MXCOL, MXLDA, NOUT
      INTEGER, ALLOCATABLE ::   IPVT0(:)
      COMPLEX, ALLOCATABLE ::   A(:,:), AINV(:,:), X0(:), RJ(:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), FACT0(:,:), RES0(:), RJ0(:)
      REAL                   RCOND, THIRD
      PARAMETER (LDA=3, N=3)
!                              Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                              Set values for A
         A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0, 3.0)/)
         A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0, 4.0)/)
         A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), (-5.0, 3.0)/)
!                              Scale A by dividing by three
         THIRD = 1.0/3.0
         A = A * THIRD
      ENDIF
!                              Set up a 1D processor grid and define
!                              its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                              Get the array descriptor entities MXLDA,
!                              and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                              Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                              Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA), RES0(MXLDA), IPVT0(MXLDA))
!                              Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                              Factor A
      CALL LFCCG (A0, FACT0, IPVT0, RCOND)
!                              Print the reciprocal condition number
!                              and the L1 condition number
      IF(MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
      ENDIF
!                              Set up the columns of the identity
!                              matrix one at a time in RJ
      RJ = (0.0, 0.0)
      DO 10 J=1, N
         RJ(J) = (1.0, 0.0)
         CALL SCALAPACK_MAP(RJ, DESCL, RJ0)
!                              RJ is the J-th column of the identity
!                              matrix so the following LFICG
!                              reference computes the J-th column of
!                              the inverse of A
         CALL LFICG (A0, FACT0, IPVT0, RJ0, X0, RES0)
         RJ(J) = (0.0, 0.0)
         CALL SCALAPACK_UNMAP(X0, DESCL, AINV(:,J))
   10 CONTINUE
!                              Print results
!                              Only Rank=0 has the solution, AINV.
      IF(MP_RANK.EQ.0) CALL WRCRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, FACT0, IPVT0, RJ, RJ0, RES0, X0)
!                              Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                              Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
 RCOND < .02
 L1 Condition number < 100.0

                               AINV
                  1                 2                 3
 1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
 2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
 3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
LFTCG
Computes the LU factorization of a complex general matrix.
Required Arguments
A — Complex N by N matrix to be factored. (Input)
FACT — Complex N × N matrix containing the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT — Vector of length N containing the pivoting information for the LU factorization. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTCG (A, FACT, IPVT [, …])
Specific:
The specific interface names are S_LFTCG and D_LFTCG.
FORTRAN 77 Interface
Single:
CALL LFTCG (N, A, LDA, FACT, LDFACT, IPVT)
Double:
The double precision name is DLFTCG.
ScaLAPACK Interface
Generic:
CALL LFTCG (A0, FACT0, IPVT0 [, …])
Specific:
The specific interface names are S_LFTCG and D_LFTCG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
LFTCG
Chapter 1: Linear Systems
163
Description
Routine LFTCG performs an LU factorization of a complex general coefficient matrix. The LU factorization is
done using scaled partial pivoting. Scaled partial pivoting differs from partial pivoting in that the pivoting
strategy is the same as if each row were scaled to have the same ∞-norm.
LFTCG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This can occur
only if A either is singular or is very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFICG, LFSCG and LFDCG. To solve
systems of equations with multiple right-hand-side vectors, use LFTCG followed by either LFICG or LFSCG
called once for each right-hand side. The routine LFDCG can be called to compute the determinant of the coefficient matrix after LFTCG has performed the factorization.
Let F be the matrix FACT and let p be the vector IPVT. The triangular matrix U is stored in the upper triangle
of F. The strict lower triangle of F contains the information needed to reconstruct L using
L⁻¹ = LN-1PN-1 … L1P1
where Pk is the identity matrix with rows k and pk interchanged and Lk is the identity with Fik for
i = k + 1, …, N inserted below the diagonal. The strict lower half of F can also be thought of as containing the
negative of the multipliers.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see “Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK” in the Introduction section of this manual.
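To make the storage scheme concrete, the sketch below (an illustration with placeholder data, not part of the routine's examples) factors a matrix with LFTCG and then inspects the diagonal of U, which is stored in the upper triangle of FACT; a zero diagonal entry would indicate that the factorization detected singularity.

      USE LFTCG_INT
      USE UMACH_INT
!                              Hedged sketch: after LFTCG, the upper
!                              triangle of FACT holds U.  Its diagonal is
!                              zero-free when A is nonsingular.
      INTEGER, PARAMETER :: N=3
      INTEGER    IPVT(N), I, NOUT
      COMPLEX    A(N,N), FACT(N,N)
!                              Arbitrary placeholder matrix
      DATA A/(3.0,-2.0), (1.0,1.0), (4.0,0.0), (2.0,4.0), (2.0,-6.0),&
          (-5.0,1.0), (0.0,-3.0), (1.0,2.0), (3.0,-2.0)/
      CALL LFTCG (A, FACT, IPVT)
      CALL UMACH (2, NOUT)
      DO I=1, N
         WRITE (NOUT,*) 'U(', I, ',', I, ') = ', FACT(I,I)
      END DO
      END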
Comments
1. Workspace may be explicitly provided, if desired, by use of L2TCG/DL2TCG. The reference is:
       CALL L2TCG (N, A, LDA, FACT, LDFACT, IPVT, WK)
   The additional argument is:
   WK — Complex work vector of length N.
2. Informational error
   Type   Code   Description
    4      2     The input matrix is singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the matrix to be factored. (Input)
FACT0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix
FACT. FACT contains the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT0 — Local vector of length MXLDA containing the local portions of the distributed vector IPVT. IPVT
contains the pivoting information for the LU factorization. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A linear system with multiple right-hand sides is solved. LFTCG is called to factor the coefficient matrix.
LFSCG is called to compute the two solutions for the two right-hand sides. In this case the coefficient matrix
is assumed to be well-conditioned and correctly scaled. Otherwise, it would be better to call LFCCG to perform the factorization, and LFICG to compute the solutions.
      USE LFTCG_INT
      USE LFSCG_INT
      USE WRCRN_INT
!                              Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N)
      COMPLEX    A(LDA,LDA), B(N,2), X(N,2), FACT(LDFACT,LDFACT)
!                              Set values for A
!
!                              A = (  1.0+1.0i   2.0+3.0i   3.0-3.0i )
!                                  (  2.0+1.0i   5.0+3.0i   7.0-5.0i )
!                                  ( -2.0+1.0i  -4.0+4.0i   5.0+3.0i )
!
      DATA A/(1.0,1.0), (2.0,1.0), (-2.0,1.0), (2.0,3.0), (5.0,3.0),&
          (-4.0,4.0), (3.0,-3.0), (7.0,-5.0), (5.0,3.0)/
!
!                              Set the right-hand sides, B
!
!                              B = (   3.0+ 5.0i    9.0+ 0.0i )
!                                  (  22.0+10.0i   13.0+ 9.0i )
!                                  ( -10.0+ 4.0i    6.0+10.0i )
!
      DATA B/(3.0,5.0), (22.0,10.0), (-10.0,4.0), (9.0,0.0),&
          (13.0,9.0), (6.0,10.0)/
!                              Factor A
      CALL LFTCG (A, FACT, IPVT)
!                              Solve for the two right-hand sides
      DO 10  J=1, 2
         CALL LFSCG (FACT, IPVT, B(:,J), X(:,J))
   10 CONTINUE
!                              Print results
      CALL WRCRN ('X', X)
      END
Output
                   X
                 1                 2
 1   ( 1.000,-1.000)   ( 0.000, 2.000)
 2   ( 2.000, 4.000)   (-2.000,-1.000)
 3   ( 3.000, 0.000)   ( 1.000, 3.000)
ScaLAPACK Example
The same linear system with multiple right-hand sides is solved as a distributed example. LFTCG is called to
factor the matrix. LFSCG is called to compute the two solutions for the two right-hand sides.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map
and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFTCG_INT
      USE LFSCG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                              Declare variables
      INTEGER                J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                INFO, MXCOL, MXLDA
      INTEGER, ALLOCATABLE ::   IPVT0(:)
      COMPLEX, ALLOCATABLE ::   A(:,:), B(:,:), X(:,:), X0(:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), FACT0(:,:), B0(:)
      PARAMETER (LDA=3, N=3)
!                              Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N,2), X(N,2))
!                              Set values for A and B
         A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0,-3.0)/)
         A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0,-5.0)/)
         A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), ( 5.0, 3.0)/)
!
         B(1,:) = (/ (  3.0,  5.0), ( 9.0,  0.0)/)
         B(2,:) = (/ ( 22.0, 10.0), (13.0,  9.0)/)
         B(3,:) = (/ (-10.0,  4.0), ( 6.0, 10.0)/)
      ENDIF
!                              Set up a 1D processor grid and define
!                              its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                              Get the array descriptor entities MXLDA,
!                              and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                              Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                              Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), &
               B0(MXLDA), IPVT0(MXLDA))
!                              Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                              Factor A
      CALL LFTCG (A0, FACT0, IPVT0)
!                              Solve for the two right-hand sides
      DO 10 J=1, 2
         CALL SCALAPACK_MAP(B(:,J), DESCL, B0)
         CALL LFSCG (FACT0, IPVT0, B0, X0)
         CALL SCALAPACK_UNMAP(X0, DESCL, X(:,J))
   10 CONTINUE
!                              Print results.
!                              Only Rank=0 has the solution, X.
      IF(MP_RANK.EQ.0) CALL WRCRN ('X', X)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, FACT0, IPVT0, X0)
!                              Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                              Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
                   X
                 1                 2
 1   ( 1.000,-1.000)   ( 0.000, 2.000)
 2   ( 2.000, 4.000)   (-2.000,-1.000)
 3   ( 3.000, 0.000)   ( 1.000, 3.000)
LFSCG
Solves a complex general system of linear equations given the LU factorization of the coefficient matrix.
Required Arguments
FACT — Complex N by N matrix containing the LU factorization of the coefficient matrix A as output from
routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AᴴX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFSCG (FACT, IPVT, B, X [, …])
Specific:
The specific interface names are S_LFSCG and D_LFSCG.
FORTRAN 77 Interface
Single:
CALL LFSCG (N, FACT, LDFACT, IPVT, B, IPATH, X)
Double:
The double precision name is DLFSCG.
LFSCG
Chapter 1: Linear Systems
168
ScaLAPACK Interface
Generic:
CALL LFSCG (FACT0, IPVT0, B0, X0 [, …])
Specific:
The specific interface names are S_LFSCG and D_LFSCG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFSCG computes the solution of a system of linear algebraic equations having a complex general
coefficient matrix. To compute the solution, the coefficient matrix must first undergo an LU factorization.
This may be done by calling either LFCCG or LFTCG. The solution to Ax = b is found by solving the triangular
systems Ly = b and Ux = y. The forward elimination step consists of solving the system Ly = b by applying the
same permutations and elimination operations to b that were applied to the columns of A in the factorization
routine. The backward substitution step consists of solving the triangular system Ux = y for x.
Routines LFSCG and LFICG both solve a linear system given its LU factorization. LFICG generally takes
more time and produces a more accurate answer than LFSCG. Each iteration of the iterative refinement algorithm used by LFICG calls LFSCG.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
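As an illustration, the conjugate transpose system AᴴX = B can be solved with the same factors by selecting IPATH = 2. The sketch below uses the documented FORTRAN 77 interface so that IPATH can be passed positionally; the data are placeholders and the sketch is not one of the routine's official examples.

      USE LFTCG_INT
!                              Hedged sketch: solve the conjugate
!                              transpose system with the FORTRAN 77
!                              interface of LFSCG (IPATH = 2), reusing
!                              factors produced by LFTCG.
      INTEGER, PARAMETER :: N=3, LDFACT=3
      INTEGER    IPVT(N)
      COMPLEX    A(N,N), FACT(LDFACT,N), B(N), X(N)
      EXTERNAL   LFSCG
!                              Arbitrary placeholder data
      DATA A/(3.0,-2.0), (1.0,1.0), (4.0,0.0), (2.0,4.0), (2.0,-6.0),&
          (-5.0,1.0), (0.0,-3.0), (1.0,2.0), (3.0,-2.0)/
      DATA B/(1.0,0.0), (0.0,1.0), (1.0,1.0)/
      CALL LFTCG (A, FACT, IPVT)
!                              IPATH = 2 selects the conjugate
!                              transpose path
      CALL LFSCG (N, FACT, LDFACT, IPVT, B, 2, X)
      END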
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
FACT0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix
FACT as output from routine LFCCG/DLFCCG or LFTCG/DLFTCG. FACT contains the LU factorization
of the matrix A. (Input)
IPVT0 — Local vector of length MXLDA containing the local portions of the distributed vector IPVT. IPVT
contains the pivoting information for the LU factorization as output from subroutine LFCCG/DLFCCG
or LFTCG/DLFTCG. (Input)
B0 — Complex local vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Complex local vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse is computed for a complex general 3 × 3 matrix. The input matrix is assumed to be well-conditioned, hence LFTCG is used rather than LFCCG.
      USE IMSL_LIBRARIES
!                              Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N)
      REAL       THIRD
      COMPLEX    A(LDA,LDA), AINV(LDA,LDA), RJ(N), FACT(LDFACT,LDFACT)
!                              Declare functions
      COMPLEX    CMPLX
!                              Set values for A
!
!                              A = (  1.0+1.0i   2.0+3.0i   3.0+3.0i )
!                                  (  2.0+1.0i   5.0+3.0i   7.0+4.0i )
!                                  ( -2.0+1.0i  -4.0+4.0i  -5.0+3.0i )
!
      DATA A/(1.0,1.0), (2.0,1.0), (-2.0,1.0), (2.0,3.0), (5.0,3.0),&
          (-4.0,4.0), (3.0,3.0), (7.0,4.0), (-5.0,3.0)/
!
!                              Scale A by dividing by three
      THIRD = 1.0/3.0
      DO 10  I=1, N
         CALL CSSCAL (N, THIRD, A(:,I), 1)
   10 CONTINUE
!                              Factor A
      CALL LFTCG (A, FACT, IPVT)
!                              Set up the columns of the identity
!                              matrix one at a time in RJ
      CALL CSET (N, (0.0,0.0), RJ, 1)
      DO 20  J=1, N
         RJ(J) = CMPLX(1.0,0.0)
!                              RJ is the J-th column of the identity
!                              matrix so the following LFSCG
!                              reference places the J-th column of
!                              the inverse of A in the J-th column
!                              of AINV
         CALL LFSCG (FACT, IPVT, RJ, AINV(:,J))
         RJ(J) = CMPLX(0.0,0.0)
   20 CONTINUE
!                              Print results
      CALL WRCRN ('AINV', AINV)
      END
Output
                               AINV
                  1                 2                 3
 1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
 2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
 3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. The input matrix is assumed to
be well-conditioned, hence LFTCG is used rather than LFCCG. LFSCG is called to determine the columns of
the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”)
used to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFTCG_INT
      USE LFSCG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                              Declare variables
      INTEGER                J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                INFO, MXCOL, MXLDA
      INTEGER, ALLOCATABLE ::   IPVT0(:)
      COMPLEX, ALLOCATABLE ::   A(:,:), AINV(:,:), X0(:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), FACT0(:,:), RJ(:), RJ0(:)
      REAL                   THIRD
      PARAMETER (LDA=3, N=3)
!                              Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                              Set values for A
         A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0, 3.0)/)
         A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0, 4.0)/)
         A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), (-5.0, 3.0)/)
!                              Scale A by dividing by three
         THIRD = 1.0/3.0
         A = A * THIRD
      ENDIF
!                              Set up a 1D processor grid and define
!                              its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                              Get the array descriptor entities MXLDA,
!                              and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                              Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                              Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA), IPVT0(MXLDA))
!                              Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                              Factor A
      CALL LFTCG (A0, FACT0, IPVT0)
!                              Set up the columns of the identity
!                              matrix one at a time in RJ
      RJ = (0.0, 0.0)
      DO 10 J=1, N
         RJ(J) = (1.0, 0.0)
         CALL SCALAPACK_MAP(RJ, DESCL, RJ0)
!                              RJ is the J-th column of the identity
!                              matrix so the following LFSCG
!                              reference computes the J-th column of
!                              the inverse of A
         CALL LFSCG (FACT0, IPVT0, RJ0, X0)
         RJ(J) = (0.0, 0.0)
         CALL SCALAPACK_UNMAP(X0, DESCL, AINV(:,J))
   10 CONTINUE
!                              Print results.
!                              Only Rank=0 has the solution, AINV.
      IF(MP_RANK.EQ.0) CALL WRCRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, FACT0, IPVT0, RJ, RJ0, X0)
!                              Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                              Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
                               AINV
                  1                 2                 3
 1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
 2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
 3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
LFICG
Uses iterative refinement to improve the solution of a complex general system of linear equations.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the linear system. (Input)
FACT — Complex N by N matrix containing the LU factorization of the coefficient matrix A as output from
routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
RES — Complex vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AᴴX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFICG (A, FACT, IPVT, B, X, RES [, …])
Specific:
The specific interface names are S_LFICG and D_LFICG.
FORTRAN 77 Interface
Single:
CALL LFICG (N, A, LDA, FACT, LDFACT, IPVT, B, IPATH, X, RES)
Double:
The double precision name is DLFICG.
LFICG
Chapter 1: Linear Systems
173
ScaLAPACK Interface
Generic:
CALL LFICG (A0, FACT0, IPVT0, B0, X0, RES0 [, …])
Specific:
The specific interface names are S_LFICG and D_LFICG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFICG computes the solution of a system of linear algebraic equations having a complex general
coefficient matrix. Iterative refinement is performed on the solution vector to improve the accuracy. Usually
almost all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an LU factorization. This may be done by
calling either LFCCG or LFTCG.
Iterative refinement fails only if the matrix is very ill-conditioned. Routines LFICG and LFSCG both solve a
linear system given its LU factorization. LFICG generally takes more time and produces a more accurate
answer than LFSCG. Each iteration of the iterative refinement algorithm used by LFICG calls LFSCG.
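A minimal sketch of this factor-then-refine sequence follows (placeholder data, not part of the routine's official examples); the 1-norm of the returned residual RES gives a quick indication of the quality of the refined solution.

      USE LFCCG_INT
      USE LFICG_INT
      USE UMACH_INT
!                              Hedged sketch: factor with LFCCG, then
!                              solve with iterative refinement (LFICG)
!                              and report the residual 1-norm.
      INTEGER, PARAMETER :: N=3
      INTEGER    IPVT(N), NOUT
      REAL       RCOND
      COMPLEX    A(N,N), FACT(N,N), B(N), X(N), RES(N)
!                              Arbitrary placeholder data
      DATA A/(3.0,-2.0), (1.0,1.0), (4.0,0.0), (2.0,4.0), (2.0,-6.0),&
          (-5.0,1.0), (0.0,-3.0), (1.0,2.0), (3.0,-2.0)/
      DATA B/(10.0,5.0), (6.0,-7.0), (-1.0,2.0)/
      CALL LFCCG (A, FACT, IPVT, RCOND)
      CALL LFICG (A, FACT, IPVT, B, X, RES)
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'Residual 1-norm = ', SUM(ABS(RES))
      END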
Comments
Informational error
Type   Code   Description
 3      2     The input matrix is too ill-conditioned for iterative refinement to be effective.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the coefficient matrix of the linear system. (Input)
FACT0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix
FACT as output from routine LFCCG or LFTCG. FACT contains the LU factorization of the matrix A.
(Input)
IPVT0 — Local vector of length MXLDA containing the local portions of the distributed vector IPVT. IPVT
contains the pivoting information for the LU factorization as output from subroutine LFCCG or
LFTCG. (Input)
B0 — Complex local vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Complex local vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
RES0 — Complex local vector of length MXLDA containing the local portions of the distributed vector
RES. RES contains the final correction at the improved solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving the system each of the first two times by adding 0.5 + 0.5i to the second element.
      USE LFICG_INT
      USE LFCCG_INT
      USE WRCRN_INT
      USE UMACH_INT
!                              Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       RCOND
      COMPLEX    A(LDA,LDA), B(N), X(N), FACT(LDFACT,LDFACT), RES(N)
!                              Declare functions
      COMPLEX    CMPLX
!                              Set values for A
!
!                              A = (  1.0+1.0i   2.0+3.0i   3.0-3.0i )
!                                  (  2.0+1.0i   5.0+3.0i   7.0-5.0i )
!                                  ( -2.0+1.0i  -4.0+4.0i   5.0+3.0i )
!
      DATA A/(1.0,1.0), (2.0,1.0), (-2.0,1.0), (2.0,3.0), (5.0,3.0), &
          (-4.0,4.0), (3.0,-3.0), (7.0,-5.0), (5.0,3.0)/
!
!                              Set values for B
!
!                              B = (  3.0+5.0i   22.0+10.0i   -10.0+4.0i )
!
      DATA B/(3.0,5.0), (22.0,10.0), (-10.0,4.0)/
!                              Factor A
      CALL LFCCG (A, FACT, IPVT, RCOND)
!                              Print the L1 condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                              Solve the three systems
      DO 10  J=1, 3
         CALL LFICG (A, FACT, IPVT, B, X, RES)
!                              Print results
         CALL WRCRN ('X', X, 1, N, 1)
!                              Perturb B by adding 0.5+0.5i to B(2)
         B(2) = B(2) + CMPLX(0.5,0.5)
   10 CONTINUE
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
 RCOND < 0.025
 L1 Condition number < 75.0

                   X
          1                2                3
 ( 1.000,-1.000)  ( 2.000, 4.000)  ( 3.000, 0.000)

                   X
          1                2                3
 ( 0.910,-1.061)  ( 1.986, 4.175)  ( 3.123, 0.071)

                   X
          1                2                3
 ( 0.821,-1.123)  ( 1.972, 4.349)  ( 3.245, 0.142)
ScaLAPACK Example
The same set of linear systems is solved successively as a distributed example. The right-hand-side vector is
perturbed after solving the system each of the first two times by adding 0.5 + 0.5i to the second element.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map
and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFICG_INT
      USE LFCCG_INT
      USE WRCRN_INT
      USE UMACH_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                              Declare variables
      INTEGER                J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                INFO, MXCOL, MXLDA, NOUT
      INTEGER, ALLOCATABLE ::   IPVT0(:)
      COMPLEX, ALLOCATABLE ::   A(:,:), B(:), X(:), X0(:), RES(:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), FACT0(:,:), B0(:), RES0(:)
      REAL                   RCOND
      PARAMETER (LDA=3, N=3)
!                              Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N), RES(N))
!                              Set values for A and B
         A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0, 3.0)/)
         A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0, 4.0)/)
         A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), (-5.0, 3.0)/)
!
         B = (/ (3.0, 5.0), (22.0, 10.0), (-10.0, 4.0)/)
      ENDIF
!                              Set up a 1D processor grid and define
!                              its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                              Get the array descriptor entities MXLDA,
!                              and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                              Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                              Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), &
               B0(MXLDA), IPVT0(MXLDA), RES0(MXLDA))
!                              Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                              Factor A
      CALL LFCCG (A0, FACT0, IPVT0, RCOND)
!                              Print the L1 condition number
      IF (MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
      ENDIF
!                              Solve the three systems
      DO 10 J=1, 3
         CALL SCALAPACK_MAP(B, DESCL, B0)
         CALL LFICG (A0, FACT0, IPVT0, B0, X0, RES0)
         CALL SCALAPACK_UNMAP(X0, DESCL, X)
!                              Print results
!                              Only Rank=0 has the solution, X.
         IF (MP_RANK .EQ. 0) CALL WRCRN ('X', X, 1, N, 1)
!                              Perturb B by adding 0.5+0.5i to B(2)
         IF(MP_RANK .EQ. 0) B(2) = B(2) + (0.5,0.5)
   10 CONTINUE
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X, RES)
      DEALLOCATE(A0, B0, FACT0, IPVT0, X0, RES0)
!                              Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                              Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
 RCOND < 0.025
 L1 Condition number < 75.0

                   X
          1                2                3
 ( 1.000,-1.000)  ( 2.000, 4.000)  ( 3.000, 0.000)

                   X
          1                2                3
 ( 0.910,-1.061)  ( 1.986, 4.175)  ( 3.123, 0.071)

                   X
          1                2                3
 ( 0.821,-1.123)  ( 1.972, 4.349)  ( 3.245, 0.142)
LFDCG
Computes the determinant of a complex general matrix given the LU factorization of the matrix.
Required Arguments
FACT — Complex N by N matrix containing the LU factorization of the coefficient matrix A as output from
routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
DET1 — Complex scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10^DET2.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDCG (FACT, IPVT, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDCG and D_LFDCG.
FORTRAN 77 Interface
Single:
CALL LFDCG (N, FACT, LDFACT, IPVT, DET1, DET2)
Double:
The double precision name is DLFDCG.
Description
Routine LFDCG computes the determinant of a complex general coefficient matrix. To compute the determinant the coefficient matrix must first undergo an LU factorization. This may be done by calling either LFCCG
or LFTCG. The formula det A = det L det U is used to compute the determinant. Since the determinant of a
triangular matrix is the product of the diagonal elements,
det U = ∏ Uii,  i = 1, …, N
(The matrix U is stored in the upper triangle of FACT.) Since L is the product of triangular matrices with unit
diagonals and of permutation matrices, det L = (−1)^k where k is the number of pivoting interchanges.
LFDCG is based on the LINPACK routine CGEDI; see Dongarra et al. (1979).
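When the determinant value itself is wanted, the mantissa/exponent pair returned by LFDCG can be recombined as in the sketch below (placeholder data, not one of the routine's official examples). The recombined value can overflow when DET2 is large, which is exactly why the split representation is returned.

      USE LFDCG_INT
      USE LFTCG_INT
      USE UMACH_INT
!                              Hedged sketch: factor, obtain DET1/DET2
!                              from LFDCG, and recombine the value.
      INTEGER, PARAMETER :: N=3
      INTEGER    IPVT(N), NOUT
      REAL       DET2
      COMPLEX    A(N,N), FACT(N,N), DET1, DET
!                              Arbitrary placeholder matrix
      DATA A/(3.0,-2.0), (1.0,1.0), (4.0,0.0), (2.0,4.0), (2.0,-6.0),&
          (-5.0,1.0), (0.0,-3.0), (1.0,2.0), (3.0,-2.0)/
      CALL LFTCG (A, FACT, IPVT)
      CALL LFDCG (FACT, IPVT, DET1, DET2)
!                              det(A) = DET1 * 10**DET2
      DET = DET1 * 10.0**DET2
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'det(A) = ', DET
      END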
Example
The determinant is computed for a complex general 3 × 3 matrix.
      USE LFDCG_INT
      USE LFTCG_INT
      USE UMACH_INT
!                              Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       DET2
      COMPLEX    A(LDA,LDA), FACT(LDFACT,LDFACT), DET1
!                              Set values for A
!
!                              A = (  3.0-2.0i   2.0+4.0i   0.0-3.0i )
!                                  (  1.0+1.0i   2.0-6.0i   1.0+2.0i )
!                                  (  4.0+0.0i  -5.0+1.0i   3.0-2.0i )
!
      DATA A/(3.0,-2.0), (1.0,1.0), (4.0,0.0), (2.0,4.0), (2.0,-6.0),&
          (-5.0,1.0), (0.0,-3.0), (1.0,2.0), (3.0,-2.0)/
!                              Factor A
      CALL LFTCG (A, FACT, IPVT)
!                              Compute the determinant for the
!                              factored matrix
      CALL LFDCG (FACT, IPVT, DET1, DET2)
!                              Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is',3X,'(',F6.3,',',F6.3,&
            ') * 10**',F2.0)
      END
Output
The determinant of A is ( 0.700, 1.100) * 10**1.
LINCG
Computes the inverse of a complex general matrix.
Required Arguments
A — Complex N by N matrix containing the matrix to be inverted. (Input)
AINV — Complex N by N matrix containing the inverse of A. (Output)
If A is not needed, A and AINV can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDAINV — Leading dimension of AINV exactly as specified in the dimension statement of the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
CALL LINCG (A, AINV [, …])
Specific:
The specific interface names are S_LINCG and D_LINCG.
FORTRAN 77 Interface
Single:
CALL LINCG (N, A, LDA, AINV, LDAINV)
Double:
The double precision name is DLINCG.
ScaLAPACK Interface
Generic:
CALL LINCG (A0, AINV0 [, …])
Specific:
The specific interface names are S_LINCG and D_LINCG.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LINCG computes the inverse of a complex general matrix. The underlying code is based on either
LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction
section of this manual.
LINCG first uses the routine LFCCG to compute an LU factorization of the coefficient matrix and to estimate
the condition number of the matrix. LFCCG computes U and the information needed to compute L⁻¹. LINCT is
then used to compute U⁻¹, the inverse of U. Finally, A⁻¹ is computed using A⁻¹ = U⁻¹L⁻¹.
LINCG fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the iterative
refinement algorithm fails to converge. This error occurs only if A is singular or very close to a singular
matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in A⁻¹.
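A simple way to gauge the quality of the computed inverse is to multiply it back against A, as in the following sketch (placeholder data, not part of the routine's official examples), which reports the largest entry of A*AINV − I.

      USE LINCG_INT
      USE UMACH_INT
!                              Hedged sketch: invert A with LINCG and
!                              check the largest entry of A*AINV - I.
      INTEGER, PARAMETER :: N=3
      INTEGER    I, NOUT
      COMPLEX    A(N,N), AINV(N,N), E(N,N)
!                              Arbitrary placeholder matrix
      DATA A/(1.0,1.0), (2.0,1.0), (-2.0,1.0), (2.0,3.0), (5.0,3.0),&
          (-4.0,4.0), (3.0,3.0), (7.0,4.0), (-5.0,3.0)/
      CALL LINCG (A, AINV)
      E = MATMUL(A, AINV)
      DO I=1, N
         E(I,I) = E(I,I) - (1.0,0.0)
      END DO
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'max |A*AINV - I| = ', MAXVAL(ABS(E))
      END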
Comments
1. Workspace may be explicitly provided, if desired, by use of L2NCG/DL2NCG. The reference is:
       CALL L2NCG (N, A, LDA, AINV, LDAINV, WK, IWK)
   The additional arguments are as follows:
   WK — Complex work vector of length N + N(N − 1)/2.
   IWK — Integer work vector of length N.
2. Informational errors
   Type   Code   Description
    3      1     The input matrix is too ill-conditioned. The inverse might not be accurate.
    4      2     The input matrix is singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the matrix to be inverted. (Input)
AINV0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix
AINV. AINV contains the inverse of the matrix A. (Output)
If A is not needed, A and AINV can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse is computed for a complex general 3 × 3 matrix.
      USE LINCG_INT
      USE WRCRN_INT
      USE CSSCAL_INT
!                              Declare variables
      PARAMETER  (LDA=3, LDAINV=3, N=3)
      REAL       THIRD
      COMPLEX    A(LDA,LDA), AINV(LDAINV,LDAINV)
!                              Set values for A
!
!                              A = (  1.0+1.0i   2.0+3.0i   3.0+3.0i )
!                                  (  2.0+1.0i   5.0+3.0i   7.0+4.0i )
!                                  ( -2.0+1.0i  -4.0+4.0i  -5.0+3.0i )
!
      DATA A/(1.0,1.0), (2.0,1.0), (-2.0,1.0), (2.0,3.0), (5.0,3.0),&
          (-4.0,4.0), (3.0,3.0), (7.0,4.0), (-5.0,3.0)/
!                              Scale A by dividing by three
      THIRD = 1.0/3.0
      DO 10  I=1, N
         CALL CSSCAL (N, THIRD, A(:,I), 1)
   10 CONTINUE
!                              Calculate the inverse of A
      CALL LINCG (A, AINV)
!                              Print results
      CALL WRCRN ('AINV', AINV)
      END
Output
                               AINV
                  1                 2                 3
 1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
 2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
 3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays to
and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LINCG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                              Declare variables
      INTEGER                J, LDA, N, DESCA(9)
      INTEGER                INFO, MXCOL, MXLDA, NPROW, NPCOL
      COMPLEX, ALLOCATABLE ::   A(:,:), AINV(:,:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), AINV0(:,:)
      REAL                   THIRD
      PARAMETER (LDA=3, N=3)
!                              Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                              Set values for A
         A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0, 3.0)/)
         A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0, 4.0)/)
         A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), (-5.0, 3.0)/)
!                              Scale A by dividing by three
         THIRD = 1.0/3.0
         A = A * THIRD
      ENDIF
!                              Set up a 1D processor grid and define
!                              its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                              Get the array descriptor entities MXLDA,
!                              and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                              Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
!                              Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), AINV0(MXLDA,MXCOL))
!                              Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                              Compute the inverse of A
      CALL LINCG (A0, AINV0)
!                              Unmap the results from the distributed
!                              arrays back to a non-distributed array.
!                              After the unmap, only Rank=0 has the full
!                              array.
      CALL SCALAPACK_UNMAP(AINV0, DESCA, AINV)
!                              Print results.
!                              Only Rank=0 has the solution, AINV.
      IF(MP_RANK.EQ.0) CALL WRCRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, AINV0)
!                              Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                              Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
                               AINV
                  1                 2                 3
 1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
 2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
 3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
LSLRT
Solves a real triangular system of linear equations.
Required Arguments
A — N by N matrix containing the coefficient matrix for the triangular linear system. (Input)
For a lower triangular system, only the lower triangular part and diagonal of A are referenced. For an
upper triangular system, only the upper triangular part and diagonal of A are referenced.
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means solve AX = B, A lower triangular.
IPATH = 2 means solve AX = B, A upper triangular.
IPATH = 3 means solve AᵀX = B, A lower triangular.
IPATH = 4 means solve AᵀX = B, A upper triangular.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLRT (A, B, X [, …])
Specific:
The specific interface names are S_LSLRT and D_LSLRT.
FORTRAN 77 Interface
Single:
CALL LSLRT (N, A, LDA, B, IPATH, X)
Double:
The double precision name is DLSLRT.
ScaLAPACK Interface
Generic:
CALL LSLRT (A0, B0, X0 [, …])
LSLRT
Chapter 1: Linear Systems
185
Specific:
The specific interface names are S_LSLRT and D_LSLRT.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSLRT solves a system of linear algebraic equations with a real triangular coefficient matrix. LSLRT
fails if the matrix A has a zero diagonal element, in which case A is singular. The underlying code is based on
either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used
during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the
Introduction section of this manual.
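For example, the transposed system AᵀX = B with a lower triangular A is obtained by selecting IPATH = 3. The sketch below uses the documented FORTRAN 77 interface so that IPATH can be passed positionally; the data are placeholders and the sketch is not one of the routine's official examples.

      USE UMACH_INT
!                              Hedged sketch: solve trans(A)x = b for a
!                              lower triangular A through the FORTRAN 77
!                              interface of LSLRT (IPATH = 3).
      INTEGER, PARAMETER :: N=3, LDA=3
      INTEGER    NOUT
      REAL       A(LDA,N), B(N), X(N)
      EXTERNAL   LSLRT
!                              Arbitrary lower triangular placeholder data
      DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
      DATA B/2.0, 5.0, 0.0/
      CALL LSLRT (N, A, LDA, B, 3, X)
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'Solution of trans(A)x = b: ', X
      END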
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the coefficients of the linear system. (Input)
For a lower triangular system, only the lower triangular part and diagonal of A are referenced. For an
upper triangular system, only the upper triangular part and diagonal of A are referenced.
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A system of three linear equations is solved. The coefficient matrix has lower triangular form and the right-hand-side vector, b, has three elements.
      USE LSLRT_INT
      USE WRRRN_INT
!                              Declare variables
      PARAMETER  (LDA=3)
      REAL       A(LDA,LDA), B(LDA), X(LDA)
!                              Set values for A and B
!
!                              A = (  2.0              )
!                                  (  2.0  -1.0        )
!                                  ( -4.0   2.0   5.0  )
!
!                              B = (  2.0   5.0   0.0 )
!
      DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
      DATA B/2.0, 5.0, 0.0/
!                              Solve AX = B  (IPATH = 1)
      CALL LSLRT (A, B, X)
!                              Print results
      CALL WRRRN ('X', X, 1, 3, 1)
      END
Output
           X
    1       2       3
 1.000  -3.000   2.000
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The coefficient
matrix has lower triangular form and the right-hand-side vector b has three elements. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays to
and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSLRT_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                              Declare variables
      INTEGER                LDA, N, DESCA(9), DESCX(9)
      INTEGER                INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::   A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::   A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=3)
!                              Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                              Set values for A and B
         A(1,:) = (/ 2.0,  0.0, 0.0/)
         A(2,:) = (/ 2.0, -1.0, 0.0/)
         A(3,:) = (/-4.0,  2.0, 5.0/)
!
         B = (/ 2.0, 5.0, 0.0/)
      ENDIF
!                              Set up a 1D processor grid and define
!                              its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                              Get the array descriptor entities MXLDA,
!                              and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                              Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                              Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                              Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                              Solve AX = B  (IPATH = 1)
      CALL LSLRT (A0, B0, X0)
!                              Unmap the results from the distributed
!                              arrays back to a non-distributed array.
!                              After the unmap, only Rank=0 has the full
!                              array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                              Print results.
!                              Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0) CALL WRRRN ('X', X, 1, N, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                              Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                              Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
           X
    1       2       3
 1.000  -3.000   2.000
LFCRT
Estimates the condition number of a real triangular matrix.
Required Arguments
A — N by N matrix containing the coefficient matrix for the triangular linear system. (Input)
For a lower triangular system, only the lower triangular part and diagonal of A are referenced. For an
upper triangular system, only the upper triangular part and diagonal of A are referenced.
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means A is lower triangular. IPATH = 2 means A is upper triangular.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFCRT (A, RCOND [, …])
Specific:
The specific interface names are S_LFCRT and D_LFCRT.
FORTRAN 77 Interface
Single:
CALL LFCRT (N, A, LDA, IPATH, RCOND)
Double:
The double precision name is DLFCRT.
ScaLAPACK Interface
Generic:
CALL LFCRT (A0, RCOND [, …])
Specific:
The specific interface names are S_LFCRT and D_LFCRT.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFCRT estimates the condition number of a real triangular matrix. The L1 condition number of the
matrix A is defined to be κ(A) = ∥A∥1∥A⁻¹∥1. Since it is expensive to compute ∥A⁻¹∥1, the condition number is
only estimated. The estimation algorithm is the same as used by LINPACK and is described by Cline et al.
(1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
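In practice the returned RCOND is often compared against machine precision before the matrix is used in further solves. The sketch below uses the standard Fortran EPSILON intrinsic for this comparison rather than any particular IMSL machine-constants routine; the data are placeholders and the sketch is not one of the routine's official examples.

      USE LFCRT_INT
      USE UMACH_INT
!                              Hedged sketch: flag a triangular matrix
!                              whose estimated condition number exceeds
!                              1/epsilon (i.e. RCOND <= EPSILON).
      INTEGER, PARAMETER :: N=3, LDA=3
      INTEGER    NOUT
      REAL       A(LDA,N), RCOND
      DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
      CALL LFCRT (A, RCOND)
      CALL UMACH (2, NOUT)
      IF (RCOND .LE. EPSILON(1.0E0)) THEN
         WRITE (NOUT,*) 'Matrix is ill-conditioned; RCOND = ', RCOND
      ELSE
         WRITE (NOUT,*) 'RCOND = ', RCOND, '  condition estimate = ', 1.0E0/RCOND
      END IF
      END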
Comments
1. Workspace may be explicitly provided, if desired, by use of L2CRT/DL2CRT. The reference is:
       CALL L2CRT (N, A, LDA, IPATH, RCOND, WK)
   The additional argument is:
   WK — Work vector of length N.
2. Informational error
   Type   Code   Description
    3      1     The input triangular matrix is algorithmically singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the coefficient matrix for the triangular linear system. (Input)
For a lower triangular system, only the lower triangular part and diagonal of A are referenced. For an
upper triangular system, only the upper triangular part and diagonal of A are referenced.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
An estimate of the reciprocal condition number is computed for a 3 × 3 lower triangular coefficient matrix.
      USE LFCRT_INT
      USE UMACH_INT
!                              Declare variables
      PARAMETER  (LDA=3)
      REAL       A(LDA,LDA), RCOND
      INTEGER    NOUT
!                              Set values for A
!
!                              A = (  2.0              )
!                                  (  2.0  -1.0        )
!                                  ( -4.0   2.0   5.0  )
!
      DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
!                              Compute the reciprocal condition
!                              number (IPATH=1)
      CALL LFCRT (A, RCOND)
!                              Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.1
L1 Condition number < 15.0
ScaLAPACK Example
The same lower triangular matrix as in the example above is used in this distributed computing example. An
estimate of the reciprocal condition number is computed for the 3 × 3 lower triangular coefficient matrix.
SCALAPACK_MAP is an IMSL utility routine (see Chapter 11, “Utilities”) used to map an array to the processor
grid. It is used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for
the local arrays.
      USE MPI_SETUP_INT
      USE LFCRT_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                              Declare variables
      INTEGER                LDA, N, NOUT, DESCA(9)
      INTEGER                INFO, MXCOL, MXLDA
      REAL                   RCOND
      REAL, ALLOCATABLE ::   A(:,:)
      REAL, ALLOCATABLE ::   A0(:,:)
      PARAMETER (LDA=3, N=3)
!                              Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N))
!                              Set values for A
         A(1,:) = (/ 2.0,  0.0, 0.0/)
         A(2,:) = (/ 2.0, -1.0, 0.0/)
         A(3,:) = (/-4.0,  2.0, 5.0/)
      ENDIF
!                              Set up a 1D processor grid and define
!                              its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                              Get the array descriptor entities MXLDA,
!                              and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                              Set up the array descriptor
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
!                              Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL))
!                              Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                              Compute the reciprocal condition
!                              number (IPATH=1)
      CALL LFCRT (A0, RCOND)
!                              Print results.
!                              Only Rank=0 has the solution, RCOND.
      IF(MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
      ENDIF
      IF (MP_RANK .EQ. 0) DEALLOCATE(A)
      DEALLOCATE(A0)
!                              Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                              Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.1
L1 Condition number < 15.0
LFDRT
Computes the determinant of a real triangular matrix.
Required Arguments
A — N by N matrix containing the triangular matrix. (Input)
The matrix can be either upper or lower triangular.
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10^DET2.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LFDRT (A, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDRT and D_LFDRT.
FORTRAN 77 Interface
Single:
CALL LFDRT (N, A, LDA, DET1, DET2)
Double:
The double precision name is DLFDRT.
Description
Routine LFDRT computes the determinant of a real triangular coefficient matrix. The determinant of a triangular matrix is the product of the diagonal elements
det A = ∏ Aii,  i = 1, …, N
LFDRT is based on the LINPACK routine STRDI; see Dongarra et al. (1979).
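Because the determinant of a triangular matrix is simply the product of its diagonal entries, the LFDRT result can be cross-checked directly, as in the sketch below (placeholder data, not one of the routine's official examples).

      USE LFDRT_INT
      USE UMACH_INT
!                              Hedged sketch: compare LFDRT's
!                              mantissa/exponent result with the plain
!                              product of the diagonal entries.
      INTEGER, PARAMETER :: N=3, LDA=3
      INTEGER    I, NOUT
      REAL       A(LDA,N), DET1, DET2, PROD
      DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
      CALL LFDRT (A, DET1, DET2)
      PROD = 1.0
      DO I=1, N
         PROD = PROD*A(I,I)
      END DO
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'LFDRT: ', DET1*10.0**DET2, '   direct product: ', PROD
      END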
Comments
Informational error
Type   Code   Description
 3      1     The input triangular matrix is singular.
Example
The determinant is computed for a 3 × 3 lower triangular matrix.
      USE LFDRT_INT
      USE UMACH_INT
!                              Declare variables
      PARAMETER  (LDA=3)
      REAL       A(LDA,LDA), DET1, DET2
      INTEGER    NOUT
!                              Set values for A
!
!                              A = (  2.0              )
!                                  (  2.0  -1.0        )
!                                  ( -4.0   2.0   5.0  )
!
      DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
!                              Compute the determinant of A
      CALL LFDRT (A, DET1, DET2)
!                              Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
99999 FORMAT (' The determinant of A is ', F6.3, ' * 10**', F2.0)
      END
Output
The determinant of A is -1.000 * 10**1.
LINRT
Computes the inverse of a real triangular matrix.
Required Arguments
A — N by N matrix containing the triangular matrix to be inverted. (Input)
For a lower triangular matrix, only the lower triangular part and diagonal of A are referenced. For an
upper triangular matrix, only the upper triangular part and diagonal of A are referenced.
AINV — N by N matrix containing the inverse of A. (Output)
If A is lower triangular, AINV is also lower triangular. If A is upper triangular, AINV is also upper triangular. If A is not needed, A and AINV can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means A is lower triangular. IPATH = 2 means A is upper triangular.
Default: IPATH = 1.
LDAINV — Leading dimension of AINV exactly as specified in the dimension statement of the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
CALL LINRT (A, AINV [, …])
Specific:
The specific interface names are S_LINRT and D_LINRT.
FORTRAN 77 Interface
Single:
CALL LINRT (N, A, LDA, IPATH, AINV, LDAINV)
Double:
The double precision name is DLINRT.
Description
Routine LINRT computes the inverse of a real triangular matrix. It fails if A has a zero diagonal element.
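Since the inverse of a triangular matrix is triangular with diagonal elements equal to the reciprocals of the original diagonal, the example below, where diag(A) = (2.0, -1.0, 5.0), yields diag(AINV) = (0.5, -1.0, 0.2), as seen in the output.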
Example
The inverse is computed for a 3 × 3 lower triangular matrix.
      USE LINRT_INT
      USE WRRRN_INT
!                                  Declare variables
      PARAMETER  (LDA=3)
      REAL       A(LDA,LDA), AINV(LDA,LDA)
!                                  Set values for A
!                                  A = (  2.0   0.0   0.0 )
!                                      (  2.0  -1.0   0.0 )
!                                      ( -4.0   2.0   5.0 )
      DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
!                                  Compute the inverse of A
      CALL LINRT (A, AINV)
!                                  Print results
      CALL WRRRN ('AINV', AINV)
      END
Output
             AINV
        1        2        3
1   0.500    0.000    0.000
2   1.000   -1.000    0.000
3   0.000    0.400    0.200
LSLCT
Solves a complex triangular system of linear equations.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the triangular linear system. (Input)
For a lower triangular system, only the lower triangle of A is referenced. For an upper triangular system, only the upper triangle of A is referenced.
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means solve AX = B, A lower triangular
IPATH = 2 means solve AX = B, A upper triangular
IPATH = 3 means solve AᴴX = B, A lower triangular
IPATH = 4 means solve AᴴX = B, A upper triangular
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLCT (A, B, X [, …])
Specific:
The specific interface names are S_LSLCT and D_LSLCT.
FORTRAN 77 Interface
Single:
CALL LSLCT (N, A, LDA, B, IPATH, X)
Double:
The double precision name is DLSLCT.
LSLCT
Chapter 1: Linear Systems
197
ScaLAPACK Interface
Generic:
CALL LSLCT (A0, B0, X0 [, …])
Specific:
The specific interface names are S_LSLCT and D_LSLCT.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSLCT solves a system of linear algebraic equations with a complex triangular coefficient matrix.
LSLCT fails if the matrix A has a zero diagonal element, in which case A is singular. The underlying code is
based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are
used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in
the Introduction section of this manual.
Comments
Informational error
Type   Code   Description
4      1      The input triangular matrix is singular. Some of its diagonal elements are near zero.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the coefficient matrix of the triangular linear system. (Input)
For a lower triangular system, only the lower triangular part and diagonal of A are referenced. For an
upper triangular system, only the upper triangular part and diagonal of A are referenced.
B0 — Local complex vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Local complex vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A system of three linear equations is solved. The coefficient matrix has lower triangular form and the right-hand-side vector, b, has three elements.
      USE LSLCT_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA
      PARAMETER  (LDA=3)
      COMPLEX    A(LDA,LDA), B(LDA), X(LDA)
!                                  Set values for A and B
!                                  A = ( -3.0+2.0i                       )
!                                      ( -2.0-1.0i   0.0+6.0i            )
!                                      ( -1.0+3.0i   1.0-5.0i  -4.0+0.0i )
!
!                                  B = ( -13.0+0.0i  -10.0-1.0i  -11.0+3.0i )
      DATA A/(-3.0,2.0), (-2.0,-1.0), (-1.0, 3.0), (0.0,0.0), (0.0,6.0),&
          (1.0,-5.0), (0.0,0.0), (0.0,0.0), (-4.0,0.0)/
      DATA B/(-13.0,0.0), (-10.0,-1.0), (-11.0,3.0)/
!                                  Solve AX = B
      CALL LSLCT (A, B, X)
!                                  Print results
      CALL WRCRN ('X', X, 1, 3, 1)
      END
Output
                         X
               1                  2                  3
( 3.000, 2.000)    ( 1.000, 1.000)    ( 2.000, 0.000)
ScaLAPACK Example
The same lower triangular matrix as in the example above is used in this distributed computing example.
The system of three linear equations is solved. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility
routines (see Chapter 11, “Utilities”) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
      USE MPI_SETUP_INT
      USE LSLCT_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::  A(:,:), B(:), X(:)
      COMPLEX, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=3)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                  Set values for A and B
         A(1,:) = (/ (-3.0, 2.0), (0.0, 0.0), ( 0.0, 0.0)/)
         A(2,:) = (/ (-2.0, -1.0), (0.0, 6.0), ( 0.0, 0.0)/)
         A(3,:) = (/ (-1.0, 3.0), (1.0, -5.0), (-4.0, 0.0)/)
!
         B = (/ (-13.0, 0.0), (-10.0, -1.0), (-11.0, 3.0)/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptor
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                  Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                  Solve AX = B
      CALL LSLCT (A0, B0, X0)
!                                  Unmap the results from the distributed
!                                  arrays back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                  Print results.
!                                  Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0) CALL WRCRN ('X', X, 1, 3, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
                         X
               1                  2                  3
( 3.000, 2.000)    ( 1.000, 1.000)    ( 2.000, 0.000)
LFCCT
Estimates the condition number of a complex triangular matrix.
Required Arguments
A — Complex N by N matrix containing the triangular matrix. (Input)
For a lower triangular system, only the lower triangle of A is referenced. For an upper triangular system, only the upper triangle of A is referenced.
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means A is lower triangular.
IPATH = 2 means A is upper triangular.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFCCT (A, RCOND [,…])
Specific:
The specific interface names are S_LFCCT and D_LFCCT.
FORTRAN 77 Interface
Single:
CALL LFCCT (N, A, LDA, IPATH, RCOND)
Double:
The double precision name is DLFCCT.
ScaLAPACK Interface
Generic:
CALL LFCCT (A0, RCOND [,…])
Specific:
The specific interface names are S_LFCCT and D_LFCCT.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFCCT estimates the condition number of a complex triangular matrix. The L1 condition number of
the matrix A is defined to be κ(A) = ∥A∥₁∥A⁻¹∥₁. Since it is expensive to compute ∥A⁻¹∥₁, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK and is described by Cline et
al. (1979). If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the solution x. The
underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual.
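As a rough check of the estimate, for the 3 × 3 lower triangular matrix used in the examples below, ∥A∥₁ ≈ 11.1 and ∥A⁻¹∥₁ ≈ 0.60 (the inverse appears in the LINCT example later in this chapter), so κ₁(A) ≈ 6.6; this agrees with the printed bounds RCOND < 0.2 and L1 condition number < 10.0.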
Comments
1.  Workspace may be explicitly provided, if desired, by use of L2CCT/DL2CCT. The reference is:

        CALL L2CCT (N, A, LDA, IPATH, RCOND, CWK)

    The additional argument is:

    CWK — Complex work vector of length N.

2.  Informational error

    Type   Code   Description
    3      1      The input triangular matrix is algorithmically singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the coefficient matrix of the triangular linear system. (Input)
For a lower triangular system, only the lower triangular part and diagonal of A are referenced. For an
upper triangular system, only the upper triangular part and diagonal of A are referenced.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
An estimate of the reciprocal condition number is computed for a 3 × 3 lower triangular coefficient matrix.
      USE LFCCT_INT
      USE UMACH_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3)
      INTEGER    NOUT
      REAL       RCOND
      COMPLEX    A(LDA,LDA)
!                                  Set values for A
!                                  A = ( -3.0+2.0i                       )
!                                      ( -2.0-1.0i   0.0+6.0i            )
!                                      ( -1.0+3.0i   1.0-5.0i  -4.0+0.0i )
      DATA A/(-3.0,2.0), (-2.0,-1.0), (-1.0, 3.0), (0.0,0.0), (0.0,6.0),&
          (1.0,-5.0), (0.0,0.0), (0.0,0.0), (-4.0,0.0)/
!                                  Compute the reciprocal condition
!                                  number
      CALL LFCCT (A, RCOND)
!                                  Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.2
L1 Condition number < 10.0
ScaLAPACK Example
The same lower triangular matrix as in the example above is used in this distributed computing example. An
estimate of the reciprocal condition number is computed for a 3 × 3 lower triangular coefficient matrix.
SCALAPACK_MAP is an IMSL utility routine (see Chapter 11, “Utilities”) used to map an array to the processor grid. It is used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFCCT_INT
      USE UMACH_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    LDA, N, NOUT, DESCA(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL       RCOND
      COMPLEX, ALLOCATABLE ::  A(:,:)
      COMPLEX, ALLOCATABLE ::  A0(:,:)
      PARAMETER (LDA=3, N=3)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N))
!                                  Set values for A
         A(1,:) = (/ (-3.0, 2.0), (0.0, 0.0), ( 0.0, 0.0)/)
         A(2,:) = (/ (-2.0, -1.0), (0.0, 6.0), ( 0.0, 0.0)/)
         A(3,:) = (/ (-1.0, 3.0), (1.0, -5.0), (-4.0, 0.0)/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptor
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL))
!                                  Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                  Compute the reciprocal condition
!                                  number
      CALL LFCCT (A0, RCOND)
!                                  Print results.
!                                  Only Rank=0 has the solution, RCOND.
      IF (MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
      ENDIF
      IF (MP_RANK .EQ. 0) DEALLOCATE(A)
      DEALLOCATE(A0)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.2
L1 Condition number < 10.0
LFDCT
Computes the determinant of a complex triangular matrix.
Required Arguments
A — Complex N by N matrix containing the triangular matrix. (Input)
DET1 — Complex scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ ∣DET1∣ < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LFDCT (A, DET1, DET2 [,…])
Specific:
The specific interface names are S_LFDCT and D_LFDCT.
FORTRAN 77 Interface
Single:
CALL LFDCT (N, A, LDA, DET1, DET2)
Double:
The double precision name is DLFDCT.
Description
Routine LFDCT computes the determinant of a complex triangular coefficient matrix. The determinant of a triangular matrix is the product of the diagonal elements,

    det(A) = ∏ᵢ₌₁ᴺ Aᵢᵢ
LFDCT is based on the LINPACK routine CTRDI; see Dongarra et al. (1979).
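For instance, for the matrix used in the example below, the product of the diagonal elements is (-3+2i)(0+6i)(-4+0i) = 48 + 72i, which is returned in normalized form as DET1 ≈ (0.48, 0.72) and DET2 = 2.0 and printed, rounded to one decimal place, as (0.5, 0.7) * 10**2.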
Comments
Informational error
Type   Code   Description
3      1      The input triangular matrix is singular.
Example
The determinant is computed for a 3 × 3 complex lower triangular matrix.
      USE LFDCT_INT
      USE UMACH_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    NOUT
      REAL       DET2
      COMPLEX    A(LDA,LDA), DET1
!                                  Set values for A
!                                  A = ( -3.0+2.0i                       )
!                                      ( -2.0-1.0i   0.0+6.0i            )
!                                      ( -1.0+3.0i   1.0-5.0i  -4.0+0.0i )
      DATA A/(-3.0,2.0), (-2.0,-1.0), (-1.0, 3.0), (0.0,0.0), (0.0,6.0),&
          (1.0,-5.0), (0.0,0.0), (0.0,0.0), (-4.0,0.0)/
!                                  Compute the determinant of A
      CALL LFDCT (A, DET1, DET2)
!                                  Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
99999 FORMAT (' The determinant of A is (',F4.1,',',F4.1,') * 10**',&
            F2.0)
      END
Output
The determinant of A is ( 0.5, 0.7) * 10**2.
LINCT
Computes the inverse of a complex triangular matrix.
Required Arguments
A — Complex N by N matrix containing the triangular matrix to be inverted. (Input)
For a lower triangular matrix, only the lower triangle of A is referenced. For an upper triangular
matrix, only the upper triangle of A is referenced.
AINV — Complex N by N matrix containing the inverse of A. (Output)
If A is lower triangular, AINV is also lower triangular. If A is upper triangular, AINV is also upper triangular. If A is not needed, A and AINV can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means A is lower triangular.
IPATH = 2 means A is upper triangular.
Default: IPATH = 1.
LDAINV — Leading dimension of AINV exactly as specified in the dimension statement of the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
CALL LINCT (A, AINV [,…])
Specific:
The specific interface names are S_LINCT and D_LINCT.
FORTRAN 77 Interface
Single:
CALL LINCT (N, A, LDA, IPATH, AINV, LDAINV)
Double:
The double precision name is DLINCT.
Description
Routine LINCT computes the inverse of a complex triangular matrix. It fails if A has a zero diagonal element.
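As in the real case, the inverse is triangular with diagonal elements equal to the reciprocals of the diagonal of A; for the example below, 1/(-3+2i) ≈ (-0.2308, -0.1538), 1/(0+6i) ≈ (0.0000, -0.1667) and 1/(-4+0i) = (-0.2500, 0.0000), matching the diagonal of AINV in the output.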
Comments
Informational error
Type   Code   Description
4      1      The input triangular matrix is singular. Some of its diagonal elements are close to zero.
Example
The inverse is computed for a 3 × 3 lower triangular matrix.
      USE LINCT_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA
      PARAMETER  (LDA=3)
      COMPLEX    A(LDA,LDA), AINV(LDA,LDA)
!                                  Set values for A
!                                  A = ( -3.0+2.0i                       )
!                                      ( -2.0-1.0i   0.0+6.0i            )
!                                      ( -1.0+3.0i   1.0-5.0i  -4.0+0.0i )
      DATA A/(-3.0,2.0), (-2.0,-1.0), (-1.0, 3.0), (0.0,0.0), (0.0,6.0),&
          (1.0,-5.0), (0.0,0.0), (0.0,0.0), (-4.0,0.0)/
!                                  Compute the inverse of A
      CALL LINCT (A, AINV)
!                                  Print results
      CALL WRCRN ('AINV', AINV)
      END
Output
                                AINV
                    1                    2                    3
1   (-0.2308,-0.1538)    ( 0.0000, 0.0000)    ( 0.0000, 0.0000)
2   (-0.0897, 0.0513)    ( 0.0000,-0.1667)    ( 0.0000, 0.0000)
3   ( 0.2147,-0.0096)    (-0.2083,-0.0417)    (-0.2500, 0.0000)
LSADS
Solves a real symmetric positive definite system of linear equations with iterative refinement.
Required Arguments
A — N by N matrix containing the coefficient matrix of the symmetric positive definite linear system.
(Input)
Only the upper triangle of A is referenced.
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSADS (A, B, X [,…])
Specific:
The specific interface names are S_LSADS and D_LSADS.
FORTRAN 77 Interface
Single:
CALL LSADS (N, A, LDA, B, X)
Double:
The double precision name is DLSADS.
ScaLAPACK Interface
Generic:
CALL LSADS (A0, B0, X0 [,…])
Specific:
The specific interface names are S_LSADS and D_LSADS.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSADS solves a system of linear algebraic equations having a real symmetric positive definite coefficient matrix. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. LSADS first uses the routine
LFCDS to compute an RᵀR Cholesky factorization of the coefficient matrix and to estimate the condition
number of the matrix. The matrix R is upper triangular. The solution of the linear system is then found using
the iterative refinement routine LFIDS. LSADS fails if any submatrix of R is not positive definite, if R has a
zero diagonal element or if the iterative refinement algorithm fails to converge. These errors occur only if A is
either very close to a singular matrix or a matrix which is not positive definite. If the estimated condition
number is greater than 1/ɛ (where ɛ is machine precision), a warning error is issued. This indicates that very
small changes in A can cause very large changes in the solution x. Iterative refinement can sometimes find
the solution to such a system. LSADS solves the problem that is represented in the computer; however, this
problem may differ from the problem whose solution is desired.
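As an illustration of this two-step process, the factorization and refinement can also be invoked explicitly with LFCDS and LFIDS, which are documented later in this chapter. The following sketch reuses the data of the example below; it is illustrative only and omits the condition-number test and the option handling that LSADS performs internally.

      USE LFCDS_INT
      USE LFIDS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N), RES(N), FACT(LDA,LDA), RCOND
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
      DATA B/27.0, -78.0, 64.0/
!                                  Factor A and estimate its condition number
      CALL LFCDS (A, FACT, RCOND)
!                                  Solve Ax = b with one call to the
!                                  iterative refinement routine
      CALL LFIDS (A, FACT, B, X, RES)
!                                  Print the solution (1.000  -4.000  7.000)
      CALL WRRRN ('X', X, 1, N, 1)
      END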
Comments
1.  Workspace may be explicitly provided, if desired, by use of L2ADS/DL2ADS. The reference is:

        CALL L2ADS (N, A, LDA, B, X, FACT, WK)

    The additional arguments are as follows:

    FACT — Work vector of length N² containing the RᵀR factorization of A on output.

    WK — Work vector of length N.

2.  Informational errors

    Type   Code   Description
    3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
    4      2      The input matrix is not positive definite.

3.  Integer Options with Chapter 11 Options Manager

    16  This option uses four values to solve memory bank conflict (access inefficiency) problems. In
        routine L2ADS the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
        IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
        respectively, in LSADS. Additional memory allocation for FACT and option value restoration are
        done automatically in LSADS. Users directly calling L2ADS can allocate additional space for
        FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
        There is no requirement that users change existing applications that use LSADS or L2ADS.
        Default values for the option are IVAL(*) = 1, 16, 0, 1.

    17  This option has two values that determine if the L1 condition number is to be computed. Routine
        LSADS temporarily replaces IVAL(2) by IVAL(1). The routine L2CDS computes the condition
        number if IVAL(2) = 2. Otherwise L2CDS skips this computation. LSADS restores the option.
        Default values for the option are IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
LSADS
Chapter 1: Linear Systems
210
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the coefficient matrix of the symmetric positive definite linear system. (Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A system of three linear equations is solved. The coefficient matrix has real positive definite form and the
right-hand-side vector b has three elements.
      USE LSADS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N)
!                                  Set values for A and B
!                                  A = (  1.0  -3.0   2.0 )
!                                      ( -3.0  10.0  -5.0 )
!                                      (  2.0  -5.0   6.0 )
!
!                                  B = ( 27.0  -78.0  64.0 )
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
      DATA B/27.0, -78.0, 64.0/
!
      CALL LSADS (A, B, X)
!                                  Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output
         X
     1       2       3
 1.000  -4.000   7.000
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The coefficient
matrix has real positive definite form and the right-hand-side vector b has three elements. SCALAPACK_MAP
and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays
to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine
which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSADS_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=3)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                  Set values for A and B
         A(1,:) = (/  1.0, -3.0,  2.0/)
         A(2,:) = (/ -3.0, 10.0, -5.0/)
         A(3,:) = (/  2.0, -5.0,  6.0/)
!
         B = (/27.0, -78.0, 64.0/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                  Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                  Solve the system of equations
      CALL LSADS (A0, B0, X0)
!                                  Unmap the results from the distributed
!                                  arrays back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                  Print results.
!                                  Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0) CALL WRRRN ('X', X, 1, N, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
         X
     1       2       3
 1.000  -4.000   7.000
LSLDS
Solves a real symmetric positive definite system of linear equations without iterative refinement.
Required Arguments
A — N by N matrix containing the coefficient matrix of the symmetric positive definite linear system.
(Input)
Only the upper triangle of A is referenced.
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSLDS (A, B, X [, …])
Specific:
The specific interface names are S_LSLDS and D_LSLDS.
FORTRAN 77 Interface
Single:
CALL LSLDS (N, A, LDA, B, X)
Double:
The double precision name is DLSLDS.
ScaLAPACK Interface
Generic:
CALL LSLDS (A0, B0, X0 [, …])
Specific:
The specific interface names are S_LSLDS and D_LSLDS.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSLDS solves a system of linear algebraic equations having a real symmetric positive definite coefficient matrix. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. LSLDS first uses the routine
LFCDS to compute an RᵀR Cholesky factorization of the coefficient matrix and to estimate the condition
number of the matrix. The matrix R is upper triangular. The solution of the linear system is then found using
the routine LFSDS. LSLDS fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors occur only if A either is very close to a singular matrix or to a matrix which is not positive
definite. If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the solution x. If
the coefficient matrix is ill-conditioned, it is recommended that LSADS be used.
Comments
1.  Workspace may be explicitly provided, if desired, by use of L2LDS/DL2LDS. The reference is:

        CALL L2LDS (N, A, LDA, B, X, FACT, WK)

    The additional arguments are as follows:

    FACT — N × N work array containing the RᵀR factorization of A on output. If A is not needed, A can
           share the same storage locations as FACT.

    WK — Work vector of length N.

2.  Informational errors

    Type   Code   Description
    3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
    4      2      The input matrix is not positive definite.

3.  Integer Options with Chapter 11 Options Manager

    16  This option uses four values to solve memory bank conflict (access inefficiency) problems. In
        routine L2LDS the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
        IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
        respectively, in LSLDS. Additional memory allocation for FACT and option value restoration are
        done automatically in LSLDS. Users directly calling L2LDS can allocate additional space for
        FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
        There is no requirement that users change existing applications that use LSLDS or L2LDS.
        Default values for the option are IVAL(*) = 1, 16, 0, 1.

    17  This option has two values that determine if the L1 condition number is to be computed. Routine
        LSLDS temporarily replaces IVAL(2) by IVAL(1). The routine L2CDS computes the condition
        number if IVAL(2) = 2. Otherwise L2CDS skips this computation. LSLDS restores the option.
        Default values for the option are IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
LSLDS
Chapter 1: Linear Systems
215
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the coefficient matrix of the symmetric positive definite linear system. (Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A system of three linear equations is solved. The coefficient matrix has real positive definite form and the
right-hand-side vector b has three elements.
      USE LSLDS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N)
!                                  Set values for A and B
!                                  A = (  1.0  -3.0   2.0 )
!                                      ( -3.0  10.0  -5.0 )
!                                      (  2.0  -5.0   6.0 )
!
!                                  B = ( 27.0  -78.0  64.0 )
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
      DATA B/27.0, -78.0, 64.0/
!
      CALL LSLDS (A, B, X)
!                                  Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output
         X
     1       2       3
 1.000  -4.000   7.000
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The coefficient
matrix has real positive definite form and the right-hand-side vector b has three elements. SCALAPACK_MAP
and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays
to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine
which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSLDS_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=3)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                  Set values for A and B
         A(1,:) = (/  1.0, -3.0,  2.0/)
         A(2,:) = (/ -3.0, 10.0, -5.0/)
         A(3,:) = (/  2.0, -5.0,  6.0/)
!
         B = (/27.0, -78.0, 64.0/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                  Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                  Solve the system of equations
      CALL LSLDS (A0, B0, X0)
!                                  Unmap the results from the distributed
!                                  arrays back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                  Print results.
!                                  Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0) CALL WRRRN ('X', X, 1, N, 1)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
         X
     1       2       3
 1.000  -4.000   7.000
LFCDS
Computes the RᵀR Cholesky factorization of a real symmetric positive definite matrix and estimates its L1 condition number.
Required Arguments
A — N by N symmetric positive definite matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT — N by N matrix containing the upper triangular matrix R of the factorization of A in the upper triangular part. (Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share the same storage locations.
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCDS (A, FACT, RCOND [, …])
Specific:
The specific interface names are S_LFCDS and D_LFCDS.
FORTRAN 77 Interface
Single:
CALL LFCDS (N, A, LDA, FACT, LDFACT, RCOND)
Double:
The double precision name is DLFCDS.
ScaLAPACK Interface
Generic:
CALL LFCDS (A0, FACT0, RCOND [, …])
Specific:
The specific interface names are S_LFCDS and D_LFCDS.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFCDS computes an RᵀR Cholesky factorization and estimates the condition number of a real symmetric positive definite coefficient matrix. The matrix R is upper triangular.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥₁∥A⁻¹∥₁. Since it is expensive to compute ∥A⁻¹∥₁, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCDS fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
The RᵀR factors are returned in a form that is compatible with routines LFIDS, LFSDS and LFDDS. To solve
systems of equations with multiple right-hand-side vectors, use LFCDS followed by either LFIDS or LFSDS
called once for each right-hand side. The routine LFDDS can be called to compute the determinant of the coefficient matrix after LFCDS has performed the factorization.
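For the 3 × 3 matrix A used in the examples below, the factor is, to working precision,

    R = (  1.0  -3.0   2.0 )
        (  0.0   1.0   1.0 )
        (  0.0   0.0   1.0 )

and forming RᵀR reproduces A.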
Comments
1.  Workspace may be explicitly provided, if desired, by use of L2CDS/DL2CDS. The reference is:

        CALL L2CDS (N, A, LDA, FACT, LDFACT, RCOND, WK)

    The additional argument is:

    WK — Work vector of length N.

2.  Informational errors

    Type   Code   Description
    3      1      The input matrix is algorithmically singular.
    4      2      The input matrix is not positive definite.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the symmetric positive definite matrix to be factored. (Input)
LFCDS
Chapter 1: Linear Systems
220
FACT0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix FACT.
FACT contains the upper triangular matrix R of the factorization of A in the upper triangular part.
(Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse of a 3 × 3 matrix is computed. LFCDS is called to factor the matrix and to check for nonpositive
definiteness or ill-conditioning. LFIDS is called to determine the columns of the inverse.
      USE LFCDS_INT
      USE UMACH_INT
      USE WRRRN_INT
      USE LFIDS_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NOUT
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      REAL       A(LDA,LDA), AINV(LDA,LDA), RCOND, FACT(LDFACT,LDFACT),&
                 RES(N), RJ(N)
!                                  Set values for A
!                                  A = (  1.0  -3.0   2.0 )
!                                      ( -3.0  10.0  -5.0 )
!                                      (  2.0  -5.0   6.0 )
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!                                  Factor the matrix A
      CALL LFCDS (A, FACT, RCOND)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0E0
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFIDS
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFIDS (A, FACT, RJ, AINV(:,J), RES)
         RJ(J) = 0.0E0
   10 CONTINUE
!                                  Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
      CALL WRRRN ('AINV', AINV)
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F9.3)
      END
Output
RCOND < 0.005
L1 Condition number < 875.0
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. LFCDS is called to factor the
matrix and to check for singularity or ill-conditioning. LFIDS is called to determine the columns of the
inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used
to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFCDS_INT
      USE UMACH_INT
      USE LFIDS_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    J, LDA, N, NOUT, DESCA(9), DESCL(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), AINV(:,:), X0(:), RJ(:)
      REAL, ALLOCATABLE ::  A0(:,:), FACT0(:,:), RES0(:), RJ0(:)
      REAL       RCOND
      PARAMETER (LDA=3, N=3)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                  Set values for A
         A(1,:) = (/  1.0, -3.0,  2.0/)
         A(2,:) = (/ -3.0, 10.0, -5.0/)
         A(3,:) = (/  2.0, -5.0,  6.0/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), RJ(N), &
          RJ0(MXLDA), RES0(MXLDA))
!                                  Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                  Call the factorization routine
      CALL LFCDS (A0, FACT0, RCOND)
!                                  Print the reciprocal condition number
!                                  and the L1 condition number
      IF(MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
      ENDIF
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0
!                                  Map input array to the processor grid
         CALL SCALAPACK_MAP(RJ, DESCL, RJ0)
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFIDS
!                                  reference computes the J-th column of
!                                  the inverse of A
         CALL LFIDS (A0, FACT0, RJ0, X0, RES0)
         RJ(J) = 0.0
         CALL SCALAPACK_UNMAP(X0, DESCL, AINV(:,J))
   10 CONTINUE
!                                  Print results.
!                                  Only Rank=0 has the solution, AINV.
      IF(MP_RANK.EQ.0) CALL WRRRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, FACT0, RJ, RJ0, RES0, X0)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F9.3)
      END
Output
RCOND < 0.005
L1 Condition number < 875.0
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
LFTDS
Computes the RᵀR Cholesky factorization of a real symmetric positive definite matrix.
Required Arguments
A — N by N symmetric positive definite matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT — N by N matrix containing the upper triangular matrix R of the factorization of A in the upper triangle, and the lower triangular matrix Rᵀ in the lower triangle. (Output)
If A is not needed, A and FACT can share the same storage location.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTDS (A, FACT [, …])
Specific:
The specific interface names are S_LFTDS and D_LFTDS.
FORTRAN 77 Interface
Single:
CALL LFTDS (N, A, LDA, FACT, LDFACT)
Double:
The double precision name is DLFTDS.
ScaLAPACK Interface
Generic:
CALL LFTDS (A0, FACT0 [, …])
Specific:
The specific interface names are S_LFTDS and D_LFTDS.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFTDS computes an RᵀR Cholesky factorization of a real symmetric positive definite coefficient
matrix. The matrix R is upper triangular.
LFTDS fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
The RᵀR factors are returned in a form that is compatible with routines LFIDS, LFSDS and LFDDS. To solve
systems of equations with multiple right-hand-side vectors, use LFTDS followed by either LFIDS or LFSDS
called once for each right-hand side. The routine LFDDS can be called to compute the determinant of the coefficient matrix after LFTDS has performed the factorization.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
Informational error
Type   Code   Description
4      2      The input matrix is not positive definite.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the symmetric positive definite matrix to be factored. (Input)
FACT0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix FACT.
FACT contains the upper triangular matrix R of the factorization of A in the upper triangular part.
(Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse of a 3 × 3 matrix is computed. LFTDS is called to factor the matrix and to check for nonpositive
definiteness. LFSDS is called to determine the columns of the inverse.
      USE LFTDS_INT
      USE LFSDS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      REAL       A(LDA,LDA), AINV(LDA,LDA), FACT(LDFACT,LDFACT), RJ(N)
!                                  Set values for A
!                                  A = (  1.0  -3.0   2.0 )
!                                      ( -3.0  10.0  -5.0 )
!                                      (  2.0  -5.0   6.0 )
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!                                  Factor the matrix A
      CALL LFTDS (A, FACT)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0E0
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFSDS
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFSDS (FACT, RJ, AINV(:,J))
         RJ(J) = 0.0E0
   10 CONTINUE
!                                  Print the results
      CALL WRRRN ('AINV', AINV)
      END
Output
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. LFTDS is called to factor the
matrix and to check for nonpositive definiteness. LFSDS is called to determine the columns of the inverse.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map
and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFTDS_INT
      USE UMACH_INT
      USE LFSDS_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCL(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), AINV(:,:), X0(:), RJ(:)
      REAL, ALLOCATABLE ::  A0(:,:), FACT0(:,:), RES0(:), RJ0(:)
      PARAMETER (LDA=3, N=3)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                  Set values for A
         A(1,:) = (/  1.0, -3.0,  2.0/)
         A(2,:) = (/ -3.0, 10.0, -5.0/)
         A(3,:) = (/  2.0, -5.0,  6.0/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), RJ(N), &
          RJ0(MXLDA), RES0(MXLDA))
!                                  Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                  Call the factorization routine
      CALL LFTDS (A0, FACT0)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0
         CALL SCALAPACK_MAP(RJ, DESCL, RJ0)
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFSDS
!                                  reference computes the J-th column of
!                                  the inverse of A
         CALL LFSDS (FACT0, RJ0, X0)
         RJ(J) = 0.0
         CALL SCALAPACK_UNMAP(X0, DESCL, AINV(:,J))
   10 CONTINUE
!                                  Print results.
!                                  Only Rank=0 has the solution, AINV.
      IF(MP_RANK.EQ.0) CALL WRRRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, FACT0, RJ, RJ0, RES0, X0)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
RCOND < 0.005
L1 Condition number < 875.0
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
LFSDS
Solves a real symmetric positive definite system of linear equations given the RᵀR Cholesky factorization of
the coefficient matrix.
Required Arguments
FACT — N by N matrix containing the RᵀR factorization of the coefficient matrix A as output from routine
LFCDS/DLFCDS or LFTDS/DLFTDS. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFSDS (FACT, B, X [, …])
Specific:
The specific interface names are S_LFSDS and D_LFSDS.
FORTRAN 77 Interface
Single:
CALL LFSDS (N, FACT, LDFACT, B, X)
Double:
The double precision name is DLFSDS.
ScaLAPACK Interface
Generic:
CALL LFSDS (FACT0, B0, X0 [, …])
Specific:
The specific interface names are S_LFSDS and D_LFSDS.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFSDS computes the solution for a system of linear algebraic equations having a real symmetric positive definite coefficient matrix. To compute the solution, the coefficient matrix must first undergo an RᵀR factorization. This may be done by calling either LFCDS or LFTDS. R is an upper triangular matrix.
The solution to Ax = b is found by solving the triangular systems Rᵀy = b and Rx = y.
LFSDS and LFIDS both solve a linear system given its RᵀR factorization. LFIDS generally takes more time
and produces a more accurate answer than LFSDS. Each iteration of the iterative refinement algorithm used
by LFIDS calls LFSDS.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
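For example, with the factor R of the matrix used below (its rows are (1, -3, 2), (0, 1, 1) and (0, 0, 1)) and the first right-hand side b = (-1, -3, -3), forward substitution in Rᵀy = b gives y = (-1, -6, 5), and back substitution in Rx = y gives x = (-44, -11, 5), the first solution column shown in the output.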
Comments
Informational error
Type   Code   Description
4      1      The input matrix is singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
FACT0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix FACT.
FACT contains the RᵀR factorization of the coefficient matrix A as output from routine
LFCDS/DLFCDS or LFTDS/DLFTDS. (Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A set of linear systems is solved successively. LFTDS is called to factor the coefficient matrix. LFSDS is called
to compute the four solutions for the four right-hand sides. In this case the coefficient matrix is assumed to be
well-conditioned and correctly scaled. Otherwise, it would be better to call LFCDS to perform the factorization, and LFIDS to compute the solutions.
      USE LFSDS_INT
      USE LFTDS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      REAL       A(LDA,LDA), B(N,4), FACT(LDFACT,LDFACT), X(N,4)
!                                  Set values for A and B
!                                  A = (  1.0  -3.0   2.0 )
!                                      ( -3.0  10.0  -5.0 )
!                                      (  2.0  -5.0   6.0 )
!
!                                  B = ( -1.0   3.6   -8.0   -9.4 )
!                                      ( -3.0  -4.2   11.0   17.6 )
!                                      ( -3.0  -5.2   -6.0  -23.4 )
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
      DATA B/-1.0, -3.0, -3.0, 3.6, -4.2, -5.2, -8.0, 11.0, -6.0,&
          -9.4, 17.6, -23.4/
!                                  Factor the matrix A
      CALL LFTDS (A, FACT)
!                                  Compute the solutions
      DO 10 I=1, 4
         CALL LFSDS (FACT, B(:,I), X(:,I))
   10 CONTINUE
!                                  Print solutions
      CALL WRRRN ('The solution vectors are', X)
      END
Output
      The solution vectors are
         1        2        3        4
1    -44.0    118.4   -162.0    -71.2
2    -11.0     25.6    -36.0    -16.6
3      5.0    -19.0     23.0      6.0
ScaLAPACK Example
The same set of linear systems is solved successively as a distributed example. Routine LFTDS is called to factor the coefficient matrix. The routine LFSDS is called to compute the four solutions for the four right-hand
sides. In this case, the coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it
would be better to call LFCDS to perform the factorization, and LFIDS to compute the solutions.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map
and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFSDS_INT
      USE LFTDS_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCL(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), B(:,:), X(:,:), X0(:)
      REAL, ALLOCATABLE ::  A0(:,:), FACT0(:,:), B0(:)
      PARAMETER (LDA=3, N=3)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N,4), X(N,4))
!                                  Set values for A and B
         A(1,:) = (/  1.0, -3.0,  2.0/)
         A(2,:) = (/ -3.0, 10.0, -5.0/)
         A(3,:) = (/  2.0, -5.0,  6.0/)
!
         B(1,:) = (/ -1.0,  3.6,  -8.0,  -9.4/)
         B(2,:) = (/ -3.0, -4.2,  11.0,  17.6/)
         B(3,:) = (/ -3.0, -5.2,  -6.0, -23.4/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), B0(MXLDA))
!                                  Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                  Call the factorization routine
      CALL LFTDS (A0, FACT0)
!                                  Set up the columns of the B
!                                  matrix one at a time in X0
      DO 10 J=1, 4
         CALL SCALAPACK_MAP(B(:,J), DESCL, B0)
!                                  Solve for the J-th column of X
         CALL LFSDS (FACT0, B0, X0)
         CALL SCALAPACK_UNMAP(X0, DESCL, X(:,J))
   10 CONTINUE
!                                  Print results.
!                                  Only Rank=0 has the solution, X.
      IF(MP_RANK.EQ.0) CALL WRRRN ('The solution vectors are', X)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, FACT0, B0, X0)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
      The solution vectors are
         1        2        3        4
1    -44.0    118.4   -162.0    -71.2
2    -11.0     25.6    -36.0    -16.6
3      5.0    -19.0     23.0      6.0
LFIDS
Uses iterative refinement to improve the solution of a real symmetric positive definite system of linear
equations.
Required Arguments
A — N by N matrix containing the symmetric positive definite coefficient matrix of the linear system.
(Input)
Only the upper triangle of A is referenced.
FACT — N by N matrix containing the R^T R factorization of the coefficient matrix A as output from routine
LFCDS/DLFCDS or LFTDS/DLFTDS. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
RES — Vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFIDS (A, FACT, B, X, RES [, …])
Specific:
The specific interface names are S_LFIDS and D_LFIDS.
FORTRAN 77 Interface
Single:
CALL LFIDS (N, A, LDA, FACT, LDFACT, B, X, RES)
Double:
The double precision name is DLFIDS.
ScaLAPACK Interface
Generic:
CALL LFIDS (A0, FACT0, B0, X0, RES0 [, …])
Specific:
The specific interface names are S_LFIDS and D_LFIDS.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFIDS computes the solution of a system of linear algebraic equations having a real symmetric positive definite coefficient matrix. Iterative refinement is performed on the solution vector to improve the
accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned. The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon
which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
To compute the solution, the coefficient matrix must first undergo an R^T R factorization. This may be done by
calling either LFCDS or LFTDS.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIDS and LFSDS both solve a linear system given its R^T R factorization. LFIDS generally takes more time
and produces a more accurate answer than LFSDS. Each iteration of the iterative refinement algorithm used
by LFIDS calls LFSDS.
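For readers who want to see the shape of the refinement loop, one iteration of the standard algorithm is sketched below. This is a generic illustration added here for clarity; the stopping test and the precision used by LFIDS to accumulate the residual are not spelled out in this sketch.

   r^{(k)} = b - A x^{(k)}            (residual at the current iterate)
   R^T R \, d^{(k)} = r^{(k)}         (correction, solved with the stored factors via LFSDS)
   x^{(k+1)} = x^{(k)} + d^{(k)}      (updated solution; stop when d^{(k)} is negligible)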
Comments
Informational error
   Type   Code   Description
   3      2      The input matrix is too ill-conditioned for iterative refinement to be effective.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the symmetric positive definite coefficient matrix of the linear system. (Input)
FACT0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix FACT.
FACT contains the R^T R factorization of the coefficient matrix A as output from routine
LFCDS/DLFCDS or LFTDS/DLFTDS. (Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXLDA containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
RES0 — Local vector of length MXLDA containing the local portions of the distributed vector RES. RES
contains the residual vector at the improved solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving the system each of the first two times by adding 0.2 to the second element.
      USE LFIDS_INT
      USE LFCDS_INT
      USE UMACH_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      REAL       A(LDA,LDA), B(N), RCOND, FACT(LDFACT,LDFACT), RES(N,3),&
                 X(N,3)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
!                                 B = (  1.0  -3.0   2.0)
!
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
      DATA B/1.0, -3.0, 2.0/
!                                 Factor the matrix A
      CALL LFCDS (A, FACT, RCOND)
!                                 Print the estimated condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Compute the solutions
      DO 10 I=1, 3
         CALL LFIDS (A, FACT, B, X(:,I), RES(:,I))
         B(2) = B(2) + .2E0
   10 CONTINUE
!                                 Print solutions and residuals
      CALL WRRRN ('The solution vectors are', X)
      CALL WRRRN ('The residual vectors are', RES)
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F9.3)
      END
Output
RCOND = 0.001
L1 Condition number =  674.727

     The solution vectors are
        1        2        3
1   1.000    2.600    4.200
2   0.000    0.400    0.800
3   0.000   -0.200   -0.400

     The residual vectors are
         1         2         3
1   0.0000    0.0000    0.0000
2   0.0000    0.0000    0.0000
3   0.0000    0.0000    0.0000
ScaLAPACK Example
The same set of linear systems is solved successively as a distributed example. The right-hand-side vector is
perturbed after solving the system each of the first two times by adding 0.2 to the second element.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map
and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LFIDS_INT
USE LFCDS_INT
USE UMACH_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    J, LDA, N, NOUT, DESCA(9), DESCL(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL       RCOND
      REAL, ALLOCATABLE ::   A(:,:), B(:), X(:,:), RES(:,:), X0(:)
      REAL, ALLOCATABLE ::   A0(:,:), FACT0(:,:), B0(:), RES0(:)
      PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N,3), RES(N,3))
!                                 Set values for A and B
         A(1,:) = (/  1.0, -3.0,  2.0/)
         A(2,:) = (/ -3.0, 10.0, -5.0/)
         A(3,:) = (/  2.0, -5.0,  6.0/)
!
         B      = (/  1.0, -3.0,  2.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), B0(MXLDA), &
               RES0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Call the factorization routine
      CALL LFCDS (A0, FACT0, RCOND)
!                                 Print the estimated condition number
      CALL UMACH (2, NOUT)
      IF(MP_RANK .EQ. 0) WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Set up the columns of the B
!                                 matrix one at a time in X0
      DO 10 J=1, 3
         CALL SCALAPACK_MAP(B, DESCL, B0)
!                                 Solve for the J-th column of X
         CALL LFIDS (A0, FACT0, B0, X0, RES0)
         CALL SCALAPACK_UNMAP(X0, DESCL, X(:,J))
         CALL SCALAPACK_UNMAP(RES0, DESCL, RES(:,J))
         IF(MP_RANK .EQ. 0) B(2) = B(2) + .2E0
   10 CONTINUE
!                                 Print results.
!                                 Only Rank=0 has the full arrays
      IF(MP_RANK.EQ.0) CALL WRRRN ('The solution vectors are', X)
      IF(MP_RANK.EQ.0) CALL WRRRN ('The residual vectors are', RES)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X, RES)
      DEALLOCATE(A0, B0, FACT0, RES0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F9.3)
      END
Output
RCOND = 0.001
L1 Condition number =  674.727

     The solution vectors are
        1        2        3
1   1.000    2.600    4.200
2   0.000    0.400    0.800
3   0.000   -0.200   -0.400

     The residual vectors are
         1         2         3
1   0.0000    0.0000    0.0000
2   0.0000    0.0000    0.0000
3   0.0000    0.0000    0.0000
LFDDS
Computes the determinant of a real symmetric positive definite matrix given the R^T R Cholesky factorization
of the matrix.
Required Arguments
FACT — N by N matrix containing the R^T R factorization of the coefficient matrix A as output from routine
LFCDS/DLFCDS or LFTDS/DLFTDS. (Input)
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that, 1.0 ≤ ∣DET1∣ < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form, det(A) = DET1 * 10^DET2.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDDS (FACT, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDDS and D_LFDDS.
FORTRAN 77 Interface
Single:
CALL LFDDS (N, FACT, LDFACT, DET1, DET2)
Double:
The double precision name is DLFDDS.
Description
Routine LFDDS computes the determinant of a real symmetric positive definite coefficient matrix. To compute the determinant, the coefficient matrix must first undergo an R^T R factorization. This may be done by
calling either LFCDS or LFTDS. The formula det A = det R^T det R = (det R)^2 is used to compute the determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,

   det A = ∏_{i=1}^{N} R_ii^2

(The matrix R is stored in the upper triangle of FACT.)
LFDDS is based on the LINPACK routine SPODI; see Dongarra et al. (1979).
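As an added cross-check (not part of the original manual text), expanding the determinant of the 3 × 3 matrix used in the example below along its first row gives

   \det A = 1\,(120 - 25) + 3\,(-18 + 10) + 2\,(15 - 40) = 95 - 24 - 50 = 21 = 2.1 \times 10^{1},

which agrees with the values DET1 = 2.100 and DET2 = 1. printed by the example.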
Example
The determinant is computed for a real positive definite 3 × 3 matrix.
      USE LFDDS_INT
      USE LFTDS_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, NOUT
      PARAMETER  (LDA=3, LDFACT=3)
      REAL       A(LDA,LDA), DET1, DET2, FACT(LDFACT,LDFACT)
!
!                                 Set values for A
!
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  20.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
      DATA A/1.0, -3.0, 2.0, -3.0, 20.0, -5.0, 2.0, -5.0, 6.0/
!                                 Factor the matrix
      CALL LFTDS (A, FACT)
!                                 Compute the determinant
      CALL LFDDS (FACT, DET1, DET2)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
      END
Output
The determinant of A is 2.100 * 10**1.
LINDS
Computes the inverse of a real symmetric positive definite matrix.
Required Arguments
A — N by N matrix containing the symmetric positive definite matrix to be inverted. (Input)
Only the upper triangle of A is referenced.
AINV — N by N matrix containing the inverse of A. (Output)
If A is not needed, A and AINV can share the same storage locations.
Optional Arguments
N — Order of the matrix A. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDAINV — Leading dimension of AINV exactly as specified in the dimension statement of the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
CALL LINDS (A, AINV [, …])
Specific:
The specific interface names are S_LINDS and D_LINDS.
FORTRAN 77 Interface
Single:
CALL LINDS (N, A, LDA, AINV, LDAINV)
Double:
The double precision name is DLINDS.
ScaLAPACK Interface
Generic:
CALL LINDS (A0, AINV0 [, …])
Specific:
The specific interface names are S_LINDS and D_LINDS.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LINDS computes the inverse of a real symmetric positive definite matrix. The underlying code is
based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which supporting libraries are
used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in
the Introduction section of this manual. LINDS first uses the routine LFCDS to compute an R^T R factorization
of the coefficient matrix and to estimate the condition number of the matrix. LINRT is then used to compute
R^-1. Finally, A^-1 is computed using A^-1 = R^-1 R^-T.
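The identity used in the last step is a direct consequence of the factorization; the one-line derivation below is added here for clarity:

   A = R^T R \;\Longrightarrow\; A^{-1} = (R^T R)^{-1} = R^{-1}\,(R^T)^{-1} = R^{-1} R^{-T}.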
LINDS fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in A^-1.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2NDS/DL2NDS. The reference is:
CALL L2NDS (N, A, LDA, AINV, LDAINV, WK)
The additional argument is:
WK — Work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
   4      2      The input matrix is not positive definite.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the symmetric positive definite matrix to be inverted. (Input)
AINV0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix AINV.
AINV contains the inverse of the matrix A. (Output)
If A is not needed, A and AINV can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse is computed for a real positive definite 3 × 3 matrix.
      USE LINDS_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, LDAINV
      PARAMETER  (LDA=3, LDAINV=3)
      REAL       A(LDA,LDA), AINV(LDAINV,LDAINV)
!
!                                 Set values for A
!
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!
      CALL LINDS (A, AINV)
!                                 Print results
      CALL WRRRN ('AINV', AINV)
!
      END
Output
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, “Utilities”) used to map and unmap arrays to
and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LINDS_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    J, LDA, LDFACT, N, DESCA(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::   A(:,:), AINV(:,:)
      REAL, ALLOCATABLE ::   A0(:,:), AINV0(:,:)
      PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/  1.0, -3.0,  2.0/)
         A(2,:) = (/ -3.0, 10.0, -5.0/)
         A(3,:) = (/  2.0, -5.0,  6.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), AINV0(MXLDA,MXCOL))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Call the routine to get the inverse
      CALL LINDS (A0, AINV0)
!                                 Unmap the results from the distributed
!                                 arrays back to a nondistributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(AINV0, DESCA, AINV)
!                                 Print results.
!                                 Only Rank=0 has the solution, AINV.
      IF(MP_RANK.EQ.0) CALL WRRRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, AINV0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
END
Output
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
LSASF
Solves a real symmetric system of linear equations with iterative refinement.
Required Arguments
A — N by N matrix containing the coefficient matrix of the symmetric linear system. (Input)
Only the upper triangle of A is referenced.
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSASF (A, B, X [, …])
Specific:
The specific interface names are S_LSASF and D_LSASF.
FORTRAN 77 Interface
Single:
CALL LSASF (N, A, LDA, B, X)
Double:
The double precision name is DLSASF.
Description
Routine LSASF solves systems of linear algebraic equations having a real symmetric indefinite coefficient
matrix. It first uses the routine LFCSF to compute a U DU^T factorization of the coefficient matrix and to estimate the condition number of the matrix. D is a block diagonal matrix with blocks of order 1 or 2, and U is a
matrix composed of the product of a permutation matrix and a unit upper triangular matrix. The solution of
the linear system is then found using the iterative refinement routine LFISF.
LSASF fails if a block in D is singular or if the iterative refinement algorithm fails to converge. These errors
occur only if A is singular or very close to a singular matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSASF solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
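Since the description above presents LSASF as a driver built on LFCSF and LFISF, an equivalent, more explicit call sequence is sketched below using the same data as the example later in this section. This is an illustration only, not the routine's actual source; FACT, IPVT, RES and RCOND are workspace names chosen for the sketch.

!                                 Sketch: the factor/refine pair that the
!                                 driver LSASF is documented to use
      USE LFCSF_INT
      USE LFISF_INT
      USE WRRRN_INT
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N)
      REAL       A(LDA,LDA), B(N), X(N), FACT(LDA,LDA), RES(N), RCOND
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/4.1, -4.7, 6.5/
!                                 Factor A and estimate its condition number
      CALL LFCSF (A, FACT, IPVT, RCOND)
!                                 Solve A*x = b with iterative refinement
      CALL LFISF (A, FACT, IPVT, B, X, RES)
!                                 Print the solution
      CALL WRRRN ('X', X, 1, N, 1)
      END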
Comments
1. Workspace may be explicitly provided, if desired, by use of L2ASF/DL2ASF. The reference is
CALL L2ASF (N, A, LDA, B, X, FACT, IPVT, WK)
The additional arguments are as follows:
FACT — N × N work array containing information about the U DU^T factorization of A on output. If
A is not needed, A and FACT can share the same storage location.
IPVT — Integer work vector of length N containing the pivoting information for the factorization
of A on output.
WK — Work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
   4      2      The input matrix is singular.
3. Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2ASF the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSASF. Additional memory allocation for FACT and option value restoration are
done automatically in LSASF. Users directly calling L2ASF can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSASF or L2ASF.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine LSASF temporarily replaces IVAL(2) by IVAL(1). The routine L2CSF computes the
condition number if IVAL(2) = 2. Otherwise L2CSF skips this computation. LSASF restores the
option. Default values for the option are IVAL(*) = 1, 2.
Example
A system of three linear equations is solved. The coefficient matrix has real symmetric form and the right-hand-side vector b has three elements.
USE LSASF_INT
USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
!                                 B = (  4.1  -4.7   6.5)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/4.1, -4.7, 6.5/
!
      CALL LSASF (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output
              X
      1        2        3
 -4.100   -3.500    1.200
LSLSF
Solves a real symmetric system of linear equations without iterative refinement.
Required Arguments
A — N by N matrix containing the coefficient matrix of the symmetric linear system. (Input)
Only the upper triangle of A is referenced.
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSLSF (A, B, X [, …])
Specific:
The specific interface names are S_LSLSF and D_LSLSF.
FORTRAN 77 Interface
Single:
CALL LSLSF (N, A, LDA, B, X)
Double:
The double precision name is DLSLSF.
Description
Routine LSLSF solves systems of linear algebraic equations having a real symmetric indefinite coefficient
matrix. It first uses the routine LFCSF to compute a U DU^T factorization of the coefficient matrix. D is a block
diagonal matrix with blocks of order 1 or 2, and U is a matrix composed of the product of a permutation
matrix and a unit upper triangular matrix.
The solution of the linear system is then found using the routine LFSSF.
LSLSF fails if a block in D is singular. This occurs only if A either is singular or is very close to a singular
matrix.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LSF/DL2LSF. The reference is:
CALL L2LSF (N, A, LDA, B, X, FACT, IPVT, WK)
The additional arguments are as follows:
FACT — N × N work array containing information about the U DU^T factorization of A on output. If
A is not needed, A and FACT can share the same storage locations.
IPVT — Integer work vector of length N containing the pivoting information for the factorization
of A on output.
WK — Work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
   4      2      The input matrix is singular.
3. Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine LSLSF the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSLSF. Additional memory allocation for FACT and option value restoration are
done automatically in LSLSF. Users directly calling L2LSF can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSLSF or L2LSF.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSLSF temporarily replaces IVAL(2) by IVAL(1). The routine L2CSF computes the condition
number if IVAL(2) = 2. Otherwise L2CSF skips this computation. LSLSF restores the option.
Default values for the option are IVAL(*) = 1, 2.
Example
A system of three linear equations is solved. The coefficient matrix has real symmetric form and the right-hand-side vector b has three elements.
USE LSLSF_INT
USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
!                                 B = (  4.1  -4.7   6.5)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/4.1, -4.7, 6.5/
!
      CALL LSLSF (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output
              X
      1        2        3
 -4.100   -3.500    1.200
LFCSF
Computes the U DU^T factorization of a real symmetric matrix and estimates its L1 condition number.
Required Arguments
A — N by N symmetric matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT — N by N matrix containing information about the factorization of the symmetric matrix A. (Output)
Only the upper triangle of FACT is used. If A is not needed, A and FACT can share the same storage
locations.
IPVT — Vector of length N containing the pivoting information for the factorization. (Output)
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCSF (A, FACT, IPVT, RCOND [, …])
Specific:
The specific interface names are S_LFCSF and D_LFCSF.
FORTRAN 77 Interface
Single:
CALL LFCSF (N, A, LDA, FACT, LDFACT, IPVT, RCOND)
Double:
The double precision name is DLFCSF.
Description
Routine LFCSF performs a U DU^T factorization of a real symmetric indefinite coefficient matrix. It also estimates the condition number of the matrix. The U DU^T factorization is called the diagonal pivoting
factorization.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥1 ∥A^-1∥1. Since it is expensive to compute
∥A^-1∥1, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK
and is described by Cline et al. (1979).
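A convenient rule of thumb, added here for orientation rather than taken from the original text: if the reciprocal estimate satisfies RCOND ≈ 10^-k, then roughly k decimal digits of accuracy may be lost in a computed solution, i.e.

   \kappa(A) \approx 1/\mathrm{RCOND}, \qquad \text{digits lost} \approx \log_{10} \kappa(A).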
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCSF fails if A is singular or very close to a singular matrix.
The U DU^T factors are returned in a form that is compatible with routines LFISF, LFSSF and LFDSF. To
solve systems of equations with multiple right-hand-side vectors, use LFCSF followed by either LFISF or
LFSSF called once for each right-hand side. The routine LFDSF can be called to compute the determinant of
the coefficient matrix after LFCSF has performed the factorization.
The underlying code is based on either LINPACK or LAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see “Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK” in the Introduction section of this manual.
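To make the factor-once, reuse-many pattern concrete, the short sketch below factors a matrix with LFCSF and then reuses the factors with LFDSF to obtain the determinant. It uses the same 3 × 3 matrix as the LFDSF example later in this chapter and is an illustration added here, not manual text.

!                                 Sketch: factor once with LFCSF, then
!                                 reuse the factors, here with LFDSF
      USE LFCSF_INT
      USE LFDSF_INT
      USE UMACH_INT
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), FACT(LDA,LDA), RCOND, DET1, DET2
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
!                                 Factor A and estimate its condition number
      CALL LFCSF (A, FACT, IPVT, RCOND)
!                                 Determinant from the stored factors
      CALL LFDSF (FACT, IPVT, DET1, DET2)
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'DET1 = ', DET1, '  DET2 = ', DET2
      END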
Comments
1. Workspace may be explicitly provided, if desired, by use of L2CSF/DL2CSF. The reference is:
CALL L2CSF (N, A, LDA, FACT, LDFACT, IPVT, RCOND, WK)
The additional argument is:
WK — Work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is algorithmically singular.
   4      2      The input matrix is singular.
Example
The inverse of a 3 × 3 matrix is computed. LFCSF is called to factor the matrix and to check for singularity or
ill-conditioning. LFISF is called to determine the columns of the inverse.
      USE LFCSF_INT
      USE UMACH_INT
      USE LFISF_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), AINV(N,N), FACT(LDA,LDA), RJ(N), RES(N),&
                 RCOND
!
!                                 Set values for A
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
!                                 Factor A and return the reciprocal
!                                 condition number estimate
      CALL LFCSF (A, FACT, IPVT, RCOND)
!                                 Print the estimate of the condition
!                                 number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.E0
      DO 10 J=1, N
         RJ(J) = 1.0E0
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFISF
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
         CALL LFISF (A, FACT, IPVT, RJ, AINV(:,J), RES)
         RJ(J) = 0.0E0
   10 CONTINUE
!                                 Print the inverse
      CALL WRRRN ('AINV', AINV)
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
END
Output
RCOND < 0.05
L1 Condition number < 40.0

             AINV
        1        2        3
1  -2.500   -2.000   -0.500
2  -2.000   -1.000    0.000
3  -0.500    0.000    0.500
LFTSF
Computes the U DU^T factorization of a real symmetric matrix.
Required Arguments
A — N by N symmetric matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT — N by N matrix containing information about the factorization of the symmetric matrix A. (Output)
Only the upper triangle of FACT is used. If A is not needed, A and FACT can share the same storage
locations.
IPVT — Vector of length N containing the pivoting information for the factorization. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTSF (A, FACT, IPVT [, …])
Specific:
The specific interface names are S_LFTSF and D_LFTSF.
FORTRAN 77 Interface
Single:
CALL LFTSF (N, A, LDA, FACT, LDFACT, IPVT)
Double:
The double precision name is DLFTSF.
Description
Routine LFTSF performs a U DU^T factorization of a real symmetric indefinite coefficient matrix. The U DU^T
factorization is called the diagonal pivoting factorization.
LFTSF fails if A is singular or very close to a singular matrix.
The U DU^T factors are returned in a form that is compatible with routines LFISF, LFSSF and LFDSF. To
solve systems of equations with multiple right-hand-side vectors, use LFTSF followed by either LFISF or
LFSSF called once for each right-hand side. The routine LFDSF can be called to compute the determinant of
the coefficient matrix after LFTSF has performed the factorization.
The underlying code is based on either LINPACK or LAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual.
Comments
Informational error
   Type   Code   Description
   4      2      The input matrix is singular.
Example
The inverse of a 3 × 3 matrix is computed. LFTSF is called to factor the matrix and to check for singularity.
LFSSF is called to determine the columns of the inverse.
USE LFTSF_INT
USE LFSSF_INT
USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N)
      REAL       A(LDA,LDA), AINV(N,N), FACT(LDA,LDA), RJ(N)
!
!                                 Set values for A
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
!                                 Factor A
      CALL LFTSF (A, FACT, IPVT)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0E0
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSSF
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
         CALL LFSSF (FACT, IPVT, RJ, AINV(:,J))
         RJ(J) = 0.0E0
   10 CONTINUE
!                                 Print the inverse
      CALL WRRRN ('AINV', AINV)
END
Output
             AINV
        1        2        3
1  -2.500   -2.000   -0.500
2  -2.000   -1.000    0.000
3  -0.500    0.000    0.500
LFSSF
Solves a real symmetric system of linear equations given the U DU^T factorization of the coefficient matrix.
Required Arguments
FACT — N by N matrix containing the factorization of the coefficient matrix A as output from routine
LFCSF/DLFCSF or LFTSF/DLFTSF. (Input)
Only the upper triangle of FACT is used.
IPVT — Vector of length N containing the pivoting information for the factorization of A as output from
routine LFCSF/DLFCSF or LFTSF/DLFTSF. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFSSF (FACT, IPVT, B, X [, …])
Specific:
The specific interface names are S_LFSSF and D_LFSSF.
FORTRAN 77 Interface
Single:
CALL LFSSF (N, FACT, LDFACT, IPVT, B, X)
Double:
The double precision name is DLFSSF.
Description
Routine LFSSF computes the solution of a system of linear algebraic equations having a real symmetric
indefinite coefficient matrix.
To compute the solution, the coefficient matrix must first undergo a U DU^T factorization. This may be done
by calling either LFCSF or LFTSF.
LFSSF and LFISF both solve a linear system given its U DU^T factorization. LFISF generally takes more time
and produces a more accurate answer than LFSSF. Each iteration of the iterative refinement algorithm used
by LFISF calls LFSSF.
The underlying code is based on either LINPACK or LAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see “Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK” in the Introduction section of this manual.
Example
A set of linear systems is solved successively. LFTSF is called to factor the coefficient matrix. LFSSF is called
to compute the four solutions for the four right-hand sides. In this case the coefficient matrix is assumed to be
well-conditioned and correctly scaled. Otherwise, it would be better to call LFCSF to perform the factorization, and LFISF to compute the solutions.
USE LFSSF_INT
USE LFTSF_INT
USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N)
      REAL       A(LDA,LDA), B(N,4), X(N,4), FACT(LDA,LDA)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
!                                 B = ( -1.0   3.6  -8.0  -9.4)
!                                     ( -3.0  -4.2  11.0  17.6)
!                                     ( -3.0  -5.2  -6.0 -23.4)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/-1.0, -3.0, -3.0, 3.6, -4.2, -5.2, -8.0, 11.0, -6.0,&
             -9.4, 17.6, -23.4/
!                                 Factor A
      CALL LFTSF (A, FACT, IPVT)
!                                 Solve for the four right-hand sides
      DO 10 I=1, 4
         CALL LFSSF (FACT, IPVT, B(:,I), X(:,I))
   10 CONTINUE
!                                 Print results
      CALL WRRRN ('X', X)
END
Output
               X
        1       2       3       4
1   10.00    2.00    1.00    0.00
2    5.00   -3.00    5.00    1.20
3   -1.00   -4.40    1.00   -7.00
LFISF
Uses iterative refinement to improve the solution of a real symmetric system of linear equations.
Required Arguments
A — N by N matrix containing the coefficient matrix of the symmetric linear system. (Input)
Only the upper triangle of A is referenced
FACT — N by N matrix containing the factorization of the coefficient matrix A as output from routine
LFCSF/DLFCSF or LFTSF/DLFTSF. (Input)
Only the upper triangle of FACT is used.
IPVT — Vector of length N containing the pivoting information for the factorization of A as output from
routine LFCSF/DLFCSF or LFTSF/DLFTSF. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
RES — Vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFISF (A, FACT, IPVT, B, X, RES [, …])
Specific:
The specific interface names are S_LFISF and D_LFISF.
FORTRAN 77 Interface
Single:
CALL LFISF (N, A, LDA, FACT, LDFACT, IPVT, B, X, RES)
Double:
The double precision name is DLFISF.
Description
Routine LFISF computes the solution of a system of linear algebraic equations having a real symmetric
indefinite coefficient matrix. Iterative refinement is performed on the solution vector to improve the accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo a U DU^T factorization. This may be done
by calling either LFCSF or LFTSF.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFISF and LFSSF both solve a linear system given its U DU^T factorization. LFISF generally takes more time
and produces a more accurate answer than LFSSF. Each iteration of the iterative refinement algorithm used
by LFISF calls LFSSF.
Comments
Informational error
   Type   Code   Description
   3      2      The input matrix is too ill-conditioned for iterative refinement to be effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving the system each of the first two times by adding 0.2 to the second element.
      USE LFISF_INT
      USE UMACH_INT
      USE LFCSF_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), B(N), X(N), FACT(LDA,LDA), RES(N), RCOND
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
!                                 B = (  4.1  -4.7   6.5)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/4.1, -4.7, 6.5/
!                                 Factor A and compute the estimate
!                                 of the reciprocal condition number
      CALL LFCSF (A, FACT, IPVT, RCOND)
!                                 Print condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Solve, then perturb right-hand side
      DO 10 I=1, 3
         CALL LFISF (A, FACT, IPVT, B, X, RES)
!                                 Print results
         CALL WRRRN ('X', X, 1, N, 1)
         CALL WRRRN ('RES', RES, 1, N, 1)
         B(2) = B(2) + .20E0
   10 CONTINUE
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.035
L1 Condition number < 40.0

              X
      1        2        3
 -4.100   -3.500    1.200

                 RES
          1           2           3
 -2.384E-07  -2.384E-07   0.000E+00

              X
      1        2        3
 -4.500   -3.700    1.200

                 RES
          1           2           3
 -2.384E-07  -2.384E-07   0.000E+00

              X
      1        2        3
 -4.900   -3.900    1.200

                 RES
          1           2           3
 -2.384E-07  -2.384E-07   0.000E+00
LFDSF
Computes the determinant of a real symmetric matrix given the U DU^T factorization of the matrix.
Required Arguments
FACT — N by N matrix containing the factored matrix A as output from subroutine LFTSF/DLFTSF or
LFCSF/DLFCSF. (Input)
IPVT — Vector of length N containing the pivoting information for the U DU^T factorization as output from
routine LFTSF/DLFTSF or LFCSF/DLFCSF. (Input)
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that, 1.0 ≤ ∣DET1∣ < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form, det(A) = DET1 * 10^DET2.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDSF (FACT, IPVT, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDSF and D_LFDSF.
FORTRAN 77 Interface
Single:
CALL LFDSF (N, FACT, LDFACT, IPVT, DET1, DET2)
Double:
The double precision name is DLFDSF.
Description
Routine LFDSF computes the determinant of a real symmetric indefinite coefficient matrix. To compute the
determinant, the coefficient matrix must first undergo a U DU^T factorization. This may be done by calling
either LFCSF or LFTSF. Since det U = ±1, the formula det A = det U det D det U^T = det D is used to compute
the determinant. Next det D is computed as the product of the determinants of its blocks.
LFDSF is based on the LINPACK routine SSIDI; see Dongarra et al. (1979).
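For reference (added here for clarity, not from the original text), when D consists of 1 × 1 blocks d_i and symmetric 2 × 2 blocks D_j, the quantity formed is

   \det D = \prod_i d_i \;\prod_j \det D_j, \qquad \det\begin{pmatrix} a & b \\ b & c \end{pmatrix} = ac - b^2,

and the result is then split into the normalized mantissa DET1 and exponent DET2.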
Example
The determinant is computed for a real symmetric 3 × 3 matrix.
USE LFDSF_INT
USE LFTSF_INT
USE UMACH_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), FACT(LDA,LDA), DET1, DET2
!
!                                 Set values for A
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
!                                 Factor A
      CALL LFTSF (A, FACT, IPVT)
!                                 Compute the determinant
      CALL LFDSF (FACT, IPVT, DET1, DET2)
!                                 Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
99999 FORMAT (' The determinant of A is ', F6.3, ' * 10**', F2.0)
      END
Output
The determinant of A is -2.000 * 10**0.
LSADH
Solves a Hermitian positive definite system of linear equations with iterative refinement.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the Hermitian positive definite linear system. (Input)
Only the upper triangle of A is referenced.
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution of the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSADH (A, B, X [, …])
Specific:
The specific interface names are S_LSADH and D_LSADH.
FORTRAN 77 Interface
Single:
CALL LSADH (N, A, LDA, B, X)
Double:
The double precision name is DLSADH.
ScaLAPACK Interface
Generic:
CALL LSADH (A0, B0, X0 [, …])
Specific:
The specific interface names are S_LSADH and D_LSADH.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSADH solves a system of linear algebraic equations having a complex Hermitian positive definite
coefficient matrix. It first uses the routine LFCDH to compute an R^H R Cholesky factorization of the coefficient
matrix and to estimate the condition number of the matrix. The matrix R is upper triangular. The solution of
the linear system is then found using the iterative refinement routine LFIDH.
LSADH fails if any submatrix of R is not positive definite, if R has a zero diagonal element or if the iterative
refinement algorithm fails to converge. These errors occur only if A either is very close to a singular matrix or
is a matrix that is not positive definite.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSADH solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
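For the complex Hermitian case it may help to state the factorization explicitly. The identity below (added here, not manual text) is the conjugate-transpose analogue of the real R^T R factorization used by the symmetric routines earlier in this chapter:

   A = R^H R, \qquad R \text{ upper triangular}, \qquad R^H = \overline{R}^{\,T}.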
Comments
1. Workspace may be explicitly provided, if desired, by use of L2ADH/DL2ADH. The reference is:
CALL L2ADH (N, A, LDA, B, X, FACT, WK)
The additional arguments are as follows:
FACT — N × N work array containing the R^H R factorization of A on output.
WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
   3      4      The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
   4      2      The input matrix is not positive definite.
   4      4      The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
3. Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2ADH the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSADH. Additional memory allocation for FACT and option value restoration are
done automatically in LSADH. Users directly calling L2ADH can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSADH or L2ADH.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSADH temporarily replaces IVAL(2) by IVAL(1). The routine L2CDH computes the condition
number if IVAL(2) = 2. Otherwise L2CDH skips this computation. LSADH restores the option.
Default values for the option are IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — Complex MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A
contains the coefficient matrix of the Hermitian positive definite linear system. (Input)
Only the upper triangle of A is referenced.
B0 — Complex local vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Complex local vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A system of five linear equations is solved. The coefficient matrix has complex positive definite form and the
right-hand-side vector b has five elements.
USE LSADH_INT
USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=5, N=5)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = ( 2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!                                     (            4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!                                     (                      10.0+0.0i   0.0+4.0i   0.0+0.0i )
!                                     (                                  6.0+0.0i   1.0+1.0i )
!                                     (                                              9.0+0.0i )
!
!                                 B = ( 1.0+5.0i  12.0-6.0i   1.0-16.0i  -3.0-3.0i  25.0+16.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
              4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
              (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B /(1.0,5.0), (12.0,-6.0), (1.0,-16.0), (-3.0,-3.0),&
              (25.0,16.0)/
!
      CALL LSADH (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
!
END
Output
                               X
            1               2               3               4
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)
            5
( 3.000, 2.000)
ScaLAPACK Example
The same system of five linear equations is solved as a distributed computing example. The coefficient
matrix has complex positive definite form and the right-hand-side vector b has five elements.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LSADH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::   A(:,:), B(:), X(:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), B0(:), X0(:)
      PARAMETER  (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
!
         B = (/(1.0, 5.0),(12.0, -6.0),(1.0, -16.0),(-3.0, -3.0),(25.0, 16.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                 Solve the system of equations
      CALL LSADH (A0, B0, X0)
!                                 Unmap the results from the distributed
!                                 arrays back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                 Print results.
!                                 Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0) CALL WRCRN ('X', X, 1, N, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
END
Output
                               X
            1               2               3               4
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)
            5
( 3.000, 2.000)
LSLDH
Solves a complex Hermitian positive definite system of linear equations without iterative refinement.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the Hermitian positive definite linear system. (Input)
Only the upper triangle of A is referenced.
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSLDH (A, B, X [, …])
Specific:
The specific interface names are S_LSLDH and D_LSLDH.
FORTRAN 77 Interface
Single:
CALL LSLDH (N, A, LDA, B, X)
Double:
The double precision name is DLSLDH.
ScaLAPACK Interface
Generic:
CALL LSLDH (A0, B0, X0 [, …])
Specific:
The specific interface names are S_LSLDH and D_LSLDH.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSLDH solves a system of linear algebraic equations having a complex Hermitian positive definite
coefficient matrix. The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code
depending upon which supporting libraries are used during linking. For a detailed explanation see “Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK” in the Introduction section of this manual. LSLDH first uses
the routine LFCDH to compute an R^H R Cholesky factorization of the coefficient matrix and to estimate the
condition number of the matrix. The matrix R is upper triangular. The solution of the linear system is then
found using the routine LFSDH.
LSLDH fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that LSADH be used.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LDH/DL2LDH. The reference is:
CALL L2LDH (N, A, LDA, B, X, FACT, WK)
The additional arguments are as follows:
FACT — N × N work array containing the R^H R factorization of A on output. If A is not needed, A can
share the same storage locations as FACT.
WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
   3      4      The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
   4      2      The input matrix is not positive definite.
   4      4      The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
3. Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2LDH the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSLDH. Additional memory allocation for FACT and option value restoration are
done automatically in LSLDH. Users directly calling L2LDH can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSLDH or L2LDH.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSLDH temporarily replaces IVAL(2) by IVAL(1). The routine L2CDH computes the condition
number if IVAL(2) = 2. Otherwise L2CDH skips this computation. LSLDH restores the option.
Default values for the option are IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — Complex MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A
contains the coefficient matrix of the Hermitian positive definite linear system. (Input)
Only the upper triangle of A is referenced.
B0 — Complex local vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Complex local vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
A system of five linear equations is solved. The coefficient matrix has complex Hermitian positive definite
form and the right-hand-side vector b has five elements.
USE LSLDH_INT
USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=5, N=5)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = ( 2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!                                     (            4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!                                     (                      10.0+0.0i   0.0+4.0i   0.0+0.0i )
!                                     (                                  6.0+0.0i   1.0+1.0i )
!                                     (                                              9.0+0.0i )
!
!                                 B = ( 1.0+5.0i  12.0-6.0i   1.0-16.0i  -3.0-3.0i  25.0+16.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
              4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
              (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B /(1.0,5.0), (12.0,-6.0), (1.0,-16.0), (-3.0,-3.0),&
(25.0,16.0)/
!
      CALL LSLDH (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
!
END
Output
                               X
            1               2               3               4
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)
            5
( 3.000, 2.000)
ScaLAPACK Example
The same system of five linear equations is solved as a distributed computing example. The coefficient
matrix has complex Hermitian positive definite form and the right-hand-side vector b has five elements.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LSLDH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE :: A(:,:), B(:), X(:)
      COMPLEX, ALLOCATABLE :: A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
!
         B = (/(1.0, 5.0),(12.0, -6.0),(1.0, -16.0),(-3.0, -3.0),(25.0, 16.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
!                                 Solve the system of equations
      CALL LSLDH (A0, B0, X0)
!                                 Unmap the results from the distributed
!                                 arrays back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                 Print results.
!                                 Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0) CALL WRCRN ('X', X, 1, N, 1)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
                                      X
              1                2                3                4                5
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)  ( 3.000, 2.000)
LFCDH
Computes the R^H R factorization of a complex Hermitian positive definite matrix and estimates its L1 condition number.
Required Arguments
A — Complex N by N Hermitian positive definite matrix to be factored. (Input) Only the upper triangle of
A is referenced.
FACT — Complex N by N matrix containing the upper triangular matrix R of the factorization of A in the
upper triangle. (Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share the same storage locations.
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCDH (A, FACT, RCOND [, …])
Specific:
The specific interface names are S_LFCDH and D_LFCDH.
FORTRAN 77 Interface
Single:
CALL LFCDH (N, A, LDA, FACT, LDFACT, RCOND)
Double:
The double precision name is DLFCDH.
ScaLAPACK Interface
Generic:
CALL LFCDH (A0, FACT0, RCOND [, …])
Specific:
The specific interface names are S_LFCDH and D_LFCDH.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFCDH computes an R^H R Cholesky factorization and estimates the condition number of a complex
Hermitian positive definite coefficient matrix. The matrix R is upper triangular.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥1∥A-1∥1. Since it is expensive to compute
∥A-1∥1, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK
and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCDH fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
The R^H R factors are returned in a form that is compatible with routines LFIDH, LFSDH and LFDDH. To solve
systems of equations with multiple right-hand-side vectors, use LFCDH followed by either LFIDH or LFSDH
called once for each right-hand side. The routine LFDDH can be called to compute the determinant of the coefficient matrix after LFCDH has performed the factorization.
The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
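The factor-once, solve-many pattern described above can be sketched as follows. This fragment is not one of the manual's examples; the 3 × 3 Hermitian positive definite matrix and the right-hand sides are made up for illustration, and only the documented generic interfaces LFCDH, LFSDH and LFDDH are used.

      USE LFCDH_INT
      USE LFSDH_INT
      USE LFDDH_INT
      INTEGER    N, NRHS, J
      PARAMETER  (N=3, NRHS=2)
      REAL       RCOND, DET1, DET2
      COMPLEX    A(N,N), FACT(N,N), B(N,NRHS), X(N,NRHS)
!                                 Small made-up Hermitian positive definite matrix
!                                 (only the upper triangle is referenced)
      DATA A /(4.0,0.0), (1.0,1.0), (0.0,0.0),&
              (1.0,-1.0), (3.0,0.0), (0.0,0.0),&
              (0.0,0.0), (0.0,0.0), (2.0,0.0)/
!                                 Two made-up right-hand sides
      DATA B /(1.0,0.0), (0.0,1.0), (2.0,0.0),&
              (0.0,0.0), (1.0,1.0), (4.0,0.0)/
!                                 Factor A once and estimate its condition number
      CALL LFCDH (A, FACT, RCOND)
!                                 Reuse the factorization for each right-hand side
      DO 10  J=1, NRHS
         CALL LFSDH (FACT, B(:,J), X(:,J))
   10 CONTINUE
!                                 The same factorization also yields the determinant
      CALL LFDDH (FACT, DET1, DET2)
      END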
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2CDH/DL2CDH. The reference is:
        CALL L2CDH (N, A, LDA, FACT, LDFACT, RCOND, WK)
        The additional argument is:
        WK — Complex work vector of length N.
2.      Informational errors
        Type  Code  Description
        3     1     The input matrix is algorithmically singular.
        3     4     The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
        4     2     The input matrix is not positive definite.
        4     4     The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — Complex MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A
contains the Hermitian positive definite matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT0 — Complex MXLDA by MXCOL local matrix containing the local portions of the distributed matrix
FACT. FACT contains the upper triangular matrix R of the factorization of A in the upper triangle.
(Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse of a 5 × 5 Hermitian positive definite matrix is computed. LFCDH is called to factor the matrix
and to check for nonpositive definiteness or ill-conditioning. LFIDH is called to determine the columns of the
inverse.
      USE LFCDH_INT
      USE LFIDH_INT
      USE UMACH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NOUT
      PARAMETER  (LDA=5, LDFACT=5, N=5)
      REAL       RCOND
      COMPLEX    A(LDA,LDA), AINV(LDA,LDA), FACT(LDFACT,LDFACT),&
                 RES(N), RJ(N)
!
!                                 Set values for A
!
!                                 A = (  2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!                                     (             4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!                                     (                       10.0+0.0i   0.0+4.0i   0.0+0.0i )
!                                     (                                   6.0+0.0i   1.0+1.0i )
!                                     (                                              9.0+0.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
          4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
          (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
!                                 Factor the matrix A
      CALL LFCDH (A, FACT, RCOND)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0, 0.0E0)
      DO 10  J=1, N
         RJ(J) = (1.0E0,0.0E0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIDH
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
         CALL LFIDH (A, FACT, RJ, AINV(:,J), RES)
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
!                                 Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
      CALL WRCRN ('AINV', AINV)
!
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.075
L1 Condition number < 25.0
                                        AINV
                    1                  2                  3                  4                  5
1   ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2   ( 0.2166, 0.2166)  ( 0.4332, 0.0000)  (-0.0599,-0.1198)  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3   (-0.0899, 0.0300)  (-0.0599, 0.1198)  ( 0.1797, 0.0000)  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4   (-0.0207,-0.0622)  (-0.0829,-0.0415)  ( 0.0000, 0.1244)  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5   ( 0.0092, 0.0046)  ( 0.0138,-0.0046)  (-0.0138,-0.0138)  (-0.0288, 0.0288)  ( 0.1175, 0.0000)
ScaLAPACK Example
The inverse of the same 5 × 5 Hermitian positive definite matrix in the preceding example is computed as a
distributed computing example. LFCDH is called to factor the matrix and to check for nonpositive definiteness or ill-conditioning. LFIDH is called to determine the columns of the inverse. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap arrays to and from the
processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the
descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFCDH_INT
      USE LFIDH_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    J, LDA, N, NOUT, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL       RCOND
      COMPLEX, ALLOCATABLE :: A(:,:), AINV(:,:), RJ(:), RJ0(:)
      COMPLEX, ALLOCATABLE :: A0(:,:), FACT0(:,:), RES0(:), X0(:)
      PARAMETER (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA), RES0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Factor the matrix A
      CALL LFCDH (A0, FACT0, RCOND)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0, 0.0E0)
      DO 10  J=1, N
         RJ(J) = (1.0E0,0.0E0)
         CALL SCALAPACK_MAP(RJ, DESCX, RJ0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIDH
!                                 reference solves for the J-th column of
!                                 the inverse of A
         CALL LFIDH (A0, FACT0, RJ0, X0, RES0)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array
         CALL SCALAPACK_UNMAP(X0, DESCX, AINV(:,J))
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
!                                 Print the results.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
IF(MP_RANK .EQ. 0) THEN
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
CALL WRCRN (’AINV’, AINV)
ENDIF
IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
DEALLOCATE(A0, FACT0, RJ, RJ0, RES0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
END
Output
RCOND < 0.075
L1 Condition number < 25.0
                                        AINV
                    1                  2                  3                  4                  5
1   ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2   ( 0.2166, 0.2166)  ( 0.4332, 0.0000)  (-0.0599,-0.1198)  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3   (-0.0899, 0.0300)  (-0.0599, 0.1198)  ( 0.1797, 0.0000)  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4   (-0.0207,-0.0622)  (-0.0829,-0.0415)  ( 0.0000, 0.1244)  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5   ( 0.0092, 0.0046)  ( 0.0138,-0.0046)  (-0.0138,-0.0138)  (-0.0288, 0.0288)  ( 0.1175, 0.0000)
LFTDH
Computes the R^H R factorization of a complex Hermitian positive definite matrix.
Required Arguments
A — Complex N by N Hermitian positive definite matrix to be factored. (Input) Only the upper triangle of
A is referenced.
FACT — Complex N by N matrix containing the upper triangular matrix R of the factorization of A in the
upper triangle. (Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share the same storage locations.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTDH (A, FACT [, …])
Specific:
The specific interface names are S_LFTDH and D_LFTDH.
FORTRAN 77 Interface
Single:
CALL LFTDH (N, A, LDA, FACT, LDFACT)
Double:
The double precision name is DLFTDH.
ScaLAPACK Interface
Generic:
CALL LFTDH (A0, FACT0 [, …])
Specific:
The specific interface names are S_LFTDH and D_LFTDH.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFTDH computes an R^H R Cholesky factorization of a complex Hermitian positive definite coefficient
matrix. The matrix R is upper triangular.
LFTDH fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
The R^H R factors are returned in a form that is compatible with routines LFIDH, LFSDH and LFDDH. To solve
systems of equations with multiple right-hand-side vectors, use LFCDH followed by either LFIDH or LFSDH
called once for each right-hand side. The IMSL routine LFDDH can be called to compute the determinant of
the coefficient matrix after LFCDH has performed the factorization.
The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
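A minimal sketch of the pattern described above when no condition estimate is wanted (the 3 × 3 matrix and right-hand side are made up for illustration; when conditioning is a concern, LFCDH should be used instead):

      USE LFTDH_INT
      USE LFSDH_INT
      INTEGER    N
      PARAMETER  (N=3)
      COMPLEX    A(N,N), FACT(N,N), B(N), X(N)
!                                 Made-up Hermitian positive definite data
      DATA A /(4.0,0.0), (1.0,1.0), (0.0,0.0),&
              (1.0,-1.0), (3.0,0.0), (0.0,0.0),&
              (0.0,0.0), (0.0,0.0), (2.0,0.0)/
      DATA B /(1.0,0.0), (0.0,1.0), (2.0,0.0)/
!                                 Factor A (no condition number estimate)
      CALL LFTDH (A, FACT)
!                                 Solve A*x = b using the R^H R factors
      CALL LFSDH (FACT, B, X)
      END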
Comments
Informational errors
Type  Code  Description
3     4     The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
4     2     The input matrix is not positive definite.
4     4     The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — Complex MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A
contains the Hermitian positive definite matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT0 — Complex MXLDA by MXCOL local matrix containing the local portions of the distributed matrix
FACT. FACT contains the upper triangular matrix R of the factorization of A in the upper triangle.
(Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example
The inverse of a 5 × 5 matrix is computed. LFTDH is called to factor the matrix and to check for nonpositive
definiteness. LFSDH is called to determine the columns of the inverse.
USE LFTDH_INT
USE LFSDH_INT
USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=5, LDFACT=5, N=5)
      COMPLEX    A(LDA,LDA), AINV(LDA,LDA), FACT(LDFACT,LDFACT), RJ(N)
!
!                                 Set values for A
!
!                                 A = (  2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!                                     (             4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!                                     (                       10.0+0.0i   0.0+4.0i   0.0+0.0i )
!                                     (                                   6.0+0.0i   1.0+1.0i )
!                                     (                                              9.0+0.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
          4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
          (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
!                                 Factor the matrix A
      CALL LFTDH (A, FACT)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0,0.0E0)
      DO 10  J=1, N
         RJ(J) = (1.0E0,0.0E0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSDH
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
         CALL LFSDH (FACT, RJ, AINV(:,J))
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
!                                 Print the results
      CALL WRCRN ('AINV', AINV, ITRING=1)
!
      END
Output
                                        AINV
                    1                  2                  3                  4                  5
1   ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2                      ( 0.4332, 0.0000)  (-0.0599,-0.1198)  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3                                         ( 0.1797, 0.0000)  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4                                                            ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5                                                                               ( 0.1175, 0.0000)
ScaLAPACK Example
The inverse of the same 5 × 5 Hermitian positive definite matrix in the preceding example is computed as a
distributed computing example. LFTDH is called to factor the matrix and to check for nonpositive definiteness. LFSDH is called to determine the columns of the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are
IMSL utility routines (see Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
USE MPI_SETUP_INT
USE LFTDH_INT
USE LFSDH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE :: A(:,:), AINV(:,:), RJ(:), RJ0(:)
      COMPLEX, ALLOCATABLE :: A0(:,:), FACT0(:,:), X0(:)
      PARAMETER (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Factor the matrix A
      CALL LFTDH (A0, FACT0)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0, 0.0E0)
      DO 10  J=1, N
         RJ(J) = (1.0E0,0.0E0)
         CALL SCALAPACK_MAP(RJ, DESCX, RJ0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSDH
!                                 reference solves for the J-th column of
!                                 the inverse of A
         CALL LFSDH (FACT0, RJ0, X0)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array
         CALL SCALAPACK_UNMAP(X0, DESCX, AINV(:,J))
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
!                                 Print the results.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      IF(MP_RANK .EQ. 0) CALL WRCRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, FACT0, RJ, RJ0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
                                        AINV
                    1                  2                  3                  4                  5
1   ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2   ( 0.2166, 0.2166)  ( 0.4332, 0.0000)  (-0.0599,-0.1198)  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3   (-0.0899, 0.0300)  (-0.0599, 0.1198)  ( 0.1797, 0.0000)  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4   (-0.0207,-0.0622)  (-0.0829,-0.0415)  ( 0.0000, 0.1244)  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5   ( 0.0092, 0.0046)  ( 0.0138,-0.0046)  (-0.0138,-0.0138)  (-0.0288, 0.0288)  ( 0.1175, 0.0000)
LFSDH
Solves a complex Hermitian positive definite system of linear equations given the R^H R factorization of the coefficient matrix.
Required Arguments
FACT — Complex N by N matrix containing the factorization of the coefficient matrix A as output from routine LFCDH/DLFCDH or LFTDH/DLFTDH. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFSDH (FACT, B, X [, …])
Specific:
The specific interface names are S_LFSDH and D_LFSDH.
FORTRAN 77 Interface
Single:
CALL LFSDH (N, FACT, LDFACT, B, X)
Double:
The double precision name is DLFSDH.
ScaLAPACK Interface
Generic:
CALL LFSDH (FACT0, B0, X0 [, …])
Specific:
The specific interface names are S_LFSDH and D_LFSDH.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFSDH computes the solution for a system of linear algebraic equations having a complex Hermitian
positive definite coefficient matrix. To compute the solution, the coefficient matrix must first undergo an R^H R
factorization. This may be done by calling either LFCDH or LFTDH. R is an upper triangular matrix.
The solution to Ax = b is found by solving the triangular systems R^H y = b and Rx = y.
LFSDH and LFIDH both solve a linear system given its R^H R factorization. LFIDH generally takes more time
and produces a more accurate answer than LFSDH. Each iteration of the iterative refinement algorithm used
by LFIDH calls LFSDH.
The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
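The two triangular solves mentioned above can also be sketched directly. The routine below (HPDSOL, an invented name, not library code) is only a didactic illustration of the forward and back substitution that LFSDH performs, given the upper triangular factor R with A = R^H R.

!                                 Didactic sketch: solve R^H y = b, then R x = y.
      SUBROUTINE HPDSOL (N, R, LDR, B, X)
      INTEGER    N, LDR, I, K
      COMPLEX    R(LDR,*), B(*), X(*), S
!                                 Forward substitution with R^H (y overwrites X)
      DO 20  I=1, N
         S = B(I)
         DO 10  K=1, I-1
            S = S - CONJG(R(K,I))*X(K)
   10    CONTINUE
         X(I) = S/CONJG(R(I,I))
   20 CONTINUE
!                                 Back substitution with R
      DO 40  I=N, 1, -1
         S = X(I)
         DO 30  K=I+1, N
            S = S - R(I,K)*X(K)
   30    CONTINUE
         X(I) = S/R(I,I)
   40 CONTINUE
      END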
Comments
Informational error
Type  Code  Description
4     1     The input matrix is singular.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
FACT0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix
FACT as output from routine LFCDH/DLFCDH or LFTDH/DLFTDH. FACT contains the factorization of
the matrix A. (Input)
B0 — Complex local vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Complex local vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (Utilities) has been made. See the ScaLAPACK Example below.
Examples
Example
A set of linear systems is solved successively. LFTDH is called to factor the coefficient matrix. LFSDH is called
to compute the three solutions for the three right-hand sides. In this case, the coefficient matrix is assumed to
be well-conditioned and correctly scaled. Otherwise, it would be better to call LFCDH to perform the factorization, and LFIDH to compute the solutions.
USE LFSDH_INT
USE LFTDH_INT
USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=5, LDFACT=5, N=5)
      COMPLEX    A(LDA,LDA), B(N,3), FACT(LDFACT,LDFACT), X(N,3)
!
!                                 Set values for A and B
!
!                                 A = (  2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!                                     (             4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!                                     (                       10.0+0.0i   0.0+4.0i   0.0+0.0i )
!                                     (                                   6.0+0.0i   1.0+1.0i )
!                                     (                                              9.0+0.0i )
!
!                                 B = (   3.0+3.0i    4.0+0.0i    29.0-9.0i  )
!                                     (   5.0-5.0i   15.0-10.0i  -36.0-17.0i )
!                                     (   5.0+4.0i  -12.0-56.0i  -15.0-24.0i )
!                                     (   9.0+7.0i  -12.0+10.0i  -23.0-15.0i )
!                                     ( -22.0+1.0i    3.0-1.0i   -23.0-28.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
          4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
          (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B /(3.0,3.0), (5.0,-5.0), (5.0,4.0), (9.0,7.0), (-22.0,1.0),&
          (4.0,0.0), (15.0,-10.0), (-12.0,-56.0), (-12.0,10.0),&
          (3.0,-1.0), (29.0,-9.0), (-36.0,-17.0), (-15.0,-24.0),&
          (-23.0,-15.0), (-23.0,-28.0)/
!                                 Factor the matrix A
      CALL LFTDH (A, FACT)
!                                 Compute the solutions
      DO 10  I=1, 3
         CALL LFSDH (FACT, B(:,I), X(:,I))
   10 CONTINUE
!                                 Print solutions
      CALL WRCRN ('X', X)
!
      END
Output
                          X
                  1                2                3
1   (  1.00,  0.00)  (  3.00, -1.00)  ( 11.00, -1.00)
2   (  1.00, -2.00)  (  2.00,  0.00)  ( -7.00,  0.00)
3   (  2.00,  0.00)  ( -1.00, -6.00)  ( -2.00, -3.00)
4   (  2.00,  3.00)  (  2.00,  1.00)  ( -2.00, -3.00)
5   ( -3.00,  0.00)  (  0.00,  0.00)  ( -2.00, -3.00)
ScaLAPACK Example
The same set of linear systems as in the preceding example is solved successively as a distributed computing example. LFTDH is called to factor the matrix. LFSDH is called to compute the three solutions for the three
right-hand sides. In this case, the coefficient matrix is assumed to be well-conditioned and correctly scaled.
Otherwise, it would be better to call LFCDH to perform the factorization, and LFIDH to compute the
solutions.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LFTDH_INT
USE LFSDH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE :: A(:,:), B(:,:), B0(:), X(:,:)
      COMPLEX, ALLOCATABLE :: A0(:,:), FACT0(:,:), X0(:)
      PARAMETER (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(LDA,3), X(LDA,3))
!                                 Set values for A and B
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
!
         B(1,:) = (/(  3.0,  3.0), (  4.0,  0.0), ( 29.0, -9.0)/)
         B(2,:) = (/(  5.0, -5.0), ( 15.0,-10.0), (-36.0,-17.0)/)
         B(3,:) = (/(  5.0,  4.0), (-12.0,-56.0), (-15.0,-24.0)/)
         B(4,:) = (/(  9.0,  7.0), (-12.0, 10.0), (-23.0,-15.0)/)
         B(5,:) = (/(-22.0,  1.0), (  3.0, -1.0), (-23.0,-28.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), &
               B0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Factor the matrix A
      CALL LFTDH (A0, FACT0)
!                                 Compute the solutions
      DO 10  J=1, 3
         CALL SCALAPACK_MAP(B(:,J), DESCX, B0)
         CALL LFSDH (FACT0, B0, X0)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array
         CALL SCALAPACK_UNMAP(X0, DESCX, X(:,J))
   10 CONTINUE
!                                 Print the results.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      IF(MP_RANK .EQ. 0) CALL WRCRN ('X', X)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, FACT0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
                          X
                  1                2                3
1   (  1.00,  0.00)  (  3.00, -1.00)  ( 11.00, -1.00)
2   (  1.00, -2.00)  (  2.00,  0.00)  ( -7.00,  0.00)
3   (  2.00,  0.00)  ( -1.00, -6.00)  ( -2.00, -3.00)
4   (  2.00,  3.00)  (  2.00,  1.00)  ( -2.00, -3.00)
5   ( -3.00,  0.00)  (  0.00,  0.00)  ( -2.00, -3.00)
LFIDH
Uses iterative refinement to improve the solution of a complex Hermitian positive definite system of linear equations.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the linear system. (Input)
Only the upper triangle of A is referenced.
FACT — Complex N by N matrix containing the factorization of the coefficient matrix A as output from routine LFCDH/DLFCDH or LFTDH/DLFTDH. (Input)
Only the upper triangle of FACT is used.
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution. (Output)
RES — Complex vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFIDH (A, FACT, B, X, RES [, …])
Specific:
The specific interface names are S_LFIDH and D_LFIDH.
FORTRAN 77 Interface
Single:
CALL LFIDH (N, A, LDA, FACT, LDFACT, B, X, RES)
Double:
The double precision name is DLFIDH.
ScaLAPACK Interface
Generic:
CALL LFIDH (A0, FACT0, B0, X0, RES0 [, …])
Specific:
The specific interface names are S_LFIDH and D_LFIDH.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LFIDH computes the solution of a system of linear algebraic equations having a complex Hermitian
positive definite coefficient matrix. Iterative refinement is performed on the solution vector to improve the
accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an R^H R factorization. This may be done by
calling either LFCDH or LFTDH.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIDH and LFSDH both solve a linear system given its R^H R factorization. LFIDH generally takes more time
and produces a more accurate answer than LFSDH. Each iteration of the iterative refinement algorithm used
by LFIDH calls LFSDH.
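One sweep of the kind of refinement described above might be sketched as follows. This is not the library algorithm, only an illustration: REFINE1 is an invented name, the full Hermitian matrix is assumed to be stored in A (the library routine references only its upper triangle), and the correction is obtained with the documented routine LFSDH.

!                                 Didactic sketch of one refinement sweep,
!                                 assuming FACT holds the R^H R factors of A.
      SUBROUTINE REFINE1 (N, A, LDA, FACT, B, X, RES)
      USE LFSDH_INT
      INTEGER    N, LDA
      COMPLEX    A(LDA,N), FACT(LDA,N), B(N), X(N), RES(N), DX(N)
!                                 Residual of the current solution
      RES = B - MATMUL(A(1:N,1:N), X)
!                                 Correction from one pair of triangular solves
      CALL LFSDH (FACT, RES, DX)
!                                 Updated solution
      X = X + DX
      END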
Comments
Informational error
Type  Code  Description
3     3     The input matrix is too ill-conditioned for iterative refinement to be effective.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix A. A
contains the coefficient matrix of the linear system. (Input)
Only the upper triangle of A is referenced.
FACT0 — MXLDA by MXCOL complex local matrix containing the local portions of the distributed matrix
FACT as output from routine LFCDH or LFTDH. FACT contains the factorization of the matrix A. (Input)
Only the upper triangle of FACT is referenced.
B0 — Complex local vector of length MXLDA containing the local portions of the distributed vector B. B
contains the right-hand side of the linear system. (Input)
X0 — Complex local vector of length MXLDA containing the local portions of the distributed vector X. X
contains the solution to the linear system. (Output)
RES0 — Complex local vector of length MXLDA containing the local portions of the distributed vector
RES. RES contains the residual vector at the improved solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(Utilities) after a call to SCALAPACK_SETUP
(Chapter 11, ”Utilities”) has been made. See the ScaLAPACK Example below.
Examples
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed by adding (1 + i)/2 to
the second element after each call to LFIDH.
      USE LFIDH_INT
      USE LFCDH_INT
      USE UMACH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=5, LDFACT=5, N=5)
      REAL       RCOND
      COMPLEX    A(LDA,LDA), B(N), FACT(LDFACT,LDFACT), RES(N,3), X(N,3)
!
!                                 Set values for A and B
!
!                                 A = (  2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!                                     (             4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!                                     (                       10.0+0.0i   0.0+4.0i   0.0+0.0i )
!                                     (                                   6.0+0.0i   1.0+1.0i )
!                                     (                                              9.0+0.0i )
!
!                                 B = (  3.0+3.0i   5.0-5.0i   5.0+4.0i   9.0+7.0i  -22.0+1.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
          4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
          (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B /(3.0,3.0), (5.0,-5.0), (5.0,4.0), (9.0,7.0), (-22.0,1.0)/
!                                 Factor the matrix A
      CALL LFCDH (A, FACT, RCOND)
!                                 Print the estimated condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Compute the solutions, then perturb B
      DO 10  I=1, 3
         CALL LFIDH (A, FACT, B, X(:,I), RES(:,I))
         B(2) = B(2) + (0.5E0,0.5E0)
   10 CONTINUE
!                                 Print solutions and residuals
      CALL WRCRN ('X', X)
      CALL WRCRN ('RES', RES)
!
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.07
L1 Condition number < 25.0
                           X
                  1                 2                 3
1   ( 1.000, 0.000)   ( 1.217, 0.000)   ( 1.433, 0.000)
2   ( 1.000,-2.000)   ( 1.217,-1.783)   ( 1.433,-1.567)
3   ( 2.000, 0.000)   ( 1.910, 0.030)   ( 1.820, 0.060)
4   ( 2.000, 3.000)   ( 1.979, 2.938)   ( 1.959, 2.876)
5   (-3.000, 0.000)   (-2.991, 0.005)   (-2.982, 0.009)

                                     RES
                          1                         2                         3
1   ( 1.192E-07, 0.000E+00)   ( 6.592E-08, 1.686E-07)   ( 1.318E-07, 2.010E-14)
2   ( 1.192E-07,-2.384E-07)   (-5.329E-08,-5.329E-08)   ( 1.318E-07,-2.258E-07)
3   ( 2.384E-07, 8.259E-08)   ( 2.390E-07,-3.309E-08)   ( 2.395E-07, 1.015E-07)
4   (-2.384E-07, 2.814E-14)   (-8.240E-08,-8.790E-09)   (-1.648E-07,-1.758E-08)
5   (-2.384E-07,-1.401E-08)   (-2.813E-07, 6.981E-09)   (-3.241E-07,-2.795E-08)
ScaLAPACK Example
As in the preceding example, a set of linear systems is solved successively as a distributed computing example. The right-hand-side vector is perturbed by adding (1 + i)/2 to the second element after each call to
LFIDH. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and
unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LFCDH_INT
USE LFIDH_INT
USE UMACH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    J, LDA, N, NOUT, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL       RCOND
      COMPLEX, ALLOCATABLE :: A(:,:), B(:), B0(:), RES(:,:), X(:,:)
      COMPLEX, ALLOCATABLE :: A0(:,:), FACT0(:,:), X0(:), RES0(:)
      PARAMETER (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), RES(N,3), X(N,3))
!                                 Set values for A and B
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
!
         B = (/(3.0, 3.0),( 5.0,-5.0),( 5.0, 4.0),(9.0, 7.0),(-22.0,1.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), &
               B0(MXLDA), RES0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Factor the matrix A
      CALL LFCDH (A0, FACT0, RCOND)
!                                 Print the estimated condition number
      IF(MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
      ENDIF
!                                 Compute the solutions
      DO 10  J=1, 3
         CALL SCALAPACK_MAP(B, DESCX, B0)
         CALL LFIDH (A0, FACT0, B0, X0, RES0)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array
         CALL SCALAPACK_UNMAP(X0, DESCX, X(:,J))
         CALL SCALAPACK_UNMAP(RES0, DESCX, RES(:,J))
         IF(MP_RANK .EQ. 0) B(2) = B(2) + (0.5E0, 0.5E0)
   10 CONTINUE
!                                 Print the results.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      IF(MP_RANK .EQ. 0) THEN
         CALL WRCRN ('X', X)
         CALL WRCRN ('RES', RES)
      ENDIF
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, RES, X)
      DEALLOCATE(A0, B0, FACT0, RES0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.07
L1 Condition number < 25.0
                           X
                  1                 2                 3
1   ( 1.000, 0.000)   ( 1.217, 0.000)   ( 1.433, 0.000)
2   ( 1.000,-2.000)   ( 1.217,-1.783)   ( 1.433,-1.567)
3   ( 2.000, 0.000)   ( 1.910, 0.030)   ( 1.820, 0.060)
4   ( 2.000, 3.000)   ( 1.979, 2.938)   ( 1.959, 2.876)
5   (-3.000, 0.000)   (-2.991, 0.005)   (-2.982, 0.009)

                                     RES
                          1                         2                         3
1   ( 1.192E-07, 0.000E+00)   ( 6.592E-08, 1.686E-07)   ( 1.318E-07, 2.010E-14)
2   ( 1.192E-07,-2.384E-07)   (-5.329E-08,-5.329E-08)   ( 1.318E-07,-2.258E-07)
3   ( 2.384E-07, 8.259E-08)   ( 2.390E-07,-3.309E-08)   ( 2.395E-07, 1.015E-07)
4   (-2.384E-07, 2.814E-14)   (-8.240E-08,-8.790E-09)   (-1.648E-07,-1.758E-08)
5   (-2.384E-07,-1.401E-08)   (-2.813E-07, 6.981E-09)   (-3.241E-07,-2.795E-08)
LFDDH
Computes the determinant of a complex Hermitian positive definite matrix given the R^H R Cholesky factorization of the matrix.
Required Arguments
FACT — Complex N by N matrix containing the R^H R factorization of the coefficient matrix A as output from
routine LFCDH/DLFCDH or LFTDH/DLFTDH. (Input)
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ ∣DET1∣ < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDDH (FACT, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDDH and D_LFDDH.
FORTRAN 77 Interface
Single:
CALL LFDDH (N, FACT, LDFACT, DET1, DET2)
Double:
The double precision name is DLFDDH.
Description
Routine LFDDH computes the determinant of a complex Hermitian positive definite coefficient matrix. To
compute the determinant, the coefficient matrix must first undergo an R^H R factorization. This may be done
by calling either LFCDH or LFTDH. The formula det A = det R^H det R = (det R)^2 is used to compute the
determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,

      det R = ∏_{i=1}^{N} R_ii

(The matrix R is stored in the upper triangle of FACT.)
LFDDH is based on the LINPACK routine CPODI; see Dongarra et al. (1979).
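A short sketch, based on the data of the example below, showing how the mantissa/exponent pair returned by LFDDH can be recombined when the value will not overflow:

      USE LFTDH_INT
      USE LFDDH_INT
      INTEGER    N
      PARAMETER  (N=3)
      REAL       DET1, DET2, DETA
      COMPLEX    A(N,N), FACT(N,N)
      DATA A /(6.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (7.0,0.0),&
              (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (11.0,0.0)/
!                                 Factor, then get the scaled determinant
      CALL LFTDH (A, FACT)
      CALL LFDDH (FACT, DET1, DET2)
!                                 det(A) = DET1 * 10**DET2 (= 1.4 * 10**2 here)
      DETA = DET1 * 10.0**DET2
      END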
Example
The determinant is computed for a complex Hermitian positive definite 3 × 3 matrix.
USE LFDDH_INT
USE LFTDH_INT
USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, NOUT
      PARAMETER  (LDA=3, LDFACT=3)
      REAL       DET1, DET2
      COMPLEX    A(LDA,LDA), FACT(LDFACT,LDFACT)
!
!                                 Set values for A
!
!                                 A = (  6.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   7.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  11.0+0.0i )
!
      DATA A /(6.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (7.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (11.0,0.0)/
!                                 Factor the matrix
      CALL LFTDH (A, FACT)
!                                 Compute the determinant
      CALL LFDDH (FACT, DET1, DET2)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
      END
Output
The determinant of A is  1.400 * 10**2.
LSAHF
Solves a complex Hermitian system of linear equations with iterative refinement.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the Hermitian linear system. (Input)
Only the upper triangle of A is referenced.
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSAHF (A, B, X [, …])
Specific:
The specific interface names are S_LSAHF and D_LSAHF.
FORTRAN 77 Interface
Single:
CALL LSAHF (N, A, LDA, B, X)
Double:
The double precision name is DLSAHF.
Description
Routine LSAHF solves systems of linear algebraic equations having a complex Hermitian indefinite coefficient matrix. It first uses the routine LFCHF to compute a U DU^H factorization of the coefficient matrix and to
estimate the condition number of the matrix. D is a block diagonal matrix with blocks of order 1 or 2 and U is
a matrix composed of the product of a permutation matrix and a unit upper triangular matrix. The solution
of the linear system is then found using the iterative refinement routine LFIHF.
LSAHF fails if a block in D is singular or if the iterative refinement algorithm fails to converge. These errors
occur only if A is singular or very close to a singular matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSAHF solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
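A roughly equivalent explicit sequence, sketched with the same data as the example below, factors with LFCHF and then calls LFIHF, which is essentially what LSAHF does internally; the two documented routine interfaces are the only calls assumed here.

      USE LFCHF_INT
      USE LFIHF_INT
      INTEGER    N
      PARAMETER  (N=3)
      INTEGER    IPVT(N)
      REAL       RCOND
      COMPLEX    A(N,N), FACT(N,N), B(N), X(N), RES(N)
      DATA A /(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
              (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
      DATA B /(7.0,32.0), (-39.0,-21.0), (51.0,9.0)/
!                                 Factor A (diagonal pivoting) and estimate
!                                 its reciprocal condition number
      CALL LFCHF (A, FACT, IPVT, RCOND)
!                                 Solve A*x = b with iterative refinement
      CALL LFIHF (A, FACT, IPVT, B, X, RES)
      END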
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2AHF/DL2AHF. The reference is:
        CALL L2AHF (N, A, LDA, B, X, FACT, IPVT, CWK)
        The additional arguments are as follows:
        FACT — Complex work vector of length N² containing information about the U DU^H factorization of A on output.
        IPVT — Integer work vector of length N containing the pivoting information for the factorization of A on output.
        CWK — Complex work vector of length N.
2.      Informational errors
        Type  Code  Description
        3     1     The input matrix is algorithmically singular.
        3     4     The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
        4     2     The input matrix is singular.
        4     4     The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
3.      Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2AHF the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSAHF. Additional memory allocation for FACT and option value restoration are
done automatically in LSAHF. Users directly calling L2AHF can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSAHF or L2AHF.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSAHF temporarily replaces IVAL(2) by IVAL(1). The routine L2CHF computes the condition
number if IVAL(2) = 2. Otherwise L2CHF skips this computation. LSAHF restores the option.
Default values for the option are IVAL(*) = 1, 2.
Example
A system of three linear equations is solved. The coefficient matrix has complex Hermitian form and the
right-hand-side vector b has three elements.
USE LSAHF_INT
USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
!                                 B = (  7.0+32.0i  -39.0-21.0i  51.0+9.0i )
!
      DATA A/(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
      DATA B/(7.0,32.0), (-39.0,-21.0), (51.0,9.0)/
!
      CALL LSAHF (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END
Output
                      X
              1                 2                 3
(  2.00,  1.00)   (-10.00, -1.00)   (  3.00,  5.00)
LSLHF
Solves a complex Hermitian system of linear equations without iterative refinement.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the Hermitian linear system. (Input)
Only the upper triangle of A is referenced.
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSLHF (A, B, X [, …])
Specific:
The specific interface names are S_LSLHF and D_LSLHF.
FORTRAN 77 Interface
Single:
CALL LSLHF (N, A, LDA, B, X)
Double:
The double precision name is DLSLHF.
Description
Routine LSLHF solves systems of linear algebraic equations having a complex Hermitian indefinite coefficient matrix. It first uses the routine LFCHF to compute a U DU^H factorization of the coefficient matrix. D is a
block diagonal matrix with blocks of order 1 or 2 and U is a matrix composed of the product of a permutation
matrix and a unit upper triangular matrix.
The solution of the linear system is then found using the routine LFSHF. LSLHF fails if a block in D is singular. This occurs only if A is singular or very close to a singular matrix. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that LSAHF be used.
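The corresponding explicit sequence without refinement, again sketched with the data of the example below, is a factorization by LFCHF followed by a single call to LFSHF (both documented later in this chapter):

      USE LFCHF_INT
      USE LFSHF_INT
      INTEGER    N
      PARAMETER  (N=3)
      INTEGER    IPVT(N)
      REAL       RCOND
      COMPLEX    A(N,N), FACT(N,N), B(N), X(N)
      DATA A /(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
              (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
      DATA B /(7.0,32.0), (-39.0,-21.0), (51.0,9.0)/
!                                 Factor A using diagonal pivoting
      CALL LFCHF (A, FACT, IPVT, RCOND)
!                                 Solve A*x = b without refinement
      CALL LFSHF (FACT, IPVT, B, X)
      END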
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2LHF/DL2LHF. The reference is:
        CALL L2LHF (N, A, LDA, B, X, FACT, IPVT, CWK)
        The additional arguments are as follows:
        FACT — Complex work vector of length N² containing information about the U DU^H factorization of A on output.
        IPVT — Integer work vector of length N containing the pivoting information for the factorization of A on output.
        CWK — Complex work vector of length N.
2.      Informational errors
        Type  Code  Description
        3     1     The input matrix is algorithmically singular.
        3     4     The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
        4     2     The input matrix is singular.
        4     4     The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
3.      Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2LHF the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSLHF. Additional memory allocation for FACT and option value restoration are
done automatically in LSLHF. Users directly calling L2LHF can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSLHF or L2LHF.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSLHF temporarily replaces IVAL(2) by IVAL(1). The routine L2CHF computes the condition
number if IVAL(2) = 2. Otherwise L2CHF skips this computation. LSLHF restores the option.
Default values for the option are IVAL(*) = 1, 2.
Example
A system of three linear equations is solved. The coefficient matrix has complex Hermitian form and the
right-hand-side vector b has three elements.
USE LSLHF_INT
USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
!                                 B = (  7.0+32.0i  -39.0-21.0i  51.0+9.0i )
!
      DATA A/(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
      DATA B/(7.0,32.0), (-39.0,-21.0), (51.0,9.0)/
!
      CALL LSLHF (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END
Output
                      X
              1                 2                 3
(  2.00,  1.00)   (-10.00, -1.00)   (  3.00,  5.00)
LFCHF
Computes the U DU^H factorization of a complex Hermitian matrix and estimates its L1 condition number.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the Hermitian linear system. (Input)
Only the upper triangle of A is referenced.
FACT — Complex N by N matrix containing the information about the factorization of the Hermitian
matrix A. (Output)
Only the upper triangle of FACT is used. If A is not needed, A and FACT can share the same storage
locations.
IPVT — Vector of length N containing the pivoting information for the factorization. (Output)
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCHF (A, FACT, IPVT, RCOND [, …])
Specific:
The specific interface names are S_LFCHF and D_LFCHF.
FORTRAN 77 Interface
Single:
CALL LFCHF (N, A, LDA, FACT, LDFACT, IPVT, RCOND)
Double:
The double precision name is DLFCHF.
Description
Routine LFCHF performs a U DU^H factorization of a complex Hermitian indefinite coefficient matrix. It also
estimates the condition number of the matrix. The U DU^H factorization is called the diagonal pivoting
factorization.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥1∥A-1∥1. Since it is expensive to compute
∥A-1∥1, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK
and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCHF fails if A is singular or very close to a singular matrix.
The U DU^H factors are returned in a form that is compatible with routines LFIHF, LFSHF and LFDHF. To
solve systems of equations with multiple right-hand-side vectors, use LFCHF followed by either LFIHF or
LFSHF called once for each right-hand side. The routine LFDHF can be called to compute the determinant of
the coefficient matrix after LFCHF has performed the factorization.
The underlying code is based on either LINPACK or LAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual.
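A sketch of the factor-once, solve-many usage described above; the matrix and the first right-hand side are those of the LSAHF example earlier in this chapter, and the second right-hand side is made up for illustration.

      USE LFCHF_INT
      USE LFSHF_INT
      INTEGER    N, NRHS, J
      PARAMETER  (N=3, NRHS=2)
      INTEGER    IPVT(N)
      REAL       RCOND
      COMPLEX    A(N,N), FACT(N,N), B(N,NRHS), X(N,NRHS)
      DATA A /(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
              (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
      DATA B /(7.0,32.0), (-39.0,-21.0), (51.0,9.0),&
              (1.0,0.0), (0.0,1.0), (1.0,1.0)/
!                                 Factor once
      CALL LFCHF (A, FACT, IPVT, RCOND)
!                                 Solve for each right-hand side
      DO 10  J=1, NRHS
         CALL LFSHF (FACT, IPVT, B(:,J), X(:,J))
   10 CONTINUE
      END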
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2CHF/DL2CHF. The reference is:
        CALL L2CHF (N, A, LDA, FACT, LDFACT, IPVT, RCOND, CWK)
        The additional argument is:
        CWK — Complex work vector of length N.
2.      Informational errors
        Type  Code  Description
        3     1     The input matrix is algorithmically singular.
        3     4     The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
        4     2     The input matrix is singular.
        4     4     The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
Example
The inverse of a 3 × 3 complex Hermitian matrix is computed. LFCHF is called to factor the matrix and to
check for singularity or ill-conditioning. LFIHF is called to determine the columns of the inverse.
USE LFCHF_INT
USE UMACH_INT
USE LFIHF_INT
USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       RCOND
      COMPLEX    A(LDA,LDA), AINV(LDA,N), FACT(LDA,LDA), RJ(N), RES(N)
!
!                                 Set values for A
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
      DATA A/(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
!                                 Set output unit number
      CALL UMACH (2, NOUT)
!                                 Factor A and return the reciprocal
!                                 condition number estimate
      CALL LFCHF (A, FACT, IPVT, RCOND)
!                                 Print the estimate of the condition
!                                 number
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0,0.0E0)
      DO 10  J=1, N
         RJ(J) = (1.0E0, 0.0E0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIHF
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
         CALL LFIHF (A, FACT, IPVT, RJ, AINV(:,J), RES)
         RJ(J) = (0.0E0, 0.0E0)
   10 CONTINUE
!                                 Print the inverse
      CALL WRCRN ('AINV', AINV)
!
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
      END
Output
RCOND < 0.25
L1 Condition number <  6.0
                            AINV
                   1                  2                  3
1  ( 0.2000, 0.0000)  ( 0.1200, 0.0400)  ( 0.0800,-0.0400)
2  ( 0.1200,-0.0400)  ( 0.1467, 0.0000)  (-0.1267,-0.0067)
3  ( 0.0800, 0.0400)  (-0.1267, 0.0067)  (-0.0267, 0.0000)
LFTHF
Computes the U DU^H factorization of a complex Hermitian matrix.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the Hermitian linear system. (Input)
Only the upper triangle of A is referenced.
FACT — Complex N by N matrix containing the information about the factorization of the Hermitian
matrix A. (Output)
Only the upper triangle of FACT is used. If A is not needed, A and FACT can share the same storage
locations.
IPVT — Vector of length N containing the pivoting information for the factorization. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTHF (A, FACT, IPVT [, …])
Specific:
The specific interface names are S_LFTHF and D_LFTHF.
FORTRAN 77 Interface
Single:
CALL LFTHF (N, A, LDA, FACT, LDFACT, IPVT)
Double:
The double precision name is DLFTHF.
Description
Routine LFTHF performs a U DU^H factorization of a complex Hermitian indefinite coefficient matrix. The
U DU^H factorization is called the diagonal pivoting factorization.
LFTHF fails if A is singular or very close to a singular matrix.
The U D U^H factors are returned in a form that is compatible with routines LFIHF, LFSHF and LFDHF. To
solve systems of equations with multiple right-hand-side vectors, use LFTHF followed by either LFIHF or
LFSHF called once for each right-hand side. The routine LFDHF can be called to compute the determinant of
the coefficient matrix after LFTHF has performed the factorization.
The underlying code is based on either LINPACK or LAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual.
Comments
Informational errors
Type  Code  Description
3     4     The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
4     2     The input matrix is singular.
4     4     The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
Example
The inverse of a 3 × 3 matrix is computed. LFTHF is called to factor the matrix and check for singularity.
LFSHF is called to determine the columns of the inverse.
      USE LFTHF_INT
      USE LFSHF_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N)
      COMPLEX    A(LDA,LDA), AINV(LDA,N), FACT(LDA,LDA), RJ(N)
!                                  Set values for A
!
!                                  A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                      (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                      (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
      DATA A/(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
!                                  Factor A
      CALL LFTHF (A, FACT, IPVT)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = (0.0E0,0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0, 0.0E0)
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFSHF
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFSHF (FACT, IPVT, RJ, AINV(:,J))
         RJ(J) = (0.0E0, 0.0E0)
   10 CONTINUE
!                                  Print the inverse
      CALL WRCRN ('AINV', AINV)
      END
Output
                        AINV
                  1                   2                   3
1  ( 0.2000, 0.0000)  ( 0.1200, 0.0400)  ( 0.0800,-0.0400)
2  ( 0.1200,-0.0400)  ( 0.1467, 0.0000)  (-0.1267,-0.0067)
3  ( 0.0800, 0.0400)  (-0.1267, 0.0067)  (-0.0267, 0.0000)
LFSHF
Solves a complex Hermitian system of linear equations given the U D U^H factorization of the coefficient
matrix.
Required Arguments
FACT — Complex N by N matrix containing the factorization of the coefficient matrix A as output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
Only the upper triangle of FACT is used.
IPVT — Vector of length N containing the pivoting information for the factorization of A as output from
routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFSHF (FACT, IPVT, B, X [, …])
Specific:
The specific interface names are S_LFSHF and D_LFSHF.
FORTRAN 77 Interface
Single:
CALL LFSHF (N, FACT, LDFACT, IPVT, B, X)
Double:
The double precision name is DLFSHF.
Description
Routine LFSHF computes the solution of a system of linear algebraic equations having a complex Hermitian
indefinite coefficient matrix.
To compute the solution, the coefficient matrix must first undergo a U D U^H factorization. This may be done
by calling either LFCHF or LFTHF.
LFSHF and LFIHF both solve a linear system given its U D U^H factorization. LFIHF generally takes more
time and produces a more accurate answer than LFSHF. Each iteration of the iterative refinement algorithm
used by LFIHF calls LFSHF.
The underlying code is based on either LINPACK or LAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual.
Example
A set of linear systems is solved successively. LFTHF is called to factor the coefficient matrix. LFSHF is called
to compute the three solutions for the three right-hand sides. In this case the coefficient matrix is assumed to
be well-conditioned and correctly scaled. Otherwise, it would be better to call LFCHF to perform the factorization, and LFIHF to compute the solutions.
      USE LFSHF_INT
      USE WRCRN_INT
      USE LFTHF_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), I
      COMPLEX    A(LDA,LDA), B(N,3), X(N,3), FACT(LDA,LDA)
!                                  Set values for A and B
!
!                                  A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                      (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                      (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
!                                  B = (  7.0+32.0i   -6.0+11.0i   -2.0-17.0i )
!                                      (-39.0-21.0i   -5.5-22.5i    4.0+10.0i )
!                                      ( 51.0+ 9.0i   16.0+17.0i   -2.0+12.0i )
!
      DATA A/(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
      DATA B/(7.0,32.0), (-39.0,-21.0), (51.0,9.0), (-6.0,11.0),&
          (-5.5,-22.5), (16.0,17.0), (-2.0,-17.0), (4.0,10.0),&
          (-2.0,12.0)/
!                                  Factor A
      CALL LFTHF (A, FACT, IPVT)
!                                  Solve for the three right-hand sides
      DO 10 I=1, 3
         CALL LFSHF (FACT, IPVT, B(:,I), X(:,I))
   10 CONTINUE
!                                  Print results
      CALL WRCRN ('X', X)
      END
Output
                        X
                 1                 2                 3
1  (  2.00,  1.00)  (  1.00,  0.00)  (  0.00, -1.00)
2  (-10.00, -1.00)  ( -3.00, -4.00)  (  0.00, -2.00)
3  (  3.00,  5.00)  ( -0.50,  3.00)  (  0.00, -3.00)
LFIHF
Uses iterative refinement to improve the solution of a complex Hermitian system of linear equations.
Required Arguments
A — Complex N by N matrix containing the coefficient matrix of the Hermitian linear system. (Input)
Only the upper triangle of A is referenced.
FACT — Complex N by N matrix containing the factorization of the coefficient matrix A as output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
Only the upper triangle of FACT is used.
IPVT — Vector of length N containing the pivoting information for the factorization of A as output from
routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution. (Output)
RES — Complex vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFIHF (A, FACT, IPVT, B, X, RES [, …])
Specific:
The specific interface names are S_LFIHF and D_LFIHF.
FORTRAN 77 Interface
Single:
CALL LFIHF (N, A, LDA, FACT, LDFACT, IPVT, B, X, RES)
Double:
The double precision name is DLFIHF.
Description
Routine LFIHF computes the solution of a system of linear algebraic equations having a complex Hermitian
indefinite coefficient matrix.
Iterative refinement is performed on the solution vector to improve the accuracy. Usually almost all of the
digits in the solution are accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo a U D U^H factorization. This may be done
by calling either LFCHF or LFTHF.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIHF and LFSHF both solve a linear system given its U D U^H factorization. LFIHF generally takes more
time and produces a more accurate answer than LFSHF. Each iteration of the iterative refinement algorithm
used by LFIHF calls LFSHF.
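Because LFIHF returns the residual vector explicitly, a caller can use it as a quick sanity check on the refined solution. The program below is an illustrative sketch only (it is not part of the IMSL example later in this section); it reuses the same matrix and right-hand side as that example and simply reports the largest residual component using the standard Fortran MAXVAL and ABS intrinsics.

      USE LFCHF_INT
      USE LFIHF_INT
      USE UMACH_INT
!                                  Illustrative sketch: factor, refine, and
!                                  report the largest residual component.
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       RCOND
      COMPLEX    A(LDA,LDA), B(N), X(N), FACT(LDA,LDA), RES(N)
      DATA A/(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
      DATA B/(7.0,32.0), (-39.0,-21.0), (51.0,9.0)/
!
      CALL UMACH (2, NOUT)
      CALL LFCHF (A, FACT, IPVT, RCOND)
      CALL LFIHF (A, FACT, IPVT, B, X, RES)
      WRITE (NOUT,*) ' Largest residual component =', MAXVAL(ABS(RES))
      END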
Comments
Informational error
Type  Code  Description
3     3     The input matrix is too ill-conditioned for iterative refinement to be effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving the system each of the first two times by adding 0.2 + 0.2i to the second element.
      USE LFIHF_INT
      USE UMACH_INT
      USE LFCHF_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       RCOND
      COMPLEX    A(LDA,LDA), B(N), X(N), FACT(LDA,LDA), RES(N)
!                                  Set values for A and B
!
!                                  A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                      (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                      (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
!                                  B = (  7.0+32.0i  -39.0-21.0i  51.0+9.0i )
!
      DATA A/(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
      DATA B/(7.0,32.0), (-39.0,-21.0), (51.0,9.0)/
!                                  Set output unit number
      CALL UMACH (2, NOUT)
!                                  Factor A and compute the estimate
!                                  of the reciprocal condition number
      CALL LFCHF (A, FACT, IPVT, RCOND)
      WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
!                                  Solve, then perturb right-hand side
      DO 10 I=1, 3
         CALL LFIHF (A, FACT, IPVT, B, X, RES)
!                                  Print results
         WRITE (NOUT,99999) I
         CALL WRCRN ('X', X, 1, N, 1)
         CALL WRCRN ('RES', RES, 1, N, 1)
         B(2) = B(2) + (0.2E0, 0.2E0)
   10 CONTINUE
!
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
99999 FORMAT (//,' For problem ', I1)
      END
Output
RCOND < 0.25
L1 Condition number <    5.0

For problem 1
                      X
              1                 2                 3
(  2.00,  1.00)   (-10.00, -1.00)   (  3.00,  5.00)

                         RES
                      1                        2                        3
( 2.384E-07,-4.768E-07)  ( 0.000E+00,-3.576E-07)  (-1.421E-14, 1.421E-14)

For problem 2
                      X
              1                 2                 3
( 2.016, 1.032)   (-9.971,-0.971)   ( 2.973, 4.976)

                         RES
                      1                        2                        3
( 2.098E-07,-1.764E-07)  ( 6.231E-07,-1.518E-07)  ( 1.272E-07, 4.005E-07)

For problem 3
                      X
              1                 2                 3
( 2.032, 1.064)   (-9.941,-0.941)   ( 2.947, 4.952)

                         RES
                      1                        2                        3
( 4.196E-07,-3.529E-07)  ( 2.925E-07,-3.632E-07)  ( 2.543E-07, 3.242E-07)
LFDHF
Computes the determinant of a complex Hermitian matrix given the U D U^H factorization of the matrix.
Required Arguments
FACT — Complex N by N matrix containing the factorization of the coefficient matrix A as output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
Only the upper triangle of FACT is used.
IPVT — Vector of length N containing the pivoting information for the factorization of A as output from
routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ ∣DET1∣ < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDHF (FACT, IPVT, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDHF and D_LFDHF.
FORTRAN 77 Interface
Single:
CALL LFDHF (N, FACT, LDFACT, IPVT, DET1, DET2)
Double:
The double precision name is DLFDHF.
Description
Routine LFDHF computes the determinant of a complex Hermitian indefinite coefficient matrix. To compute
the determinant, the coefficient matrix must first undergo a U D U^H factorization. This may be done by calling
either LFCHF or LFTHF. Since det U = ±1, the formula det A = det U det D det U^H = det D is used to
compute the determinant. det D is computed as the product of the determinants of its blocks.
LFDHF is based on the LINPACK routine CSIDI; see Dongarra et al. (1979).
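If a single determinant value is wanted, the mantissa/exponent pair can be combined directly; the split form exists precisely because the product DET1 * 10**DET2 can overflow or underflow when the matrix has a very large or very small determinant. The fragment below is a minimal sketch added here for illustration (it is not part of the original example) and assumes DET1 and DET2 have just been returned by LFDHF.

!                                  Minimal sketch: combine the mantissa and
!                                  exponent returned by LFDHF into one REAL
!                                  value DET. The product can overflow or
!                                  underflow when |DET2| is large.
      REAL       DET
      DET = DET1*10.0E0**DET2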
Example
The determinant is computed for a complex Hermitian 3 × 3 matrix.
      USE LFDHF_INT
      USE LFTHF_INT
      USE UMACH_INT
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       DET1, DET2
      COMPLEX    A(LDA,LDA), FACT(LDA,LDA)
!                                  Set values for A
!
!                                  A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                      (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                      (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
      DATA A/(3.0,0.0), (1.0,1.0), (4.0,0.0), (1.0,-1.0), (2.0,0.0),&
          (-5.0,-1.0), (4.0,0.0), (-5.0,1.0), (-2.0,0.0)/
!                                  Factor A
      CALL LFTHF (A, FACT, IPVT)
!                                  Compute the determinant
      CALL LFDHF (FACT, IPVT, DET1, DET2)
!                                  Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant is', F5.1, ' * 10**', F2.0)
      END
Output
The determinant is -1.5 * 10**2.
LSLTR
Solves a real tridiagonal system of linear equations.
Required Arguments
C — Vector of length N containing the subdiagonal of the tridiagonal matrix in C(2) through C(N).
(Input/Output)
On output C is destroyed.
D — Vector of length N containing the diagonal of the tridiagonal matrix. (Input/Output)
On output D is destroyed.
E — Vector of length N containing the superdiagonal of the tridiagonal matrix in E(1) through E(N − 1).
(Input/Output)
On output E is destroyed.
B — Vector of length N containing the right-hand side of the linear system on entry and the solution vector
on return. (Input/Output)
Optional Arguments
N — Order of the tridiagonal matrix. (Input)
Default: N = size (C,1).
FORTRAN 90 Interface
Generic:
CALL LSLTR (C, D, E, B [, …])
Specific:
The specific interface names are S_LSLTR and D_LSLTR.
FORTRAN 77 Interface
Single:
CALL LSLTR (N, C, D, E, B)
Double:
The double precision name is DLSLTR.
Description
Routine LSLTR factors and solves the real tridiagonal linear system Ax = b. LSLTR is intended just for tridiagonal systems. The coefficient matrix does not have to be symmetric. The algorithm is Gaussian elimination
with partial pivoting for numerical stability. See Dongarra (1979), LINPACK subprograms SGTSL/DGTSL,
for details. When computing on vector or parallel computers the cyclic reduction algorithm, LSLCR, should
be considered as an alternative method to solve the system.
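As an illustration of this storage scheme (the sketch below is an addition for clarity, not part of the original manual), the program writes out in full the tridiagonal matrix represented by the data of the example later in this section, assuming that C(2:N) holds the subdiagonal, D the diagonal, and E(1:N-1) the superdiagonal.

      USE LSLTR_INT
!                                  Illustrative sketch only: the 4 x 4
!                                  tridiagonal system of the example below,
!                                  written out in full.
!
!                                  A = (  6.0  -3.0   0.0   0.0 )     b = (   48.0 )
!                                      (  0.0   4.0   7.0   0.0 )         (  -81.0 )
!                                      (  0.0  -4.0  -4.0  -8.0 )         (  -12.0 )
!                                      (  0.0   0.0   9.0  -9.0 )         ( -144.0 )
!
      REAL       C(4), D(4), E(4), B(4)
      C = (/  0.0,   0.0,  -4.0,    9.0 /)   ! subdiagonal; C(1) is unused
      D = (/  6.0,   4.0,  -4.0,   -9.0 /)   ! main diagonal
      E = (/ -3.0,   7.0,  -8.0,    0.0 /)   ! superdiagonal; E(4) is unused
      B = (/ 48.0, -81.0, -12.0, -144.0 /)
      CALL LSLTR (C, D, E, B)                ! B now holds the solution
      PRINT *, B                             ! expected: 4.0 -8.0 -7.0 9.0
      END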
Comments
Informational error
Type  Code  Description
4     2     An element along the diagonal became exactly zero during execution.
Example
A system of n = 4 linear equations is solved.
      USE LSLTR_INT
      USE WRRRL_INT
!                                  Declaration of variables
      INTEGER    N
      PARAMETER  (N=4)
      REAL       B(N), C(N), D(N), E(N)
      CHARACTER  CLABEL(1)*6, FMT*8, RLABEL(1)*4
!
      DATA FMT/'(E13.6)'/
      DATA CLABEL/'NUMBER'/
      DATA RLABEL/'NONE'/
!                                  C(*), D(*), E(*), and B(*)
!                                  contain the subdiagonal, diagonal,
!                                  superdiagonal and right hand side.
      DATA C/0.0, 0.0, -4.0, 9.0/, D/6.0, 4.0, -4.0, -9.0/
      DATA E/-3.0, 7.0, -8.0, 0.0/, B/48.0, -81.0, -12.0, -144.0/
!
      CALL LSLTR (C, D, E, B)
!                                  Output the solution.
      CALL WRRRL ('Solution:', B, RLABEL, CLABEL, 1, N, 1, FMT=FMT)
      END
Output
Solution:
            1              2              3              4
 0.400000E+01  -0.800000E+01  -0.700000E+01   0.900000E+01
LSLCR
Computes the L D U factorization of a real tridiagonal matrix A using a cyclic reduction algorithm.
Required Arguments
C — Array of size 2N containing the upper codiagonal of the N by N tridiagonal matrix in the entries
C(1), …, C(N − 1). (Input/Output)
A — Array of size 2N containing the diagonal of the N by N tridiagonal matrix in the entries A(1), …, A(N).
(Input/Output)
B — Array of size 2N containing the lower codiagonal of the N by N tridiagonal matrix in the entries
B(1), …, B(N − 1). (Input/Output)
Y — Array of size 2N containing the right hand side for the system Ax = y in the order Y(1), …, Y(N).
(Input/Output)
The vector x overwrites Y in storage.
U — Array of size 2N of flags that indicate any singularities of A. (Output)
A value U(I) = 1. means that a divide by zero would have occurred during the factoring. Otherwise
U(I) = 0.
IR — Array of integers that determine the sizes of loops performed in the cyclic reduction algorithm.
(Output)
IS — Array of integers that determine the sizes of loops performed in the cyclic reduction algorithm. (Output)
The sizes of IR and IS must be at least log2 (N) + 3.
Optional Arguments
N — Order of the matrix. (Input)
N must be greater than zero.
Default: N = size (C,1).
IJOB — Flag to direct the desired factoring or solving step. (Input)
Default: IJOB = 1.
IJOB     Action
1        Factor the matrix A and solve the system Ax = y, where y is stored in array Y.
2        Do the solve step only. Use y from array Y. (The factoring step has already been done.)
3        Factor the matrix A but do not solve a system.
4, 5, 6  Same meaning as with the value IJOB = 3. For efficiency, no error checking is done on the validity of any input value.
FORTRAN 90 Interface
Generic:
CALL LSLCR (C, A, B, Y, U, IR, IS [, …])
Specific:
The specific interface names are S_LSLCR and D_LSLCR.
FORTRAN 77 Interface
Single:
CALL LSLCR (N, C, A, B, IJOB, Y, U, IR, IS)
Double:
The double precision name is DLSLCR.
Description
Routine LSLCR factors and solves the real tridiagonal linear system Ax = y. The matrix is decomposed in the
form A = L D U, where L is unit lower triangular, U is unit upper triangular, and D is diagonal. The algorithm
used for the factorization is effectively that described in Kershaw (1982). More details, tests and experiments
are reported in Hanson (1990).
LSLCR is intended just for tridiagonal systems. The coefficient matrix does not have to be symmetric. The
algorithm amounts to Gaussian elimination, with no pivoting for numerical stability, on the matrix whose
rows and columns are permuted to a new order. See Hanson (1990) for details. The expectation is that LSLCR
will outperform either LSLTR or LSLPB on vector or parallel computers. Its performance may be inferior for
small values of n, on scalar computers, or high-performance computers with non-optimizing compilers.
Example
A system of n = 1000 linear equations is solved. The coefficient matrix is the symmetric matrix of the second
difference operation, and the right-hand-side vector y is the first column of the identity matrix. Note that
a(n,n) = 1. The solution vector will be the first column of the inverse matrix of A. Then a new system is solved
where y is now the last column of the identity matrix. The solution vector for this system will be the last column of the inverse matrix.
      USE LSLCR_INT
      USE UMACH_INT
!                                  Declare variables
      INTEGER    LP, N, N2
      PARAMETER  (LP=12, N=1000, N2=2*N)
      INTEGER    I, IJOB, IR(LP), IS(LP), NOUT
      REAL       A(N2), B(N2), C(N2), U(N2), Y1(N2), Y2(N2)
!                                  Define matrix entries:
      DO 10 I=1, N - 1
         C(I)    = -1.E0
         A(I)    =  2.E0
         B(I)    = -1.E0
         Y1(I+1) =  0.E0
         Y2(I)   =  0.E0
   10 CONTINUE
      A(N)  = 1.E0
      Y1(1) = 1.E0
      Y2(N) = 1.E0
!                                  Obtain decomposition of matrix and
!                                  solve the first system:
      IJOB = 1
      CALL LSLCR (C, A, B, Y1, U, IR, IS, IJOB=IJOB)
!                                  Solve the second system with the
!                                  decomposition ready:
      IJOB = 2
      CALL LSLCR (C, A, B, Y2, U, IR, IS, IJOB=IJOB)
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The value of n is: ', N
      WRITE (NOUT,*) ' Elements 1, n of inverse matrix columns 1 '//&
                     'and n:', Y1(1), Y2(N)
      END
Output
The value of n is:   1000
Elements 1, n of inverse matrix columns 1 and n:   1.00000   1000.000
LSARB
Solves a real system of linear equations in band storage mode with iterative refinement.
Required Arguments
A — (NLCA + NUCA + 1) by N array containing the N by N banded coefficient matrix in band storage mode.
(Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system A^T X = B is solved.
Default: IPATH =1.
FORTRAN 90 Interface
Generic:
CALL LSARB (A, NLCA, NUCA, B, X [, …])
Specific:
The specific interface names are S_LSARB and D_LSARB.
FORTRAN 77 Interface
Single:
CALL LSARB (N, A, LDA, NLCA, NUCA, B, IPATH, X)
Double:
The double precision name is DLSARB.
LSARB
Chapter 1: Linear Systems
324
Description
Routine LSARB solves a system of linear algebraic equations having a real banded coefficient matrix. It first
uses the routine LFCRB to compute an LU factorization of the coefficient matrix and to estimate the condition
number of the matrix. The solution of the linear system is then found using the iterative refinement routine
LFIRB.
LSARB fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the iterative
refinement algorithm fails to converge. These errors occur only if A is singular or very close to a singular
matrix.
If the estimated condition number is greater than 1∕ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSARB solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
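Band storage mode itself is mechanical to construct. The program below is an illustrative sketch, not part of the IMSL documentation; it assumes the packing convention consistent with the band data shown in the example later in this section, namely that element a(i, j) of the full matrix is placed in row NUCA + 1 + i - j, column j of the (NLCA + NUCA + 1) by N band array, and it then solves the same system as that example.

      USE LSARB_INT
      USE WRRRN_INT
!                                  Illustrative sketch: pack a full matrix into
!                                  band storage mode, assuming
!                                  ABAND(NUCA+1+I-J, J) = AFULL(I, J)
!                                  for all (I, J) inside the band, then solve.
      INTEGER    I, J, N, NLCA, NUCA
      PARAMETER  (N=4, NLCA=1, NUCA=1)
      REAL       AFULL(N,N), ABAND(NLCA+NUCA+1,N), B(N), X(N)
      DATA AFULL/ 2.0, -3.0,  0.0,  0.0,&
                 -1.0,  1.0,  0.0,  0.0,&
                  0.0, -2.0, -1.0,  2.0,&
                  0.0,  0.0,  2.0,  1.0/
      DATA B/3.0, 1.0, 11.0, -2.0/
!
      ABAND = 0.0E0
      DO 20 J=1, N
         DO 10 I=MAX(1,J-NUCA), MIN(N,J+NLCA)
            ABAND(NUCA+1+I-J,J) = AFULL(I,J)
   10    CONTINUE
   20 CONTINUE
!                                  ABAND now equals the band-form array A used
!                                  in the example below.
      CALL LSARB (ABAND, NLCA, NUCA, B, X)
      CALL WRRRN ('X', X, 1, N, 1)
      END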
Comments
1. Workspace may be explicitly provided, if desired, by use of L2ARB/DL2ARB. The reference is:
   CALL L2ARB (N, A, LDA, NLCA, NUCA, B, IPATH, X, FACT, IPVT, WK)
   The additional arguments are as follows:
   FACT — Work vector of length (2 * NLCA + NUCA + 1) × N containing the LU factorization of A on output.
   IPVT — Work vector of length N containing the pivoting information for the LU factorization of A on output.
   WK — Work vector of length N.
2. Informational errors
   Type  Code  Description
   3     1     The input matrix is too ill-conditioned. The solution might not be accurate.
   4     2     The input matrix is singular.
3. Integer Options with Chapter 11 Options Manager
   16  This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2ARB the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSARB. Additional memory allocation for FACT and option value restoration are
done automatically in LSARB. Users directly calling L2ARB can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSARB or L2ARB.
Default values for the option are
IVAL(*) = 1, 16, 0, 1.
   17  This option has two values that determine if the L1 condition number is to be computed. Routine
LSARB temporarily replaces IVAL(2) by IVAL(1). The routine L2CRB computes the condition
number if IVAL(2) = 2. Otherwise L2CRB skips this computation. LSARB restores the option.
Default values for the option are IVAL(*) = 1, 2.
Example
A system of four linear equations is solved. The coefficient matrix has real banded form with 1 upper and 1
lower codiagonal. The right-hand-side vector b has four elements.
      USE LSARB_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, N, NLCA, NUCA
      PARAMETER  (LDA=3, N=4, NLCA=1, NUCA=1)
      REAL       A(LDA,N), B(N), X(N)
!                                  Set values for A in band form, and B
!
!                                  A = (  0.0  -1.0  -2.0   2.0 )
!                                      (  2.0   1.0  -1.0   1.0 )
!                                      ( -3.0   0.0   2.0   0.0 )
!
!                                  B = (  3.0   1.0  11.0  -2.0 )
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
          2.0, 1.0, 0.0/
      DATA B/3.0, 1.0, 11.0, -2.0/
!
      CALL LSARB (A, NLCA, NUCA, B, X)
!                                  Print results
      CALL WRRRN ('X', X, 1, N, 1)
!
      END
Output
            X
    1       2       3       4
2.000   1.000  -3.000   4.000
LSLRB
Solves a real system of linear equations in band storage mode without iterative refinement.
Required Arguments
A — (NLCA + NUCA + 1) by N array containing the N by N banded coefficient matrix in band storage mode.
(Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system A^T X = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLRB (A, NLCA, NUCA, B, X [, …])
Specific:
The specific interface names are S_LSLRB and D_LSLRB.
FORTRAN 77 Interface
Single:
CALL LSLRB (N, A, LDA, NLCA, NUCA, B, IPATH, X)
Double:
The double precision name is DLSLRB.
ScaLAPACK Interface
Generic:
CALL LSLRB (A0, NLCA, NUCA, B0, X0 [, …])
Specific:
The specific interface names are S_LSLRB and D_LSLRB.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSLRB solves a system of linear algebraic equations having a real banded coefficient matrix. It first
uses the routine LFCRB to compute an LU factorization of the coefficient matrix and to estimate the condition
number of the matrix. The solution of the linear system is then found using LFSRB. LSLRB fails if U, the
upper triangular part of the factorization, has a zero diagonal element. This occurs only if A is singular or
very close to a singular matrix. If the estimated condition number is greater than 1/ɛ (where ɛ is machine
precision), a warning error is issued. This indicates that very small changes in A can cause very large changes
in the solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that LSARB be
used.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LRB/DL2LRB. The reference is:
   CALL L2LRB (N, A, LDA, NLCA, NUCA, B, IPATH, X, FACT, IPVT, WK)
   The additional arguments are as follows:
   FACT — (2 × NLCA + NUCA + 1) × N array containing the LU factorization of A on output. If A is not needed, A can share the first (NLCA + NUCA + 1) * N storage locations with FACT.
   IPVT — Work vector of length N containing the pivoting information for the LU factorization of A on output.
   WK — Work vector of length N.
2. Informational errors
   Type  Code  Description
   3     1     The input matrix is too ill-conditioned. The solution might not be accurate.
   4     2     The input matrix is singular.
3. Integer Options with Chapter 11 Options Manager
   16  This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2LRB the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSLRB. Additional memory allocation for FACT and option value restoration are
done automatically in LSLRB. Users directly calling L2LRB can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSLRB or L2LRB.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
   17  This option has two values that determine if the L1 condition number is to be computed. Routine
LSLRB temporarily replaces IVAL(2) by IVAL(1). The routine L2CRB computes the condition
number if IVAL(2) = 2. Otherwise L2CRB skips this computation. LSLRB restores the option.
Default values for the option are IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — (2*NLCA + 2*NUCA+1) by MXCOL local matrix containing the local portions of the distributed matrix
A. A contains the N by N banded coefficient matrix in band storage mode. (Input)
B0 — Local vector of length MXCOL containing the local portions of the distributed vector B. B contains
the right-hand side of the linear system. (Input)
X0 — Local vector of length MXCOL containing the local portions of the distributed vector X. X contains
the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXCOL can be obtained through a call to SCALAPACK_GETDIM (see Utilities)
after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example below.
Examples
Example
A system of four linear equations is solved. The coefficient matrix has real banded form with 1 upper and 1
lower codiagonal. The right-hand-side vector b has four elements.
      USE LSLRB_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, N, NLCA, NUCA
      PARAMETER  (LDA=3, N=4, NLCA=1, NUCA=1)
      REAL       A(LDA,N), B(N), X(N)
!                                  Set values for A in band form, and B
!
!                                  A = (  0.0  -1.0  -2.0   2.0 )
!                                      (  2.0   1.0  -1.0   1.0 )
!                                      ( -3.0   0.0   2.0   0.0 )
!
!                                  B = (  3.0   1.0  11.0  -2.0 )
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
          2.0, 1.0, 0.0/
      DATA B/3.0, 1.0, 11.0, -2.0/
!
      CALL LSLRB (A, NLCA, NUCA, B, X)
!                                  Print results
      CALL WRRRN ('X', X, 1, N, 1)
!
      END
Output
            X
    1       2       3       4
2.000   1.000  -3.000   4.000
ScaLAPACK Example
The same system of four linear equations is solved as a distributed computing example. The coefficient
matrix has real banded form with 1 upper and 1 lower codiagonal. The right-hand-side vector b has four elements. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and
unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSLRB_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER            LDA, M, N, NLCA, NUCA, NRA, DESCA(9), DESCX(9)
      INTEGER            INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=6, NLCA=1, NUCA=1)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                  Set values for A and B
         A(1,:) = (/  0.0,  0.0, -3.0,  0.0, -1.0, -3.0/)
         A(2,:) = (/ 10.0, 10.0, 15.0, 10.0,  1.0,  6.0/)
         A(3,:) = (/  0.0,  0.0,  0.0, -5.0,  0.0,  0.0/)
!
         B      = (/ 10.0,  7.0, 45.0, 33.0, -34.0, 31.0/)
      ENDIF
      NRA = NLCA + NUCA + 1
      M = 2*NLCA + 2*NUCA + 1
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(M, N, .FALSE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(M, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Reset MXLDA to M
      MXLDA = M
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, 1, N, 1, MP_NB, 0, 0, MP_ICTXT, 1, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXCOL), X0(MXCOL))
!                                  Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0, 1, .FALSE.)
!                                  Solve the system of equations
      CALL LSLRB (A0, NLCA, NUCA, B0, X0)
!                                  Unmap the results from the distributed
!                                  arrays back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X, 1, .FALSE.)
!                                  Print results.
!                                  Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0) CALL WRRRN ('X', X, 1, N, 1)
      IF(MP_RANK .EQ. 0) DEALLOCATE(A, B, X)
      DEALLOCATE(A0, B0, X0)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
                    X
    1       2       3       4       5       6
1.000   1.600   3.000   2.900  -4.000   5.167
LFCRB
Computes the LU factorization of a real matrix in band storage mode and estimates its L1 condition number.
Required Arguments
A — (NLCA + NUCA + 1) by N array containing the N by N matrix in band storage mode to be factored.
(Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
FACT — (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the matrix A. (Output)
If A is not needed, A can share the first (NLCA + NUCA + 1) * N locations with FACT.
IPVT — Vector of length N containing the pivoting information for the LU factorization. (Output)
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCRB (A, NLCA, NUCA, FACT, IPVT, RCOND [, …])
Specific:
The specific interface names are S_LFCRB and D_LFCRB.
FORTRAN 77 Interface
Single:
CALL LFCRB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, RCOND)
Double:
The double precision name is DLFCRB.
Description
Routine LFCRB performs an LU factorization of a real banded coefficient matrix. It also estimates the condition number of the matrix. The LU factorization is done using scaled partial pivoting. Scaled partial pivoting
differs from partial pivoting in that the pivoting strategy is the same as if each row were scaled to have the
same ∞-norm.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥1 ∥A⁻¹∥1. Since it is expensive to compute
∥A⁻¹∥1, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK and is
described by Cline et al. (1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCRB fails if U, the upper triangular part of the factorization, has a zero diagonal element. This can occur
only if A is singular or very close to a singular matrix. The LU factors are returned in a form that is compatible with routines LFIRB, LFSRB and LFDRB. To solve systems of equations with multiple right-hand-side
vectors, use LFCRB followed by either LFIRB or LFSRB called once for each right-hand side. The routine
LFDRB can be called to compute the determinant of the coefficient matrix after LFCRB has performed the
factorization.
Let F be the matrix FACT, let ml = NLCA and let mu = NUCA. The first ml + mu + 1 rows of F contain the triangular matrix U in band storage form. The lower ml rows of F contain the multipliers needed to reconstruct L^(-1).
The underlying code is based on either LINPACK or LAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual.
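One practical use of RCOND is to decide, before any solves are attempted, whether the factorization should be trusted at working precision. The program below is a hypothetical sketch added here for illustration (it is not an IMSL-supplied example); it factors the same band matrix as the example later in this section and compares RCOND with the standard Fortran 90 EPSILON intrinsic rather than an IMSL machine-constants routine.

      USE LFCRB_INT
      USE UMACH_INT
!                                  Hypothetical sketch: flag the matrix as
!                                  numerically singular when the reciprocal
!                                  condition estimate is no larger than the
!                                  working precision.
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), FACT(LDFACT,N), RCOND
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
          2.0, 1.0, 0.0/
!
      CALL UMACH (2, NOUT)
      CALL LFCRB (A, NLCA, NUCA, FACT, IPVT, RCOND)
      IF (RCOND .LE. EPSILON(1.0E0)) THEN
         WRITE (NOUT,*) ' Matrix is singular to working precision.'
      ELSE
         WRITE (NOUT,*) ' Estimated L1 condition number =', 1.0E0/RCOND
      END IF
      END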
Comments
1. Workspace may be explicitly provided, if desired, by use of L2CRB/DL2CRB. The reference is:
   CALL L2CRB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, RCOND, WK)
   The additional argument is:
   WK — Work vector of length N.
2. Informational errors
   Type  Code  Description
   3     1     The input matrix is algorithmically singular.
   4     2     The input matrix is singular.
Example
The inverse of a 4 × 4 band matrix with one upper and one lower codiagonal is computed. LFCRB is called to
factor the matrix and to check for singularity or ill-conditioning. LFIRB is called to determine the columns of
the inverse.
      USE LFCRB_INT
      USE UMACH_INT
      USE LFIRB_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), AINV(N,N), FACT(LDFACT,N), RCOND, RJ(N), RES(N)
!                                  Set values for A in band form
!
!                                  A = (  0.0  -1.0  -2.0   2.0 )
!                                      (  2.0   1.0  -1.0   1.0 )
!                                      ( -3.0   0.0   2.0   0.0 )
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
          2.0, 1.0, 0.0/
!
      CALL LFCRB (A, NLCA, NUCA, FACT, IPVT, RCOND)
!                                  Print the reciprocal condition number
!                                  and the L1 condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0E0
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFIRB
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFIRB (A, NLCA, NUCA, FACT, IPVT, RJ, AINV(:,J), RES)
         RJ(J) = 0.0E0
   10 CONTINUE
!                                  Print results
      CALL WRRRN ('AINV', AINV)
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND < .07
L1 Condition number = 25.0

             AINV
        1       2       3       4
1  -1.000  -1.000   0.400  -0.800
2  -3.000  -2.000   0.800  -1.600
3   0.000   0.000  -0.200   0.400
4   0.000   0.000   0.400   0.200
LFTRB
Computes the LU factorization of a real matrix in band storage mode.
Required Arguments
A — (NLCA + NUCA + 1) by N array containing the N by N matrix in band storage mode to be factored.
(Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
FACT — (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the matrix A. (Output)
If A is not needed, A can share the first (NLCA + NUCA + 1) * N locations with FACT.
IPVT — Vector of length N containing the pivoting information for the LU factorization. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTRB (A, NLCA, NUCA, FACT [, …])
Specific:
The specific interface names are S_LFTRB and D_LFTRB.
FORTRAN 77 Interface
Single:
CALL LFTRB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT)
Double:
The double precision name is DLFTRB.
Description
Routine LFTRB performs an LU factorization of a real banded coefficient matrix using Gaussian elimination
with partial pivoting. A failure occurs if U, the upper triangular factor, has a zero diagonal element. This can
happen if A is close to a singular matrix. The LU factors are returned in a form that is compatible with routines LFIRB, LFSRB and LFDRB. To solve systems of equations with multiple right-hand-side vectors, use
LFTRB followed by either LFIRB or LFSRB called once for each right-hand side. The routine LFDRB can be
called to compute the determinant of the coefficient matrix after LFTRB has performed the factorization.
Let ml = NLCA, and let mu = NUCA. The first ml + mu + 1 rows of FACT contain the triangular matrix U in band
storage form. The next ml rows of FACT contain the multipliers needed to produce L.
The routine LFTRB is based on the blocked LU factorization algorithm for banded linear systems given in
Du Croz, et al. (1990). Level-3 BLAS invocations were replaced by in-line loops. The blocking factor nb has
the default value 1 in LFTRB. It can be reset to any positive value not exceeding 32.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2TRB/DL2TRB. The reference is:
   CALL L2TRB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, WK)
   The additional argument is:
   WK — Work vector of length N used for scaling.
2. Informational error
   Type  Code  Description
   4     2     The input matrix is singular.
3. Utilities with Chapter 11 Options Manager
   21  The performance of the LU factorization may improve on high-performance computers if the
blocking factor, NB, is increased. The current version of the routine allows NB to be reset to a
value no larger than 32. Default value is NB = 1.
Example
A linear system with multiple right-hand sides is solved. LFTRB is called to factor the coefficient matrix.
LFSRB is called to compute the two solutions for the two right-hand sides. In this case the coefficient matrix
is assumed to be appropriately scaled. Otherwise, it may be better to call routine LFCRB to perform the factorization, and LFIRB to compute the solutions.
      USE LFTRB_INT
      USE LFSRB_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), B(N,2), FACT(LDFACT,N), X(N,2)
!                                  Set values for A in band form, and B
!
!                                  A = (  0.0  -1.0  -2.0   2.0 )
!                                      (  2.0   1.0  -1.0   1.0 )
!                                      ( -3.0   0.0   2.0   0.0 )
!
!                                  B = ( 12.0  -17.0 )
!                                      (-19.0   23.0 )
!                                      (  6.0    5.0 )
!                                      (  8.0    5.0 )
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
          2.0, 1.0, 0.0/
      DATA B/12.0, -19.0, 6.0, 8.0, -17.0, 23.0, 5.0, 5.0/
!                                  Compute factorization
      CALL LFTRB (A, NLCA, NUCA, FACT, IPVT)
!                                  Solve for the two right-hand sides
      DO 10 J=1, 2
         CALL LFSRB (FACT, NLCA, NUCA, IPVT, B(:,J), X(:,J))
   10 CONTINUE
!                                  Print results
      CALL WRRRN ('X', X)
!
      END
Output
         X
        1       2
1   3.000  -8.000
2  -6.000   1.000
3   2.000   1.000
4   4.000   3.000
LFSRB
Solves a real system of linear equations given the LU factorization of the coefficient matrix in band storage
mode.
Required Arguments
FACT — (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the coefficient matrix A as
output from routine LFCRB/DLFCRB or LFTRB/DLFTRB. (Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
routine LFCRB/DLFCRB or LFTRB/DLFTRB. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system A^T X = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFSRB (FACT, NLCA, NUCA, IPVT, B, X [, …])
Specific:
The specific interface names are S_LFSRB and D_LFSRB.
FORTRAN 77 Interface
Single:
CALL LFSRB (N, FACT, LDFACT, NLCA, NUCA, IPVT, B, IPATH, X)
Double:
The double precision name is DLFSRB.
Description
Routine LFSRB computes the solution of a system of linear algebraic equations having a real banded coefficient matrix. To compute the solution, the coefficient matrix must first undergo an LU factorization. This may
be done by calling either LFCRB or LFTRB. The solution to Ax = b is found by solving the banded triangular
systems Ly = b and Ux = y. The forward elimination step consists of solving the system Ly = b by applying the
same permutations and elimination operations to b that were applied to the columns of A in the factorization
routine. The backward substitution step consists of solving the banded triangular system Ux = y for x.
LFSRB and LFIRB both solve a linear system given its LU factorization. LFIRB generally takes more time
and produces a more accurate answer than LFSRB. Each iteration of the iterative refinement algorithm used
by LFIRB calls LFSRB.
The underlying code is based on either LINPACK or LAPACK code depending upon which supporting
libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual.
Example
The inverse is computed for a real banded 4 × 4 matrix with one upper and one lower codiagonal. The input
matrix is assumed to be well-conditioned, hence LFTRB is used rather than LFCRB.
      USE LFSRB_INT
      USE LFTRB_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N)
!                                  Set values for A in band form
!
!                                  A = (  0.0  -1.0  -2.0   2.0 )
!                                      (  2.0   1.0  -1.0   1.0 )
!                                      ( -3.0   0.0   2.0   0.0 )
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
          2.0, 1.0, 0.0/
!
      CALL LFTRB (A, NLCA, NUCA, FACT, IPVT)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0E0
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFSRB
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFSRB (FACT, NLCA, NUCA, IPVT, RJ, AINV(:,J))
         RJ(J) = 0.0E0
   10 CONTINUE
!                                  Print results
      CALL WRRRN ('AINV', AINV)
!
      END
Output
             AINV
        1       2       3       4
1  -1.000  -1.000   0.400  -0.800
2  -3.000  -2.000   0.800  -1.600
3   0.000   0.000  -0.200   0.400
4   0.000   0.000   0.400   0.200
LFIRB
Uses iterative refinement to improve the solution of a real system of linear equations in band storage mode.
Required Arguments
A — (NUCA + NLCA + 1) by N array containing the N by N banded coefficient matrix in band storage mode.
(Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
FACT — (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the matrix A as output from
routines LFCRB/DLFCRB or LFTRB/DLFTRB. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
routine LFCRB/DLFCRB or LFTRB/DLFTRB. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
RES — Vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system A^T X = B is solved.
Default: IPATH =1.
FORTRAN 90 Interface
Generic:
CALL LFIRB (A, NLCA, NUCA, FACT, IPVT, B, X, RES [, …])
Specific:
The specific interface names are S_LFIRB and D_LFIRB.
FORTRAN 77 Interface
Single:
CALL LFIRB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, B, IPATH, X, RES)
Double:
The double precision name is DLFIRB.
Description
Routine LFIRB computes the solution of a system of linear algebraic equations having a real banded coefficient matrix. Iterative refinement is performed on the solution vector to improve the accuracy. Usually
almost all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an LU factorization. This may be done by
calling either LFCRB or LFTRB.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIRB and LFSRB both solve a linear system given its LU factorization. LFIRB generally takes more time
and produces a more accurate answer than LFSRB. Each iteration of the iterative refinement algorithm used
by LFIRB calls LFSRB.
Comments
Informational error
Type  Code  Description
3     2     The input matrix is too ill-conditioned for iterative refinement to be effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving the system each of the first two times by adding 0.5 to the second element.
      USE LFIRB_INT
      USE LFCRB_INT
      USE UMACH_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), B(N), FACT(LDFACT,N), RCOND, RES(N), X(N)
!                                  Set values for A in band form, and B
!
!                                  A = (  0.0  -1.0  -2.0   2.0 )
!                                      (  2.0   1.0  -1.0   1.0 )
!                                      ( -3.0   0.0   2.0   0.0 )
!
!                                  B = (  3.0   5.0   7.0  -9.0 )
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
          2.0, 1.0, 0.0/
      DATA B/3.0, 5.0, 7.0, -9.0/
!
      CALL LFCRB (A, NLCA, NUCA, FACT, IPVT, RCOND)
!                                  Print the reciprocal condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                  Solve the three systems
      DO 10 J=1, 3
         CALL LFIRB (A, NLCA, NUCA, FACT, IPVT, B, X, RES)
!                                  Print results
         CALL WRRRN ('X', X, 1, N, 1)
!                                  Perturb B by adding 0.5 to B(2)
         B(2) = B(2) + 0.5E0
   10 CONTINUE
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND < .07
L1 Condition number = 25.0

          X
    1       2       3       4
2.000   1.000  -5.000   1.000

          X
    1       2       3       4
1.500   0.000  -5.000   1.000

          X
    1       2       3       4
1.000  -1.000  -5.000   1.000
LFDRB
Computes the determinant of a real matrix in band storage mode given the LU factorization of the matrix.
Required Arguments
FACT — (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the matrix A as output from
routine LFTRB/DLFTRB or LFCRB/DLFCRB. (Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization as output from
routine LFTRB/DLFTRB or LFCRB/DLFCRB. (Input)
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ ∣DET1∣ < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDRB (FACT, NLCA, NUCA, IPVT, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDRB and D_LFDRB.
FORTRAN 77 Interface
Single:
CALL LFDRB (N, FACT, LDFACT, NLCA, NUCA, IPVT, DET1, DET2)
Double:
The double precision name is DLFDRB.
Description
Routine LFDRB computes the determinant of a real banded coefficient matrix. To compute the determinant,
the coefficient matrix must first undergo an LU factorization. This may be done by calling either LFCRB or
LFTRB. The formula det A = det L det U is used to compute the determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,

   det U = u(1,1) * u(2,2) * … * u(N,N)

(The matrix U is stored in the upper NUCA + NLCA + 1 rows of FACT as a banded matrix.) Since L is the product of triangular matrices with unit diagonals and of permutation matrices, det L = (-1)^k, where k is the number of pivoting interchanges.
LFDRB is based on the LINPACK routine CGBDI; see Dongarra et al. (1979).
Example
The determinant is computed for a real banded 4 × 4 matrix with one upper and one lower codiagonal.
      USE LFDRB_INT
      USE LFTRB_INT
      USE UMACH_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), DET1, DET2, FACT(LDFACT,N)
!                                  Set values for A in band form
!
!                                  A = (  0.0  -1.0  -2.0   2.0 )
!                                      (  2.0   1.0  -1.0   1.0 )
!                                      ( -3.0   0.0   2.0   0.0 )
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
          2.0, 1.0, 0.0/
!
      CALL LFTRB (A, NLCA, NUCA, FACT, IPVT)
!                                  Compute the determinant
      CALL LFDRB (FACT, NLCA, NUCA, IPVT, DET1, DET2)
!                                  Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
99999 FORMAT (' The determinant of A is ', F6.3, ' * 10**', F2.0)
      END
Output
The determinant of A is  5.000 * 10**0.
LSAQS
Solves a real symmetric positive definite system of linear equations in band symmetric storage mode with
iterative refinement.
Required Arguments
A — NCODA + 1 by N array containing the N by N positive definite band coefficient matrix in band symmetric storage mode. (Input)
NCODA — Number of upper codiagonals of A. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSAQS (A, NCODA, B, X [, …])
Specific:
The specific interface names are S_LSAQS and D_LSAQS.
FORTRAN 77 Interface
Single:
CALL LSAQS (N, A, LDA, NCODA, B, X)
Double:
The double precision name is DLSAQS.
Description
Routine LSAQS solves a system of linear algebraic equations having a real symmetric positive definite band
coefficient matrix. It first uses the routine LFCQS to compute an R^T R Cholesky factorization of the coefficient
matrix and to estimate the condition number of the matrix. R is an upper triangular band matrix. The solution of the linear system is then found using the iterative refinement routine LFIQS.
LSAQS fails if any submatrix of R is not positive definite, if R has a zero diagonal element or if the iterative
refinement algorithm fails to converge. These errors occur only if A is very close to a singular matrix or to a
matrix which is not positive definite.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSAQS solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
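Band symmetric storage mode keeps only the diagonal and the NCODA upper codiagonals. The program below is an illustrative sketch, not part of the IMSL documentation; it assumes the packing convention consistent with the data in the example later in this section, namely that element a(i, j) of the upper triangle (j >= i, j - i <= NCODA) is placed in row NCODA + 1 + i - j, column j of the (NCODA + 1) by N array, and it then solves the same system as that example.

      USE LSAQS_INT
      USE WRRRN_INT
!                                  Illustrative sketch: pack the upper triangle
!                                  of a full symmetric matrix into band
!                                  symmetric storage mode, assuming
!                                  ABAND(NCODA+1+I-J, J) = AFULL(I, J)
!                                  for J >= I and J - I <= NCODA, then solve.
!                                  AFULL is the full matrix represented by the
!                                  band data in the example below.
      INTEGER    I, J, N, NCODA
      PARAMETER  (N=4, NCODA=2)
      REAL       AFULL(N,N), ABAND(NCODA+1,N), B(N), X(N)
      DATA AFULL/ 2.0,  0.0, -1.0,  0.0,&
                  0.0,  4.0,  2.0,  1.0,&
                 -1.0,  2.0,  7.0, -1.0,&
                  0.0,  1.0, -1.0,  3.0/
      DATA B/6.0, -11.0, -11.0, 19.0/
!
      ABAND = 0.0E0
      DO 20 J=1, N
         DO 10 I=MAX(1,J-NCODA), J
            ABAND(NCODA+1+I-J,J) = AFULL(I,J)
   10    CONTINUE
   20 CONTINUE
      CALL LSAQS (ABAND, NCODA, B, X)
!                                  Expected solution: 4.0 -6.0 2.0 9.0
      CALL WRRRN ('X', X, 1, N, 1)
      END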
Comments
1. Workspace may be explicitly provided, if desired, by use of L2AQS/DL2AQS. The reference is:
   CALL L2AQS (N, A, LDA, NCODA, B, X, FACT, WK)
   The additional arguments are as follows:
   FACT — Work vector of length NCODA + 1 by N containing the R^T R factorization of A in band symmetric storage form on output.
   WK — Work vector of length N.
2. Informational errors
   Type  Code  Description
   3     1     The input matrix is too ill-conditioned. The solution might not be accurate.
   4     2     The input matrix is not positive definite.
3. Integer Options with Chapter 11 Options Manager
   16  This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2AQS the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSAQS. Additional memory allocation for FACT and option value restoration are
done automatically in LSAQS.
Users directly calling L2AQS can allocate additional space for FACT and set IVAL(3) and IVAL(4)
so that memory bank conflicts no longer cause inefficiencies. There is no requirement that users
change existing applications that use LSAQS or L2AQS. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
   17  This option has two values that determine if the L1 condition number is to be computed. Routine
LSAQS temporarily replaces IVAL(2) by IVAL(1). The routine L2CQS computes the condition
number if IVAL(2) = 2. Otherwise L2CQS skips this computation. LSAQS restores the option.
Default values for the option are IVAL(*) = 1,2.
Example
A system of four linear equations is solved. The coefficient matrix has real positive definite band form, and
the right-hand-side vector b has four elements.
      USE LSAQS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, N, NCODA
      PARAMETER  (LDA=3, N=4, NCODA=2)
      REAL       A(LDA,N), B(N), X(N)
!                                  Set values for A in band symmetric form, and B
!
!                                  A = (  0.0   0.0  -1.0   1.0 )
!                                      (  0.0   0.0   2.0  -1.0 )
!                                      (  2.0   4.0   7.0   3.0 )
!
!                                  B = (  6.0 -11.0 -11.0  19.0 )
!
      DATA A/2*0.0, 2.0, 2*0.0, 4.0, -1.0, 2.0, 7.0, 1.0, -1.0, 3.0/
      DATA B/6.0, -11.0, -11.0, 19.0/
!                                  Solve A*X = B
      CALL LSAQS (A, NCODA, B, X)
!                                  Print results
      CALL WRRRN ('X', X, 1, N, 1)
!
      END
Output
            X
    1       2       3       4
4.000  -6.000   2.000   9.000
LSLQS
Solves a real symmetric positive definite system of linear equations in band symmetric storage mode without
iterative refinement.
Required Arguments
A — NCODA + 1 by N array containing the N by N positive definite band symmetric coefficient matrix in
band symmetric storage mode. (Input)
NCODA — Number of upper codiagonals of A. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSLQS (A, NCODA, B, X [, …])
Specific:
The specific interface names are S_LSLQS and D_LSLQS.
FORTRAN 77 Interface
Single:
CALL LSLQS (N, A, LDA, NCODA, B, X)
Double:
The double precision name is DLSLQS.
Description
Routine LSLQS solves a system of linear algebraic equations having a real symmetric positive definite band
coefficient matrix. It first uses the routine LFCQS to compute an RTR Cholesky factorization of the coefficient
matrix and to estimate the condition number of the matrix. R is an upper triangular band matrix. The solution of the linear system is then found using the routine LFSQS.
LSLQS fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that LSAQS be used.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LQS/DL2LQS. The reference is:
   CALL L2LQS (N, A, LDA, NCODA, B, X, FACT, WK)
   The additional arguments are as follows:
   FACT — NCODA + 1 by N work array containing the RTR factorization of A in band symmetric form on output.
   If A is not needed, A and FACT can share the same storage locations.
   WK — Work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is too ill-conditioned. The solution might not be accurate.
   4      2      The input matrix is not positive definite.
3. Integer Options with Chapter 11 Options Manager
   16   This option uses four values to solve memory bank conflict (access inefficiency) problems. In routine
        L2LQS the leading dimension of FACT is increased by IVAL(3) when N is a multiple of IVAL(4). The values
        IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLQS. Additional
        memory allocation for FACT and option value restoration are done automatically in LSLQS. Users directly
        calling L2LQS can allocate additional space for FACT and set IVAL(3) and IVAL(4) so that memory bank
        conflicts no longer cause inefficiencies. There is no requirement that users change existing applications
        that use LSLQS or L2LQS. Default values for the option are IVAL(*) = 1, 16, 0, 1.
   17   This option has two values that determine if the L1 condition number is to be computed. Routine LSLQS
        temporarily replaces IVAL(2) by IVAL(1). The routine L2CQS computes the condition number if IVAL(2) = 2.
        Otherwise L2CQS skips this computation. LSLQS restores the option. Default values for the option are
        IVAL(*) = 1, 2.
Example
A system of four linear equations is solved. The coefficient matrix has real positive definite band form and
the right-hand-side vector b has four elements.
      USE LSLQS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, N, NCODA
      PARAMETER  (LDA=3, N=4, NCODA=2)
      REAL       A(LDA,N), B(N), X(N)
!
!                                  Set values for A in band symmetric form, and B
!
!                                  A = ( 0.0   0.0  -1.0   1.0 )
!                                      ( 0.0   0.0   2.0  -1.0 )
!                                      ( 2.0   4.0   7.0   3.0 )
!
!                                  B = ( 6.0  -11.0  -11.0   19.0 )
!
      DATA A/2*0.0, 2.0, 2*0.0, 4.0, -1.0, 2.0, 7.0, 1.0, -1.0, 3.0/
      DATA B/6.0, -11.0, -11.0, 19.0/
!                                  Solve A*X = B
      CALL LSLQS (A, NCODA, B, X)
!                                  Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END

Output

        X
     1        2        3        4
 4.000   -6.000    2.000    9.000
LSLPB
Computes the RTDR Cholesky factorization of a real symmetric positive definite matrix A in codiagonal band
symmetric storage mode. Solves a system Ax = b.
Required Arguments
A — Array containing the N by N positive definite band coefficient matrix and right hand side in codiagonal band symmetric storage mode. (Input/Output)
The number of array columns must be at least NCODA + 2. The number of columns is not an input to this
subprogram.
On output, A contains the solution and factors. See Comments section for details.
NCODA — Number of upper codiagonals of matrix A. (Input)
Must satisfy NCODA ≥ 0 and NCODA < N.
U — Array of flags that indicate any singularities of A, namely loss of positive-definiteness of a leading
minor. (Output)
A value U(I) = 0. means that the leading minor of dimension I is not positive-definite. Otherwise,
U(I) = 1.
Optional Arguments
N — Order of the matrix. (Input)
Must satisfy N > 0.
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Must satisfy LDA ≥ N + NCODA.
Default: LDA = size (A,1).
IJOB — Flag to direct the desired factorization or solving step. (Input)
Default: IJOB = 1.
IJOB     Meaning
1        Factor the matrix A and solve the system Ax = b, where b is stored in column NCODA + 2 of array A.
         The vector x overwrites b in storage.
2        Solve step only. Use b as column NCODA + 2 of A. (The factorization step has already been done.)
         The vector x overwrites b in storage.
3        Factor the matrix A but do not solve a system.
4, 5, 6  Same meaning as with the value IJOB - 3. For efficiency, no error checking is done on the values
         LDA, N, NCODA, and U(*).
FORTRAN 90 Interface
Generic:
CALL LSLPB (A, NCODA, U [, …])
Specific:
The specific interface names are S_LSLPB and D_LSLPB.
FORTRAN 77 Interface
Single:
CALL LSLPB (N, A, LDA, NCODA, IJOB, U)
Double:
The double precision name is DLSLPB.
Description
Routine LSLPB factors and solves the symmetric positive definite banded linear system Ax = b. The matrix is
factored so that A = RTDR, where R is unit upper triangular and D is diagonal. The reciprocals of the diagonal entries of D are computed and saved to make the solving step more efficient. Errors will occur if D has a
non-positive diagonal element. Such events occur only if A is very close to a singular matrix or is not positive
definite.
LSLPB is efficient for problems with a small band width. The particular cases NCODA = 0, 1, 2 are done with
special loops within the code. These cases will give good performance. See Hanson (1989) for details. When
solving tridiagonal systems, NCODA = 1, the cyclic reduction code LSLCR should be considered as an alternative. The expectation is that LSLCR will outperform LSLPB on vector or parallel computers. It may be inferior
on scalar computers or even parallel computers with non-optimizing compilers.
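The codiagonal band symmetric storage mode used by LSLPB differs from the band symmetric mode used by the other ...QS routines. The following sketch, with the hypothetical names PACKPB, AFULL, BRHS, and ACOD, shows one way a full symmetric band matrix and a right-hand side could be packed into this mode; the layout is inferred from the storage picture in the example below and the routine is not part of the library.

!     Illustrative sketch only: packing a full N by N symmetric band matrix
!     AFULL with NCODA codiagonals, together with a right-hand side BRHS,
!     into the N+NCODA by NCODA+2 codiagonal band symmetric storage array
!     ACOD that LSLPB expects.  Column 1 holds the diagonal, column K+1 holds
!     the K-th codiagonal entry AFULL(I-K,I), and column NCODA+2 holds b,
!     everything shifted down by NCODA rows; unreferenced positions are zeroed.
      SUBROUTINE PACKPB (N, NCODA, AFULL, BRHS, ACOD)
      INTEGER  N, NCODA, I, K
      REAL     AFULL(N,N), BRHS(N), ACOD(N+NCODA,NCODA+2)
      ACOD = 0.0
      DO I = 1, N
         ACOD(I+NCODA, 1) = AFULL(I, I)
         DO K = 1, MIN(NCODA, I-1)
            ACOD(I+NCODA, K+1) = AFULL(I-K, I)
         END DO
         ACOD(I+NCODA, NCODA+2) = BRHS(I)
      END DO
      END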
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LPB/DL2LPB. The reference is:
   CALL L2LPB (N, A, LDA, NCODA, IJOB, U, WK)
   The additional argument is:
   WK — Work vector of length NCODA.
2. If IJOB = 1, 3, 4, or 6, A contains the factors R and D on output. These are stored in codiagonal band
   symmetric storage mode. Column 1 of A contains the reciprocals of the diagonal entries of D. Columns 2
   through NCODA + 1 contain the codiagonal values of the unit upper triangular matrix R. If IJOB = 1, 2, 4,
   or 5, the last column of A contains the solution on output, replacing b.
3. Informational error
   Type   Code   Description
   4      2      The input matrix is not positive definite.
Example
A system of four linear equations is solved. The coefficient matrix has real positive definite codiagonal band
form and the right-hand-side vector b has four elements.
      USE LSLPB_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, N, NCODA
      PARAMETER  (N=4, NCODA=2, LDA=N+NCODA)
!
      INTEGER    IJOB
      REAL       A(LDA,NCODA+2), U(N)
      REAL       R(N,N), RT(N,N), D(N,N), WK(N,N), AA(N,N)
!
!                                  Set values for A and right side in
!                                  codiagonal band symmetric form:
!
!                                  A = ( *     *     *      *    )
!                                      ( *     *     *      *    )
!                                      ( 2.0   *     *      6.0  )
!                                      ( 4.0   0.0   *    -11.0  )
!                                      ( 7.0   2.0  -1.0  -11.0  )
!                                      ( 3.0  -1.0   1.0   19.0  )
!
      DATA ((A(I+NCODA,J),I=1,N),J=1,NCODA+2)/2.0, 4.0, 7.0, 3.0, 0.0,&
          0.0, 2.0, -1.0, 0.0, 0.0, -1.0, 1.0, 6.0, -11.0, -11.0,&
          19.0/
      DATA R/16*0.0/, D/16*0.0/, RT/16*0.0/
!                                  Factor and solve A*x = b.
      CALL LSLPB(A, NCODA, U)
!                                  Print results
      CALL WRRRN ('X', A((NCODA+1):,(NCODA+2):), NRA=1, NCA=N, LDA=1)
      END

Output

        X
     1        2        3        4
 4.000   -6.000    2.000    9.000
LFCQS
Computes the RT R Cholesky factorization of a real symmetric positive definite matrix in band symmetric
storage mode and estimates its L1 condition number.
Required Arguments
A — NCODA + 1 by N array containing the N by N positive definite band coefficient matrix in band symmetric storage mode to be factored. (Input)
NCODA — Number of upper codiagonals of A. (Input)
FACT — NCODA + 1 by N array containing the RTR factorization of the matrix A in band symmetric form.
(Output)
If A is not needed, A and FACT can share the same storage locations.
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCQS (A, NCODA, FACT, RCOND [, …])
Specific:
The specific interface names are S_LFCQS and D_LFCQS.
FORTRAN 77 Interface
Single:
CALL LFCQS (N, A, LDA, NCODA, FACT, LDFACT, RCOND)
Double:
The double precision name is DLFCQS.
Description
Routine LFCQS computes an RTR Cholesky factorization and estimates the condition number of a real symmetric positive definite band coefficient matrix. R is an upper triangular band matrix.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥1∥A-1∥1. Since it is expensive to compute
∥A-1∥1, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK
and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCQS fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
The RTR factors are returned in a form that is compatible with routines LFIQS, LFSQS and LFDQS. To solve
systems of equations with multiple right-hand-side vectors, use LFCQS followed by either LFIQS or LFSQS
called once for each right-hand side. The routine LFDQS can be called to compute the determinant of the coefficient matrix after LFCQS has performed the factorization.
LFCQS is based on the LINPACK routine SPBCO; see Dongarra et al. (1979).
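As a minimal illustration of the factor-once pattern described above, the following sketch (with small hypothetical data, not taken from the manual) reuses one LFCQS factorization for two right-hand sides with LFSQS and for the determinant with LFDQS.

!     Minimal usage sketch with hypothetical data: one LFCQS factorization
!     of a 3 x 3 tridiagonal matrix (4 on the diagonal, 1 off it, stored in
!     band symmetric form) reused for two right-hand sides and the determinant.
      USE LFCQS_INT
      USE LFSQS_INT
      USE LFDQS_INT
      INTEGER    LDA, LDFACT, N, NCODA, J
      PARAMETER  (LDA=2, LDFACT=2, N=3, NCODA=1)
      REAL       A(LDA,N), FACT(LDFACT,N), B(N,2), X(N,2), RCOND,&
                 DET1, DET2
      DATA A/0.0, 4.0, 1.0, 4.0, 1.0, 4.0/
      DATA B/5.0, 6.0, 5.0, 4.0, 1.0, 4.0/
!                                  Factor once, then reuse the factors
      CALL LFCQS (A, NCODA, FACT, RCOND)
      DO J=1, 2
         CALL LFSQS (FACT, NCODA, B(:,J), X(:,J))
      END DO
      CALL LFDQS (FACT, NCODA, DET1, DET2)
      PRINT *, 'Solutions: ', X
      PRINT *, 'det(A) = ', DET1, ' * 10**', DET2
      END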
Comments
1. Workspace may be explicitly provided, if desired, by use of L2CQS/DL2CQS. The reference is:
   CALL L2CQS (N, A, LDA, NCODA, FACT, LDFACT, RCOND, WK)
   The additional argument is:
   WK — Work vector of length N.
2. Informational errors
   Type   Code   Description
   3      3      The input matrix is algorithmically singular.
   4      2      The input matrix is not positive definite.
Example
The inverse of a 4 × 4 symmetric positive definite band matrix with one codiagonal is computed. LFCQS is
called to factor the matrix and to check for nonpositive definiteness or ill-conditioning. LFIQS is called to
determine the columns of the inverse.
      USE LFCQS_INT
      USE LFIQS_INT
      USE UMACH_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA, NOUT
      PARAMETER  (LDA=2, LDFACT=2, N=4, NCODA=1)
      REAL       A(LDA,N), AINV(N,N), RCOND, FACT(LDFACT,N),&
                 RES(N), RJ(N)
!
!                                  Set values for A in band symmetric form
!
!                                  A = ( 0.0  1.0  1.0  1.0 )
!                                      ( 2.0  2.5  2.5  2.0 )
!
      DATA A/0.0, 2.0, 1.0, 2.5, 1.0, 2.5, 1.0, 2.0/
!                                  Factor the matrix A
      CALL LFCQS (A, NCODA, FACT, RCOND)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0E0
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFIQS
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFIQS (A, NCODA, FACT, RJ, AINV(:,J), RES)
         RJ(J) = 0.0E0
   10 CONTINUE
!                                  Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
      CALL WRRRN ('AINV', AINV)
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END

Output

RCOND = 0.160
L1 Condition number = 6.239

                  AINV
          1        2        3        4
1    0.6667  -0.3333   0.1667  -0.0833
2   -0.3333   0.6667  -0.3333   0.1667
3    0.1667  -0.3333   0.6667  -0.3333
4   -0.0833   0.1667  -0.3333   0.6667
LFTQS
Computes the RTR Cholesky factorization of a real symmetric positive definite matrix in band symmetric
storage mode.
Required Arguments
A — NCODA + 1 by N array containing the N by N positive definite band coefficient matrix in band symmetric storage mode to be factored. (Input)
NCODA — Number of upper codiagonals of A. (Input)
FACT — NCODA + 1 by N array containing the RT R factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTQS (A, NCODA, FACT [, …])
Specific:
The specific interface names are S_LFTQS and D_LFTQS.
FORTRAN 77 Interface
Single:
CALL LFTQS (N, A, LDA, NCODA, FACT, LDFACT)
Double:
The double precision name is DLFTQS.
Description
Routine LFTQS computes an RT R Cholesky factorization of a real symmetric positive definite band coefficient matrix. R is an upper triangular band matrix.
LFTQS fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A is very close to a singular matrix or to a matrix which is not positive definite.
The RT R factors are returned in a form that is compatible with routines LFIQS, LFSQS and LFDQS. To solve
systems of equations with multiple right-hand-side vectors, use LFTQS followed by either LFIQS or LFSQS
called once for each right-hand side. The routine LFDQS can be called to compute the determinant of the coefficient matrix after LFTQS has performed the factorization.
LFTQS is based on the LINPACK routine CPBFA; see Dongarra et al. (1979).
Comments
Informational error
Type   Code   Description
4      2      The input matrix is not positive definite.
Example
The inverse of a 4 × 4 symmetric positive definite band matrix with one codiagonal is computed. LFTQS is called to factor the matrix and to check for nonpositive
definiteness. LFSQS is called to determine the columns of the inverse.
      USE LFTQS_INT
      USE WRRRN_INT
      USE LFSQS_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=2, LDFACT=2, N=4, NCODA=1)
      REAL       A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N)
!
!                                  Set values for A in band symmetric form
!
!                                  A = ( 0.0  1.0  1.0  1.0 )
!                                      ( 2.0  2.5  2.5  2.0 )
!
      DATA A/0.0, 2.0, 1.0, 2.5, 1.0, 2.5, 1.0, 2.0/
!                                  Factor the matrix A
      CALL LFTQS (A, NCODA, FACT)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = 0.0E0
      DO 10 J=1, N
         RJ(J) = 1.0E0
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFSQS
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFSQS (FACT, NCODA, RJ, AINV(:,J))
         RJ(J) = 0.0E0
   10 CONTINUE
!                                  Print the results
      CALL WRRRN ('AINV', AINV, ITRING=1)
      END
Output
                  AINV
          1        2        3        4
1    0.6667  -0.3333   0.1667  -0.0833
2             0.6667  -0.3333   0.1667
3                      0.6667  -0.3333
4                               0.6667
LFSQS
Solves a real symmetric positive definite system of linear equations given the factorization of the coefficient
matrix in band symmetric storage mode.
Required Arguments
FACT — NCODA + 1 by N array containing the RT R factorization of the positive definite band matrix A in
band symmetric storage mode as output from subroutine LFCQS/DLFCQS or LFTQS/DLFTQS. (Input)
NCODA — Number of upper codiagonals of A. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFSQS (FACT, NCODA, B, X [, …])
Specific:
The specific interface names are S_LFSQS and D_LFSQS.
FORTRAN 77 Interface
Single:
CALL LFSQS (N, FACT, LDFACT, NCODA, B, X)
Double:
The double precision name is DLFSQS.
Description
Routine LFSQS computes the solution for a system of linear algebraic equations having a real symmetric positive definite band coefficient matrix. To compute the solution, the coefficient matrix must first undergo an
RT R factorization. This may be done by calling either LFCQS or LFTQS. R is an upper triangular band matrix.
The solution to Ax = b is found by solving the triangular systems RTy = b and Rx = y.
LFSQS and LFIQS both solve a linear system given its RT R factorization. LFIQS generally takes more time
and produces a more accurate answer than LFSQS. Each iteration of the iterative refinement algorithm used
by LFIQS calls LFSQS.
LFSQS is based on the LINPACK routine SPBSL; see Dongarra et al. (1979).
Comments
Informational error
Type   Code   Description
4      1      The factored matrix is singular.
Example
A set of linear systems is solved successively. LFTQS is called to factor the coefficient matrix. LFSQS is called
to compute the four solutions for the four right-hand sides. In this case the coefficient matrix is assumed to be
well-conditioned and correctly scaled. Otherwise, it would be better to call LFCQS to perform the factorization, and LFIQS to compute the solutions.
      USE LFSQS_INT
      USE LFTQS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=3, LDFACT=3, N=4, NCODA=2)
      REAL       A(LDA,N), B(N,4), FACT(LDFACT,N), X(N,4)
!
!                                  Set values for A in band symmetric form, and B
!
!                                  A = (  0.0   0.0  -1.0   1.0 )
!                                      (  0.0   0.0   2.0  -1.0 )
!                                      (  2.0   4.0   7.0   3.0 )
!
!                                  B = (  4.0  -3.0   9.0  -1.0 )
!                                      (  6.0  10.0  29.0   3.0 )
!                                      ( 15.0  12.0  11.0   6.0 )
!                                      ( -7.0   1.0  14.0   2.0 )
!
      DATA A/2*0.0, 2.0, 2*0.0, 4.0, -1.0, 2.0, 7.0, 1.0, -1.0, 3.0/
      DATA B/4.0, 6.0, 15.0, -7.0, -3.0, 10.0, 12.0, 1.0, 9.0, 29.0,&
          11.0, 14.0, -1.0, 3.0, 6.0, 2.0/
!                                  Factor the matrix A
      CALL LFTQS (A, NCODA, FACT)
!                                  Compute the solutions
      DO 10 I=1, 4
         CALL LFSQS (FACT, NCODA, B(:,I), X(:,I))
   10 CONTINUE
!                                  Print solutions
      CALL WRRRN ('X', X)
      END
Output
                X
         1       2       3       4
1    3.000  -1.000   5.000   0.000
2    1.000   2.000   6.000   0.000
3    2.000   1.000   1.000   1.000
4   -2.000   0.000   3.000   1.000
LFIQS
Uses iterative refinement to improve the solution of a real symmetric positive definite system of linear equations in band symmetric storage mode.
Required Arguments
A — NCODA + 1 by N array containing the N by N positive definite band coefficient matrix in band symmetric storage mode. (Input)
NCODA — Number of upper codiagonals of A. (Input)
FACT — NCODA + 1 by N array containing the RT R factorization of the matrix A from routine
LFCQS/DLFCQS or LFTQS/DLFTQS. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the system. (Output)
RES — Vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFIQS (A, NCODA, FACT, B, X, RES [, …])
Specific:
The specific interface names are S_LFIQS and D_LFIQS.
FORTRAN 77 Interface
Single:
CALL LFIQS (N, A, LDA, NCODA, FACT, LDFACT, B, X, RES)
Double:
The double precision name is DLFIQS.
Description
Routine LFIQS computes the solution of a system of linear algebraic equations having a real symmetric positive-definite band coefficient matrix. Iterative refinement is performed on the solution vector to improve the
accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an RT R factorization. This may be done by
calling either IMSL routine LFCQS or LFTQS.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIQS and LFSQS both solve a linear system given its RT R factorization. LFIQS generally takes more time
and produces a more accurate answer than LFSQS. Each iteration of the iterative refinement algorithm used
by LFIQS calls LFSQS.
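A minimal sketch of the factor-and-refine pattern follows; the 3 × 3 data are hypothetical, and the residual vector RES returned by LFIQS is printed as a simple accuracy check.

!     Minimal usage sketch with hypothetical data: factor with LFCQS, refine
!     the solution with LFIQS, and inspect the returned residual RES.
!     A is the band symmetric form of a tridiagonal matrix with 4 on the
!     diagonal and 1 on the codiagonal.
      USE LFCQS_INT
      USE LFIQS_INT
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=2, LDFACT=2, N=3, NCODA=1)
      REAL       A(LDA,N), FACT(LDFACT,N), B(N), X(N), RES(N), RCOND
      DATA A/0.0, 4.0, 1.0, 4.0, 1.0, 4.0/
      DATA B/5.0, 6.0, 5.0/
      CALL LFCQS (A, NCODA, FACT, RCOND)
      CALL LFIQS (A, NCODA, FACT, B, X, RES)
      PRINT *, 'X   = ', X
!                                  residual at the improved solution
      PRINT *, 'RES = ', RES
      END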
Comments
Informational error
Type   Code   Description
3      4      The input matrix is too ill-conditioned for iterative refinement to be effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving the system each of the first two times by adding 0.5 to the second element.
      USE LFIQS_INT
      USE UMACH_INT
      USE LFCQS_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA, NOUT
      PARAMETER  (LDA=2, LDFACT=2, N=4, NCODA=1)
      REAL       A(LDA,N), B(N), RCOND, FACT(LDFACT,N), RES(N,3),&
                 X(N,3)
!
!                                  Set values for A in band symmetric form, and B
!
!                                  A = ( 0.0  1.0  1.0  1.0 )
!                                      ( 2.0  2.5  2.5  2.0 )
!
!                                  B = ( 3.0  5.0  7.0  4.0 )
!
      DATA A/0.0, 2.0, 1.0, 2.5, 1.0, 2.5, 1.0, 2.0/
      DATA B/3.0, 5.0, 7.0, 4.0/
!
!                                  Factor the matrix A
      CALL LFCQS (A, NCODA, FACT, RCOND)
!                                  Print the estimated condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                  Compute the solutions
      DO 10 I=1, 3
         CALL LFIQS (A, NCODA, FACT, B, X(:,I), RES(:,I))
         B(2) = B(2) + 0.5E0
   10 CONTINUE
!                                  Print solutions and residuals
      CALL WRRRN ('X', X)
      CALL WRRRN ('RES', RES)
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND = 0.160
L1 Condition number = 6.239

              X
        1       2       3
1   1.167   1.000   0.833
2   0.667   1.000   1.333
3   2.167   2.000   1.833
4   0.917   1.000   1.083

                    RES
            1           2           3
1   7.947E-08   0.000E+00   9.934E-08
2   7.947E-08   0.000E+00   3.974E-08
3   7.947E-08   0.000E+00   1.589E-07
4  -3.974E-08   0.000E+00  -7.947E-08
LFDQS
Computes the determinant of a real symmetric positive definite matrix given the RTR Cholesky factorization
of the matrix in band symmetric storage mode.
Required Arguments
FACT — NCODA + 1 by N array containing the RT R factorization of the positive definite band matrix, A, in
band symmetric storage mode as output from subroutine LFCQS/DLFCQS or LFTQS/DLFTQS. (Input)
NCODA — Number of upper codiagonals of A. (Input)
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDQS (FACT, NCODA, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDQS and D_LFDQS.
FORTRAN 77 Interface
Single:
CALL LFDQS (N, FACT, LDFACT, NCODA, DET1, DET2)
Double:
The double precision name is DLFDQS.
Description
Routine LFDQS computes the determinant of a real symmetric positive-definite band coefficient matrix. To
compute the determinant, the coefficient matrix must first undergo an RT R factorization. This may be done
by calling either IMSL routine LFCQS or LFTQS. The formula
det A = det RT det R = (det R)2 is used to compute the determinant. Since the determinant of a triangular
matrix is the product of its diagonal elements,

                              det R = R(1,1) * R(2,2) * … * R(N,N)

LFDQS is based on the LINPACK routine SPBDI; see Dongarra et al. (1979).
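The following sketch is illustrative only and is not the LFDQS source. It shows how (det R)2 can be accumulated from the diagonal of an RTR factor while keeping the normalized DET1 * 10**DET2 form described above; it assumes the diagonal of R occupies row NCODA + 1 of the band symmetric factor, and the tiny hand-coded FACT is hypothetical.

!     Illustrative sketch: det(A) = (det R)**2 accumulated with mantissa/
!     exponent normalization.  Assumes the diagonal of R is row NCODA+1 of
!     the band symmetric factor FACT; the 1 by 2 FACT is hypothetical.
      INTEGER    N, NCODA, I
      PARAMETER  (N=2, NCODA=0)
      REAL       FACT(NCODA+1,N), DET1, DET2
      DATA FACT/3.0, 4.0/
      DET1 = 1.0
      DET2 = 0.0
      DO I = 1, N
         DET1 = DET1*FACT(NCODA+1,I)**2
!                                  renormalize so that 1.0 <= |DET1| < 10.0
         DO WHILE (ABS(DET1) >= 10.0)
            DET1 = DET1/10.0
            DET2 = DET2 + 1.0
         END DO
         DO WHILE (DET1 /= 0.0 .AND. ABS(DET1) < 1.0)
            DET1 = DET1*10.0
            DET2 = DET2 - 1.0
         END DO
      END DO
!                                  for diag(R) = (3, 4) this prints 1.44 * 10**2
      PRINT *, 'det(A) = ', DET1, ' * 10**', DET2
      END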
Example
The determinant is computed for a real positive definite 4 × 4 matrix with 2 codiagonals.
      USE LFDQS_INT
      USE LFTQS_INT
      USE UMACH_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA, NOUT
      PARAMETER  (LDA=3, N=4, LDFACT=3, NCODA=2)
      REAL       A(LDA,N), DET1, DET2, FACT(LDFACT,N)
!
!                                  Set values for A in band symmetric form
!
!                                  A = ( 0.0  0.0  1.0  -2.0 )
!                                      ( 0.0  2.0  1.0   3.0 )
!                                      ( 7.0  6.0  6.0   8.0 )
!
      DATA A/2*0.0, 7.0, 0.0, 2.0, 6.0, 1.0, 1.0, 6.0, -2.0, 3.0, 8.0/
!                                  Factor the matrix
      CALL LFTQS (A, NCODA, FACT)
!                                  Compute the determinant
      CALL LFDQS (FACT, NCODA, DET1, DET2)
!                                  Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
      END
Output
The determinant of A is 1.186 * 10**3.
LSLTQ
Solves a complex tridiagonal system of linear equations.
Required Arguments
C — Complex vector of length N containing the subdiagonal of the tridiagonal matrix in C(2) through C(N).
(Input/Output)
On output C is destroyed.
D — Complex vector of length N containing the diagonal of the tridiagonal matrix. (Input/Output)
On output D is destroyed.
E — Complex vector of length N containing the superdiagonal of the tridiagonal matrix in E(1) through
E(N - 1). (Input/Output)
On output E is destroyed.
B — Complex vector of length N containing the right-hand side of the linear system on entry and the solution vector on return. (Input/Output)
Optional Arguments
N — Order of the tridiagonal matrix. (Input)
Default: N = size (C,1).
FORTRAN 90 Interface
Generic:
CALL LSLTQ (C, D, E, B [, …])
Specific:
The specific interface names are S_LSLTQ and D_LSLTQ.
FORTRAN 77 Interface
Single:
CALL LSLTQ (N, C, D, E, B)
Double:
The double precision name is DLSLTQ.
Description
Routine LSLTQ factors and solves the complex tridiagonal linear system Ax = b. LSLTQ is intended just for
tridiagonal systems. The coefficient matrix does not have to be symmetric. The algorithm is Gaussian elimination with pivoting for numerical stability. See Dongarra et al. (1979), LINPACK subprograms
CGTSL/ZGTSL, for details. When computing on vector or parallel computers the cyclic reduction algorithm,
LSLCQ, should be considered as an alternative method to solve the system.
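The storage convention for C, D, and E can be made concrete with a short routine that forms the tridiagonal product y = Ax; the sketch below is an illustration only (the name TRIMUL is hypothetical) and is convenient for checking a solution from LSLTQ against the original right-hand side. Note that LSLTQ destroys C, D, and E, so copies must be kept for such a check.

!     Illustrative sketch: y = A*x for a tridiagonal A stored as the
!     subdiagonal C(2:N), diagonal D(1:N) and superdiagonal E(1:N-1),
!     following the convention described above.
      SUBROUTINE TRIMUL (N, C, D, E, X, Y)
      INTEGER  N, I
      COMPLEX  C(N), D(N), E(N), X(N), Y(N)
      DO I = 1, N
         Y(I) = D(I)*X(I)
         IF (I .GT. 1) Y(I) = Y(I) + C(I)*X(I-1)
         IF (I .LT. N) Y(I) = Y(I) + E(I)*X(I+1)
      END DO
      END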
Comments
Informational error
Type   Code   Description
4      2      An element along the diagonal became exactly zero during execution.
Example
A system of n = 4 linear equations is solved.
      USE LSLTQ_INT
      USE WRCRL_INT
!                                  Declaration of variables
      INTEGER    N
      PARAMETER  (N=4)
      COMPLEX    B(N), C(N), D(N), E(N)
      CHARACTER  CLABEL(1)*6, FMT*8, RLABEL(1)*4
!
      DATA FMT/'(E13.6)'/
      DATA CLABEL/'NUMBER'/
      DATA RLABEL/'NONE'/
!                                  C(*), D(*), E(*) and B(*)
!                                  contain the subdiagonal,
!                                  diagonal, superdiagonal and
!                                  right hand side.
      DATA C/(0.0,0.0), (-9.0,3.0), (2.0,7.0), (7.0,-4.0)/
      DATA D/(3.0,-5.0), (4.0,-9.0), (-5.0,-7.0), (-2.0,-3.0)/
      DATA E/(-9.0,8.0), (1.0,8.0), (8.0,3.0), (0.0,0.0)/
      DATA B/(-16.0,-93.0), (128.0,179.0), (-60.0,-12.0), (9.0,-108.0)/
!
      CALL LSLTQ (C, D, E, B)
!                                  Output the solution.
      CALL WRCRL ('Solution:', B, RLABEL, CLABEL, 1, N, 1, FMT=FMT)
      END
Output
Solution:
                            1                              2
(-0.400000E+01,-0.700000E+01)  (-0.700000E+01, 0.400000E+01)
                            3                              4
( 0.700000E+01,-0.700000E+01)  ( 0.900000E+01, 0.200000E+01)
LSLCQ
Computes the LDU factorization of a complex tridiagonal matrix A using a cyclic reduction algorithm.
Required Arguments
C — Complex array of size 2N containing the upper codiagonal of the N by N tridiagonal matrix in the
entries C(1), …, C(N − 1). (Input/Output)
A — Complex array of size 2N containing the diagonal of the N by N tridiagonal matrix in the entries
A(1), …, A(N). (Input/Output)
B — Complex array of size 2N containing the lower codiagonal of the N by N tridiagonal matrix in the
entries B(1), …, B(N − 1). (Input/Output)
Y — Complex array of size 2N containing the right-hand side of the system Ax = y in the order
Y(1), …, Y(N). (Input/Output)
The vector x overwrites Y in storage.
U — Real array of size 2N of flags that indicate any singularities of A. (Output)
A value U(I) = 1. means that a divide by zero would have occurred during the factoring. Otherwise
U(I) = 0.
IR — Array of integers that determine the sizes of loops performed in the cyclic reduction algorithm.
(Output)
IS — Array of integers that determine the sizes of loops performed in the cyclic reduction algorithm. (Output)
The sizes of these arrays must be at least log2 (N) + 3.
Optional Arguments
N — Order of the matrix. (Input)
N must be greater than zero.
Default: N = size (C,1).
IJOB — Flag to direct the desired factoring or solving step. (Input)
Default: IJOB = 1.
IJOB   Action
1      Factor the matrix A and solve the system Ax = y, where y is stored in array Y.
2      Do the solve step only. Use y from array Y. (The factoring step has already been done.)
3      Factor the matrix A but do not solve a system.
4      Same meaning as with the value IJOB = 3. For efficiency, no error checking is done on the validity
       of any input value.
FORTRAN 90 Interface
Generic:
CALL LSLCQ (C, A, B, Y, U, IR, IS [, …])
Specific:
The specific interface names are S_LSLCQ and D_LSLCQ.
FORTRAN 77 Interface
Single:
CALL LSLCQ (N, C, A, B, IJOB, Y, U, IR, IS)
Double:
The double precision name is DLSLCQ.
Description
Routine LSLCQ factors and solves the complex tridiagonal linear system Ax = y. The matrix is decomposed in
the form A = LDU, where L is unit lower triangular, U is unit upper triangular, and D is diagonal. The algorithm used for the factorization is effectively that described in Kershaw (1982). More details, tests and
experiments are reported in Hanson (1990).
LSLCQ is intended just for tridiagonal systems. The coefficient matrix does not have to be Hermitian. The
algorithm amounts to Gaussian elimination, with no pivoting for numerical stability, on the matrix whose
rows and columns are permuted to a new order. See Hanson (1990) for details. The expectation is that LSLCQ
will outperform either LSLTQ or LSLQB on vector or parallel computers. Its performance may be inferior for
small values of n, on scalar computers, or high-performance computers with non-optimizing compilers.
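Before the full example below, the following minimal sketch (with a small hypothetical 3 × 3 system) shows the basic factor/solve pattern: IJOB = 1 factors and solves, and IJOB = 2 reuses the factorization held in C, A, and B for a new right-hand side.

!     Minimal usage sketch, hypothetical data: a 3 x 3 complex tridiagonal
!     system with 4 on the diagonal and 1 on both codiagonals.  Arrays are
!     sized 2N, and IR, IS are sized at least log2(N) + 3 as required above.
      USE LSLCQ_INT
      INTEGER    N, LP
      PARAMETER  (N=3, LP=5)
      INTEGER    IR(LP), IS(LP)
      REAL       U(2*N)
      COMPLEX    A(2*N), B(2*N), C(2*N), Y(2*N)
      DATA C/(1.0,0.0), (1.0,0.0), 4*(0.0,0.0)/
      DATA A/(4.0,0.0), (4.0,0.0), (4.0,0.0), 3*(0.0,0.0)/
      DATA B/(1.0,0.0), (1.0,0.0), 4*(0.0,0.0)/
      DATA Y/(6.0,0.0), (6.0,0.0), (5.0,0.0), 3*(0.0,0.0)/
!                                  Factor and solve; x overwrites Y(1:N)
      CALL LSLCQ (C, A, B, Y, U, IR, IS, N=N, IJOB=1)
      PRINT *, Y(1:N)
!                                  Re-solve with a new right-hand side,
!                                  reusing the factorization in C, A, B
      Y(1:N) = (1.0,0.0)
      CALL LSLCQ (C, A, B, Y, U, IR, IS, N=N, IJOB=2)
      PRINT *, Y(1:N)
      END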
Example
A real skew-symmetric tridiagonal matrix, A, of dimension n = 1000 is given by ck = −k, ak = 0, and
bk = k, k = 1, …, n − 1, an = 0. This matrix will have eigenvalues that are purely imaginary. The eigenvalue
closest to the imaginary unit is required. This number is obtained by using inverse iteration to approximate a
complex eigenvector y. The eigenvalue is approximated by Ȝ \+ $\ \+ \. (This example is contrived in
the sense that the given tridiagonal skew-symmetric matrix eigenvalue problem is essentially equivalent to
the tridiagonal symmetic eigenvalue problem where the ck = k and the other data are unchanged.)
      USE LSLCQ_INT
      USE UMACH_INT
!                                  Declare variables
      INTEGER    LP, N, N2
      PARAMETER  (LP=12, N=1000, N2=2*N)
!
      INTEGER    I, IJOB, IR(LP), IS(LP), K, NOUT
      REAL       AIMAG, U(N2)
      COMPLEX    A(N2), B(N2), C(N2), CMPLX, CONJG, S, T, Y(N2)
      INTRINSIC  AIMAG, CMPLX, CONJG
!
!                                  Define entries of skew-symmetric
!                                  matrix, A:
      DO 10 I=1, N - 1
         C(I) = -I
!                                  This amounts to subtracting the
!                                  positive imaginary unit from the
!                                  diagonal. (The eigenvalue closest
!                                  to this value is desired.)
         A(I) = CMPLX(0.E0,-1.0E0)
         B(I) = I
!                                  This initializes the approximate
!                                  eigenvector.
         Y(I) = 1.E0
   10 CONTINUE
      A(N) = CMPLX(0.E0,-1.0E0)
      Y(N) = 1.E0
!
!                                  First step of inverse iteration
!                                  follows. Obtain decomposition of
!                                  matrix and solve the first system:
      IJOB = 1
      CALL LSLCQ (C, A, B, Y, U, IR, IS, N=N, IJOB=IJOB)
!
!                                  Next steps of inverse iteration
!                                  follow. Solve the system again with
!                                  the decomposition ready:
      IJOB = 2
      DO 20 K=1, 3
         CALL LSLCQ (C, A, B, Y, U, IR, IS, N=N, IJOB=IJOB)
   20 CONTINUE
!
!                                  Compute the Raleigh quotient to
!                                  estimate the eigenvalue closest to
!                                  the positive imaginary unit. After
!                                  the approximate eigenvector, y, is
!                                  computed, the estimate of the
!                                  eigenvalue is ctrans(y)*A*y/t,
!                                  where t = ctrans(y)*y.
      S = -CONJG(Y(1))*Y(2)
      T = CONJG(Y(1))*Y(1)
      DO 30 I=2, N - 1
         S = S + CONJG(Y(I))*((I-1)*Y(I-1)-I*Y(I+1))
         T = T + CONJG(Y(I))*Y(I)
   30 CONTINUE
      S = S + CONJG(Y(N))*(N-1)*Y(N-1)
      T = T + CONJG(Y(N))*Y(N)
      S = S/T
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The value of n is: ', N
      WRITE (NOUT,*) ' Value of approximate imaginary eigenvalue:',&
                     AIMAG(S)
      STOP
      END

Output

The value of n is:       1000
Value of approximate imaginary eigenvalue:   1.03811
LSACB
Solves a complex system of linear equations in band storage mode with iterative refinement.
Required Arguments
A — Complex NLCA + NUCA + 1 by N array containing the N by N banded coefficient matrix in band storage
mode. (Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AHX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSACB (A, NLCA, NUCA, B, X [, …])
Specific:
The specific interface names are S_LSACB and D_LSACB.
FORTRAN 77 Interface
Single:
CALL LSACB (N, A, LDA, NLCA, NUCA, B, IPATH, X)
Double:
The double precision name is DLSACB.
Description
Routine LSACB solves a system of linear algebraic equations having a complex banded coefficient matrix. It
first uses the routine LFCCB to compute an LU factorization of the coefficient matrix and to estimate the condition number of the matrix. The solution of the linear system is then found using the iterative refinement
routine LFICB.
LSACB fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the iterative
refinement algorithm fails to converge. These errors occur only if A is singular or very close to a singular
matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSACB solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2ACB/DL2ACB. The reference is:
   CALL L2ACB (N, A, LDA, NLCA, NUCA, B, IPATH, X, FACT, IPVT, WK)
   The additional arguments are as follows:
   FACT — Complex work vector of length (2 * NLCA + NUCA + 1) * N containing the LU factorization of A on output.
   IPVT — Integer work vector of length N containing the pivoting information for the LU factorization of A on output.
   WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
   3      3      The input matrix is too ill-conditioned. The solution might not be accurate.
   4      2      The input matrix is singular.
3. Integer Options with Chapter 11 Options Manager
   16   This option uses four values to solve memory bank conflict (access inefficiency) problems. In routine
        L2ACB the leading dimension of FACT is increased by IVAL(3) when N is a multiple of IVAL(4). The values
        IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSACB. Additional
        memory allocation for FACT and option value restoration are done automatically in LSACB. Users directly
        calling L2ACB can allocate additional space for FACT and set IVAL(3) and IVAL(4) so that memory bank
        conflicts no longer cause inefficiencies. There is no requirement that users change existing applications
        that use LSACB or L2ACB. Default values for the option are IVAL(*) = 1, 16, 0, 1.
   17   This option has two values that determine if the L1 condition number is to be computed. Routine LSACB
        temporarily replaces IVAL(2) by IVAL(1). The routine L2CCB computes the condition number if IVAL(2) = 2.
        Otherwise L2CCB skips this computation. LSACB restores the option. Default values for the option are
        IVAL(*) = 1, 2.
Example
A system of four linear equations is solved. The coefficient matrix has complex banded form with one upper
and one lower codiagonal. The right-hand-side vector b has four elements.
      USE LSACB_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, N, NLCA, NUCA
      PARAMETER  (LDA=3, N=4, NLCA=1, NUCA=1)
      COMPLEX    A(LDA,N), B(N), X(N)
!
!                                  Set values for A in band form, and B
!
!                                  A = (  0.0+0.0i   4.0+0.0i  -2.0+2.0i  -4.0-1.0i )
!                                      ( -2.0-3.0i  -0.5+3.0i   3.0-3.0i   1.0-1.0i )
!                                      (  6.0+1.0i   1.0+1.0i   0.0+2.0i   0.0+0.0i )
!
!                                  B = ( -10.0-5.0i   9.5+5.5i   12.0-12.0i   0.0+8.0i )
!
      DATA A/(0.0,0.0), (-2.0,-3.0), (6.0,1.0), (4.0,0.0), (-0.5,3.0),&
          (1.0,1.0), (-2.0,2.0), (3.0,-3.0), (0.0,2.0), (-4.0,-1.0),&
          (1.0,-1.0), (0.0,0.0)/
      DATA B/(-10.0,-5.0), (9.5,5.5), (12.0,-12.0), (0.0,8.0)/
!                                  Solve A*X = B
      CALL LSACB (A, NLCA, NUCA, B, X)
!                                  Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END

Output

                                       X
              1                2                3                4
( 3.000, 0.000)  (-1.000, 1.000)  ( 3.000, 0.000)  (-1.000, 1.000)
LSLCB
Solves a complex system of linear equations in band storage mode without iterative refinement.
Required Arguments
A — Complex NLCA + NUCA + 1 by N array containing the N by N banded coefficient matrix in band storage
mode. (Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, then B and X may share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AHX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLCB (A, NLCA, NUCA, B, X [, …])
Specific:
The specific interface names are S_LSLCB and D_LSLCB.
FORTRAN 77 Interface
Single:
CALL LSLCB (N, A, LDA, NLCA, NUCA, B, IPATH, X)
Double:
The double precision name is DLSLCB.
Description
Routine LSLCB solves a system of linear algebraic equations having a complex banded coefficient matrix. It
first uses the routine LFCCB to compute an LU factorization of the coefficient matrix and to estimate the condition number of the matrix. The solution of the linear system is then found using LFSCB.
LSLCB fails if U, the upper triangular part of the factorization, has a zero diagonal element. This occurs only
if A is singular or very close to a singular matrix.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that LSACB be used.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LCB/DL2LCB. The reference is:
   CALL L2LCB (N, A, LDA, NLCA, NUCA, B, IPATH, X, FACT, IPVT, WK)
   The additional arguments are as follows:
   FACT — (2 * NLCA + NUCA + 1) × N complex work array containing the LU factorization of A on output.
   If A is not needed, A can share the first (NLCA + NUCA + 1) * N locations with FACT.
   IPVT — Integer work vector of length N containing the pivoting information for the LU factorization of A on output.
   WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
   3      3      The input matrix is too ill-conditioned. The solution might not be accurate.
   4      2      The input matrix is singular.
3. Integer Options with Chapter 11 Options Manager
   16   This option uses four values to solve memory bank conflict (access inefficiency) problems. In routine
        L2LCB the leading dimension of FACT is increased by IVAL(3) when N is a multiple of IVAL(4). The values
        IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLCB. Additional
        memory allocation for FACT and option value restoration are done automatically in LSLCB. Users directly
        calling L2LCB can allocate additional space for FACT and set IVAL(3) and IVAL(4) so that memory bank
        conflicts no longer cause inefficiencies. There is no requirement that users change existing applications
        that use LSLCB or L2LCB. Default values for the option are IVAL(*) = 1, 16, 0, 1.
   17   This option has two values that determine if the L1 condition number is to be computed. Routine LSLCB
        temporarily replaces IVAL(2) by IVAL(1). The routine L2CCB computes the condition number if IVAL(2) = 2.
        Otherwise L2CCB skips this computation. LSLCB restores the option. Default values for the option are
        IVAL(*) = 1, 2.
Example
A system of four linear equations is solved. The coefficient matrix has complex banded form with one upper
and one lower codiagonal. The right-hand-side vector b has four elements.
      USE LSLCB_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, N, NLCA, NUCA
      PARAMETER  (LDA=3, N=4, NLCA=1, NUCA=1)
      COMPLEX    A(LDA,N), B(N), X(N)
!
!                                  Set values for A in band form, and B
!
!                                  A = (  0.0+0.0i   4.0+0.0i  -2.0+2.0i  -4.0-1.0i )
!                                      ( -2.0-3.0i  -0.5+3.0i   3.0-3.0i   1.0-1.0i )
!                                      (  6.0+1.0i   1.0+1.0i   0.0+2.0i   0.0+0.0i )
!
!                                  B = ( -10.0-5.0i   9.5+5.5i   12.0-12.0i   0.0+8.0i )
!
      DATA A/(0.0,0.0), (-2.0,-3.0), (6.0,1.0), (4.0,0.0), (-0.5,3.0),&
          (1.0,1.0), (-2.0,2.0), (3.0,-3.0), (0.0,2.0), (-4.0,-1.0),&
          (1.0,-1.0), (0.0,0.0)/
      DATA B/(-10.0,-5.0), (9.5,5.5), (12.0,-12.0), (0.0,8.0)/
!                                  Solve A*X = B
      CALL LSLCB (A, NLCA, NUCA, B, X)
!                                  Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END

Output

                                       X
              1                2                3                4
( 3.000, 0.000)  (-1.000, 1.000)  ( 3.000, 0.000)  (-1.000, 1.000)
LFCCB
Computes the LU factorization of a complex matrix in band storage mode and estimates its L1 condition
number.
Required Arguments
A — Complex NLCA + NUCA + 1 by N array containing the N by N matrix in band storage mode to be factored. (Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
FACT — Complex 2 * NLCA + NUCA + 1 by N array containing the LU factorization of the matrix A. (Output)
If A is not needed, A can share the first (NLCA + NUCA + 1) * N locations with FACT .
IPVT — Vector of length N containing the pivoting information for the LU factorization. (Output)
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCCB (A, NLCA, NUCA, FACT, IPVT, RCOND [, …])
Specific:
The specific interface names are S_LFCCB and D_LFCCB.
FORTRAN 77 Interface
Single:
CALL LFCCB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, RCOND)
Double:
The double precision name is DLFCCB.
Description
Routine LFCCB performs an LU factorization of a complex banded coefficient matrix. It also estimates the
condition number of the matrix. The LU factorization is done using scaled partial pivoting. Scaled partial
pivoting differs from partial pivoting in that the pivoting strategy is the same as if each row were scaled to
have the same ∞-norm.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥1∥A-1∥1. Since it is expensive to compute
∥A-1∥1, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK
and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCCB fails if U, the upper triangular part of the factorization, has a zero diagonal element. This can occur
only if A is singular or very close to a singular matrix.
The LU factors are returned in a form that is compatible with IMSL routines LFICB, LFSCB and LFDCB. To
solve systems of equations with multiple right-hand-side vectors, use LFCCB followed by either LFICB or
LFSCB called once for each right-hand side. The routine LFDCB can be called to compute the determinant of
the coefficient matrix after LFCCB has performed the factorization.
Let F be the matrix FACT, let ml = NLCA and let mu = NUCA. The first ml + mu + 1 rows of F contain the triangular matrix U in band storage form. The lower ml rows of F contain the multipliers needed to reconstruct L.
LFCCB is based on the LINPACK routine CGBCO; see Dongarra et al. (1979). CGBCO uses unscaled partial
pivoting.
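The general band storage mode used by the ...CB routines can be illustrated with a short packing routine. The sketch below is not part of the library; the mapping is inferred from the band-form pictures in the examples, and the names PACKCB, AFULL, and ABAND are hypothetical.

!     Illustrative sketch only: packing a full N by N complex matrix AFULL
!     with NLCA lower and NUCA upper codiagonals into the NLCA+NUCA+1 by N
!     band storage array ABAND used by LFCCB and the other ...CB routines.
!     The mapping ABAND(NUCA+1+I-J, J) = AFULL(I,J) is inferred from the
!     band-form pictures in the examples; positions outside the band are
!     not referenced and are set to zero here.
      SUBROUTINE PACKCB (N, NLCA, NUCA, AFULL, ABAND)
      INTEGER  N, NLCA, NUCA, I, J
      COMPLEX  AFULL(N,N), ABAND(NLCA+NUCA+1,N)
      ABAND = (0.0, 0.0)
      DO J = 1, N
         DO I = MAX(1, J-NUCA), MIN(N, J+NLCA)
            ABAND(NUCA+1+I-J, J) = AFULL(I, J)
         END DO
      END DO
      END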
Comments
1. Workspace may be explicitly provided, if desired, by use of L2CCB/DL2CCB. The reference is:
   CALL L2CCB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, RCOND, WK)
   The additional argument is:
   WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is algorithmically singular.
   4      2      The input matrix is singular.
Example
The inverse of a 4 × 4 band matrix with one upper and one lower codiagonal is computed. LFCCB is called to
factor the matrix and to check for singularity or ill-conditioning. LFICB is called to determine the columns of
the inverse.
      USE LFCCB_INT
      USE UMACH_INT
      USE LFICB_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       RCOND
      COMPLEX    A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N), RES(N)
!
!                                  Set values for A in band form
!
!                                  A = (  0.0+0.0i   4.0+0.0i  -2.0+2.0i  -4.0-1.0i )
!                                      (  0.0-3.0i  -0.5+3.0i   3.0-3.0i   1.0-1.0i )
!                                      (  6.0+1.0i   4.0+1.0i   0.0+2.0i   0.0+0.0i )
!
      DATA A/(0.0,0.0), (0.0,-3.0), (6.0,1.0), (4.0,0.0), (-0.5,3.0),&
          (4.0,1.0), (-2.0,2.0), (3.0,-3.0), (0.0,2.0), (-4.0,-1.0),&
          (1.0,-1.0), (0.0,0.0)/
!
      CALL LFCCB (A, NLCA, NUCA, FACT, IPVT, RCOND)
!                                  Print the reciprocal condition number
!                                  and the L1 condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = (0.0E0,0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0,0.0E0)
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFICB
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFICB (A, NLCA, NUCA, FACT, IPVT, RJ, AINV(:,J), RES)
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
!                                  Print results
      CALL WRCRN ('AINV', AINV)
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 condition number = ',F6.3)
      END
Output

RCOND = 0.022
L1 condition number = 45.933

                                      AINV
                  1                2                3                4
1   ( 0.562, 0.170)  ( 0.125, 0.260)  (-0.385,-0.135)  (-0.239,-1.165)
2   ( 0.122, 0.421)  (-0.195, 0.094)  ( 0.101,-0.289)  ( 0.874,-0.179)
3   ( 0.034, 0.904)  (-0.437, 0.090)  (-0.153,-0.527)  ( 1.087,-1.172)
4   ( 0.938, 0.870)  (-0.347, 0.527)  (-0.679,-0.374)  ( 0.415,-1.759)
LFTCB
Computes the LU factorization of a complex matrix in band storage mode.
Required Arguments
A — Complex NLCA + NUCA + 1 by N array containing the N by N matrix in band storage mode to be factored. (Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
FACT — Complex 2 * NLCA + NUCA + 1 by N array containing the LU factorization of the matrix A. (Output)
If A is not needed, A can share the first (NLCA + NUCA + 1) * N locations with FACT.
IPVT — Integer vector of length N containing the pivoting information for the LU factorization. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTCB (A, NLCA, NUCA, FACT, IPVT [, …])
Specific:
The specific interface names are S_LFTCB and D_LFTCB.
FORTRAN 77 Interface
Single:
CALL LFTCB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT)
Double:
The double precision name is DLFTCB.
Description
Routine LFTCB performs an LU factorization of a complex banded coefficient matrix. The LU factorization is
done using scaled partial pivoting. Scaled partial pivoting differs from partial pivoting in that the pivoting
strategy is the same as if each row were scaled to have the same ∞-norm.
LFTCB fails if U, the upper triangular part of the factorization, has a zero diagonal element. This can occur
only if A is singular or very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFICB, LFSCB and LFDCB. To solve
systems of equations with multiple right-hand-side vectors, use LFTCB followed by either LFICB or LFSCB
called once for each right-hand side. The routine LFDCB can be called to compute the determinant of the coefficient matrix after LFTCB has performed the factorization.
Let F be the matrix FACT, let ml = NLCA and let mu = NUCA. The first ml + mu + 1 rows of F contain the triangular matrix U in band storage form. The lower ml rows of F contain the multipliers needed to reconstruct L-1.
LFTCB is based on the LINPACK routine CGBFA; see Dongarra et al. (1979). CGBFA uses unscaled partial
pivoting.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2TCB/DL2TCB. The reference is:
   CALL L2TCB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, WK)
   The additional argument is:
   WK — Complex work vector of length N used for scaling.
2. Informational error
   Type   Code   Description
   4      2      The input matrix is singular.
Example
A linear system with multiple right-hand sides is solved. LFTCB is called to factor the coefficient matrix.
LFSCB is called to compute the two solutions for the two right-hand sides. In this case the coefficient matrix
is assumed to be well-conditioned and correctly scaled. Otherwise, it would be better to call LFCCB to perform the factorization, and LFICB to compute the solutions.
      USE LFTCB_INT
      USE LFSCB_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      COMPLEX    A(LDA,N), B(N,2), FACT(LDFACT,N), X(N,2)
!
!                                  Set values for A in band form, and B
!
!                                  A = (  0.0+0.0i   4.0+0.0i  -2.0+2.0i  -4.0-1.0i )
!                                      (  0.0-3.0i  -0.5+3.0i   3.0-3.0i   1.0-1.0i )
!                                      (  6.0+1.0i   4.0+1.0i   0.0+2.0i   0.0+0.0i )
!
!                                  B = ( -4.0-5.0i   16.0-4.0i  )
!                                      (  9.5+5.5i   -9.5+19.5i )
!                                      (  9.0-9.0i   12.0+12.0i )
!                                      (  0.0+8.0i   -8.0-2.0i  )
!
      DATA A/(0.0,0.0), (0.0,-3.0), (6.0,1.0), (4.0,0.0), (-0.5,3.0),&
          (4.0,1.0), (-2.0,2.0), (3.0,-3.0), (0.0,2.0), (-4.0,-1.0),&
          (1.0,-1.0), (0.0,0.0)/
      DATA B/(-4.0,-5.0), (9.5,5.5), (9.0,-9.0), (0.0,8.0),&
          (16.0,-4.0), (-9.5,19.5), (12.0,12.0), (-8.0,-2.0)/
!
      CALL LFTCB (A, NLCA, NUCA, FACT, IPVT)
!                                  Solve for the two right-hand sides
      DO 10 J=1, 2
         CALL LFSCB (FACT, NLCA, NUCA, IPVT, B(:,J), X(:,J))
   10 CONTINUE
!                                  Print results
      CALL WRCRN ('X', X)
      END
Output

                      X
                  1                2
1   ( 3.000, 0.000)  ( 0.000, 4.000)
2   (-1.000, 1.000)  ( 1.000,-1.000)
3   ( 3.000, 0.000)  ( 0.000, 4.000)
4   (-1.000, 1.000)  ( 1.000,-1.000)
LFSCB
Solves a complex system of linear equations given the LU factorization of the coefficient matrix in band storage mode.
Required Arguments
FACT — Complex 2 * NLCA + NUCA + 1 by N array containing the LU factorization of the coefficient matrix
A as output from subroutine LFCCB/DLFCCB or LFTCB/DLFTCB. (Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
subroutine LFCCB/DLFCCB or LFTCB/DLFTCB. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system AHX = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFSCB (FACT, NLCA, NUCA, IPVT, B, X [, …])
Specific:
The specific interface names are S_LFSCB and D_LFSCB.
FORTRAN 77 Interface
Single:
CALL LFSCB (N, FACT, LDFACT, NLCA, NUCA, IPVT, B, IPATH, X)
Double:
The double precision name is DLFSCB.
Description
Routine LFSCB computes the solution of a system of linear algebraic equations having a complex banded
coefficient matrix. To compute the solution, the coefficient matrix must first undergo an LU factorization.
This may be done by calling either LFCCB or LFTCB. The solution to Ax = b is found by solving the banded
triangular systems Ly = b and Ux = y. The forward elimination step consists of solving the system Ly = b by
applying the same permutations and elimination operations to b that were applied to the columns of A in the
factorization routine. The backward substitution step consists of solving the banded triangular system Ux = y
for x.
LFSCB and LFICB both solve a linear system given its LU factorization. LFICB generally takes more time
and produces a more accurate answer than LFSCB. Each iteration of the iterative refinement algorithm used
by LFICB calls LFSCB.
LFSCB is based on the LINPACK routine CGBSL; see Dongarra et al. (1979).
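As a minimal sketch of reusing one factorization for both path options (hypothetical 2 × 2 data, not taken from the manual), the following fragment factors once with LFTCB and then calls LFSCB with the default IPATH = 1 for Ax = b and with IPATH = 2 for the conjugate transpose system.

!     Minimal usage sketch with hypothetical data: one LFTCB factorization of
!     a 2 x 2 complex band matrix reused by LFSCB for A*x = b (IPATH=1, the
!     default) and for ctrans(A)*y = b (IPATH=2).
      USE LFTCB_INT
      USE LFSCB_INT
      INTEGER    LDA, LDFACT, N, NLCA, NUCA
      PARAMETER  (LDA=3, LDFACT=4, N=2, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      COMPLEX    A(LDA,N), FACT(LDFACT,N), B(N), X(N), Y(N)
!                                  A = ( 2.0+0.0i  0.0+1.0i )
!                                      ( 1.0+0.0i  3.0+0.0i )  in band storage mode
      DATA A/(0.0,0.0), (2.0,0.0), (1.0,0.0), (0.0,1.0), (3.0,0.0),&
          (0.0,0.0)/
      DATA B/(2.0,1.0), (3.0,0.0)/
!                                  Factor once with LFTCB
      CALL LFTCB (A, NLCA, NUCA, FACT, IPVT)
!                                  Solve A*x = b
      CALL LFSCB (FACT, NLCA, NUCA, IPVT, B, X)
!                                  Solve ctrans(A)*y = b with the same factors
      CALL LFSCB (FACT, NLCA, NUCA, IPVT, B, Y, IPATH=2)
      PRINT *, X
      PRINT *, Y
      END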
Example
The inverse is computed for a complex banded 4 × 4 matrix with one upper and one lower codiagonal. The input
matrix is assumed to be well-conditioned; hence LFTCB is used rather than LFCCB.
      USE LFSCB_INT
      USE LFTCB_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      COMPLEX    A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N)
!
!                                  Set values for A in band form
!
!                                  A = (  0.0+0.0i   4.0+0.0i  -2.0+2.0i  -4.0-1.0i )
!                                      ( -2.0-3.0i  -0.5+3.0i   3.0-3.0i   1.0-1.0i )
!                                      (  6.0+1.0i   1.0+1.0i   0.0+2.0i   0.0+0.0i )
!
      DATA A/(0.0,0.0), (-2.0,-3.0), (6.0,1.0), (4.0,0.0), (-0.5,3.0),&
          (1.0,1.0), (-2.0,2.0), (3.0,-3.0), (0.0,2.0), (-4.0,-1.0),&
          (1.0,-1.0), (0.0,0.0)/
!
      CALL LFTCB (A, NLCA, NUCA, FACT, IPVT)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = (0.0E0,0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0,0.0E0)
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFSCB
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFSCB (FACT, NLCA, NUCA, IPVT, RJ, AINV(:,J))
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
!                                  Print results
      CALL WRCRN ('AINV', AINV)
      END
Output

                                      AINV
                  1                2                3                4
1   ( 0.165,-0.341)  ( 0.376,-0.094)  (-0.282, 0.471)  (-1.600, 0.000)
2   ( 0.588,-0.047)  ( 0.259, 0.235)  (-0.494, 0.024)  (-0.800,-1.200)
3   ( 0.318, 0.271)  ( 0.012, 0.247)  (-0.759,-0.235)  (-0.550,-2.250)
4   ( 0.588,-0.047)  ( 0.259, 0.235)  (-0.994, 0.524)  (-2.300,-1.200)
LFICB
Uses iterative refinement to improve the solution of a complex system of linear equations in band storage
mode.
Required Arguments
A — Complex NLCA + NUCA + 1 by N array containing the N by N coefficient matrix in band storage mode.
(Input)
NLCA — Number of lower codiagonals of A. (Input)
NUCA — Number of upper codiagonals of A. (Input)
FACT — Complex 2 * NLCA + NUCA + 1 by N array containing the LU factorization of the matrix A as output from routine LFCCB/DLFCCB or LFTCB/DLFTCB. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization of A as output from
routine LFCCB/DLFCCB or LFTCB/DLFTCB. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution. (Output)
RES — Complex vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system A^H X = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFICB (A, NLCA, NUCA, FACT, IPVT, B, X, RES [, …])
Specific:
The specific interface names are S_LFICB and D_LFICB.
FORTRAN 77 Interface
Single:
CALL LFICB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, B, IPATH, X, RES)
Double:
The double precision name is DLFICB.
LFICB
Chapter 1: Linear Systems
390
Description
Routine LFICB computes the solution of a system of linear algebraic equations having a complex banded
coefficient matrix. Iterative refinement is performed on the solution vector to improve the accuracy. Usually
almost all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an LU factorization. This may be done by
calling either LFCCB or LFTCB.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFICB and LFSCB both solve a linear system given its LU factorization. LFICB generally takes more time
and produces a more accurate answer than LFSCB. Each iteration of the iterative refinement algorithm used
by LFICB calls LFSCB.
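A minimal sketch of this factor-then-refine sequence, assuming the declarations of the example below, is:

      CALL LFCCB (A, NLCA, NUCA, FACT, IPVT, RCOND)
!                                  1.0E0/RCOND estimates the L1 condition number
      CALL LFICB (A, NLCA, NUCA, FACT, IPVT, B, X, RES)
!                                  RES returns the residual at the refined solution X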
Comments
Informational error
Type   Code   Description
3      3      The input matrix is too ill-conditioned for iterative refinement to be effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving the system each of the first two times by adding (1 + i)/2 to the second element.
USE LFICB_INT
USE LFCCB_INT
USE WRCRN_INT
USE UMACH_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       RCOND
      COMPLEX    A(LDA,N), B(N), FACT(LDFACT,N), RES(N), X(N)
!                                  Set values for A in band form, and B
!
!                                  A = (  0.0+0.0i  4.0+0.0i -2.0+2.0i -4.0-1.0i )
!                                      ( -2.0-3.0i -0.5+3.0i  3.0-3.0i  1.0-1.0i )
!                                      (  6.0+1.0i  1.0+1.0i  0.0+2.0i  0.0+0.0i )
!
!                                  B = ( -10.0-5.0i  9.5+5.5i  12.0-12.0i  0.0+8.0i )
!
      DATA A/(0.0,0.0), (-2.0,-3.0), (6.0,1.0), (4.0,0.0), (-0.5,3.0),&
           (1.0,1.0), (-2.0,2.0), (3.0,-3.0), (0.0,2.0), (-4.0,-1.0),&
           (1.0,-1.0), (0.0,0.0)/
      DATA B/(-10.0,-5.0), (9.5,5.5), (12.0,-12.0), (0.0,8.0)/
!
      CALL LFCCB (A, NLCA, NUCA, FACT, IPVT, RCOND)
!                                  Print the reciprocal condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
!                                  Solve the three systems
      DO 10 J=1, 3
         CALL LFICB (A, NLCA, NUCA, FACT, IPVT, B, X, RES)
!                                  Print results
         WRITE (NOUT, 99999) J
         CALL WRCRN ('X', X, 1, N, 1)
         CALL WRCRN ('RES', RES, 1, N, 1)
!                                  Perturb B by adding 0.5+0.5i to B(2)
         B(2) = B(2) + (0.5E0,0.5E0)
   10 CONTINUE
!
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
99999 FORMAT (//,' For system ',I1)
      END
Output

RCOND = 0.014
L1 Condition number = 72.414

For system 1
                            X
              1                2                3                4
( 3.000, 0.000)  (-1.000, 1.000)  ( 3.000, 0.000)  (-1.000, 1.000)

                               RES
                      1                        2                        3                        4
( 0.000E+00, 0.000E+00)  ( 0.000E+00, 0.000E+00)  ( 0.000E+00, 5.684E-14)  ( 3.494E-22,-6.698E-22)

For system 2
                            X
              1                2                3                4
( 3.235, 0.141)  (-0.988, 1.247)  ( 2.882, 0.129)  (-0.988, 1.247)

                               RES
                      1                        2                        3                        4
(-1.402E-08, 6.486E-09)  (-7.012E-10, 4.488E-08)  (-1.122E-07, 7.188E-09)  (-7.012E-10, 4.488E-08)

For system 3
                            X
              1                2                3                4
( 3.471, 0.282)  (-0.976, 1.494)  ( 2.765, 0.259)  (-0.976, 1.494)

                               RES
                      1                        2                        3                        4
(-2.805E-08, 1.297E-08)  (-1.402E-09,-2.945E-08)  ( 1.402E-08, 1.438E-08)  (-1.402E-09,-2.945E-08)
LFDCB
Computes the determinant of a complex matrix given the LU factorization of the matrix in band storage
mode.
Required Arguments
FACT — Complex (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the matrix A as output from routine LFTCB/DLFTCB or LFCCB/DLFCCB. (Input)
NLCA — Number of lower codiagonals in matrix A. (Input)
NUCA — Number of upper codiagonals in matrix A. (Input)
IPVT — Vector of length N containing the pivoting information for the LU factorization as output from
routine LFTCB/DLFTCB or LFCCB/DLFCCB. (Input)
DET1 — Complex scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ ∣DET1 ∣ < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10^DET2.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDCB (FACT, NLCA, NUCA, IPVT, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDCB and D_LFDCB.
FORTRAN 77 Interface
Single:
CALL LFDCB (N, FACT, LDFACT, NLCA, NUCA, IPVT, DET1, DET2)
Double:
The double precision name is DLFDCB.
Description
Routine LFDCB computes the determinant of a complex banded coefficient matrix. To compute the determinant, the coefficient matrix must first undergo an LU factorization. This may be done by calling either LFCCB
or LFTCB. The formula det A = det L det U is used to compute the determinant. Since the determinant of a
triangular matrix is the product of the diagonal elements,
LFDCB
Chapter 1: Linear Systems
394
det U = U_11 U_22 … U_NN

(The matrix U is stored in the upper NUCA + NLCA + 1 rows of FACT as a banded matrix.) Since L is the product of triangular matrices with unit diagonals and of permutation matrices, det L = (-1)^k, where k is the
number of pivoting interchanges.
LFDCB is based on the LINPACK routine CGBDI; see Dongarra et al. (1979).
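If the determinant is wanted as a single number, the two-part result can be recombined as sketched below (an illustrative fragment, not part of the original example; DETA is a hypothetical name). The mantissa/exponent form is returned because the product itself can overflow or underflow for large matrices.

!                                  Sketch only: DET1 (complex) and DET2 (real) as
!                                  returned by LFDCB; DETA is a hypothetical variable
      COMPLEX DETA
      DETA = DET1*(10.0**DET2)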
Example
The determinant is computed for a complex banded 4 × 4 matrix with one upper and one lower codiagonal.
USE LFDCB_INT
USE LFTCB_INT
USE UMACH_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       DET2
      COMPLEX    A(LDA,N), DET1, FACT(LDFACT,N)
!                                  Set values for A in band form
!
!                                  A = (  0.0+0.0i  4.0+0.0i -2.0+2.0i -4.0-1.0i )
!                                      ( -2.0-3.0i -0.5+3.0i  3.0-3.0i  1.0-1.0i )
!                                      (  6.0+1.0i  1.0+1.0i  0.0+2.0i  0.0+0.0i )
!
      DATA A/(0.0,0.0), (-2.0,-3.0), (6.0,1.0), (4.0,0.0), (-0.5,3.0),&
           (1.0,1.0), (-2.0,2.0), (3.0,-3.0), (0.0,2.0), (-4.0,-1.0),&
           (1.0,-1.0), (0.0,0.0)/
!
      CALL LFTCB (A, NLCA, NUCA, FACT, IPVT)
!                                  Compute the determinant
      CALL LFDCB (FACT, NLCA, NUCA, IPVT, DET1, DET2)
!                                  Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is (', F6.3, ',', F6.3, ') * 10**',&
            F2.0)
      END
Output
The determinant of A is ( 2.500,-1.500) * 10**1.
LSAQH
Solves a complex Hermitian positive definite system of linear equations in band Hermitian storage mode
with iterative refinement.
Required Arguments
A — Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian coefficient
matrix in band Hermitian storage mode. (Input)
NCODA — Number of upper or lower codiagonals of A. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSAQH (A, NCODA, B, X [, …])
Specific:
The specific interface names are S_LSAQH and D_LSAQH.
FORTRAN 77 Interface
Single:
CALL LSAQH (N, A, LDA, NCODA, B, X)
Double:
The double precision name is DLSAQH.
Description
Routine LSAQH solves a system of linear algebraic equations having a complex Hermitian positive definite
band coefficient matrix. It first uses the IMSL routine LFCQH to compute an R^H R Cholesky factorization of
the coefficient matrix and to estimate the condition number of the matrix. R is an upper triangular band
matrix. The solution of the linear system is then found using the iterative refinement IMSL routine LFIQH.
LSAQH fails if any submatrix of R is not positive definite, if R has a zero diagonal element, or if the iterative
refinement algorithm fails to converge. These errors occur only if the matrix A either is very close to a singular
matrix or is a matrix that is not positive definite.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system. LSAQH solves the problem that is represented in
the computer; however, this problem may differ from the problem whose solution is desired.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2AQH/DL2AQH. The reference is:
CALL L2AQH (N, A, LDA, NCODA, B, X, FACT, WK)
The additional arguments are as follows:
FACT — Complex work vector of length (NCODA + 1) * N containing the R^H R factorization of A in
band Hermitian storage form on output.
WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
   3      3      The input matrix is too ill-conditioned. The solution might not be accurate.
   3      4      The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
   4      2      The input matrix is not positive definite.
   4      4      The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
3. Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2AQH the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSAQH. Additional memory allocation for FACT and option value restoration are
done automatically in LSAQH. Users directly calling L2AQH can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSAQH or L2AQH.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine LSAQH temporarily replaces IVAL(2) by IVAL(1). The routine L2CQH computes the
condition number if IVAL(2) = 2. Otherwise L2CQH skips this computation. LSAQH restores the
option. Default values for the option are IVAL(*) = 1, 2.
Example
A system of five linear equations is solved. The coefficient matrix has complex Hermitian positive definite
band form with one codiagonal and the right-hand-side vector b has five elements.
USE LSAQH_INT
USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, N, NCODA
      PARAMETER  (LDA=2, N=5, NCODA=1)
      COMPLEX    A(LDA,N), B(N), X(N)
!                                  Set values for A in band Hermitian form, and B
!
!                                  A = ( 0.0+0.0i -1.0+1.0i  1.0+2.0i  0.0+4.0i  1.0+1.0i )
!                                      ( 2.0+0.0i  4.0+0.0i 10.0+0.0i  6.0+0.0i  9.0+0.0i )
!
!                                  B = ( 1.0+5.0i  12.0-6.0i  1.0-16.0i  -3.0-3.0i  25.0+16.0i )
!
      DATA A/(0.0,0.0), (2.0,0.0), (-1.0,1.0), (4.0, 0.0), (1.0,2.0),&
           (10.0,0.0), (0.0,4.0), (6.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B/(1.0,5.0), (12.0,-6.0), (1.0,-16.0), (-3.0,-3.0),&
           (25.0,16.0)/
!                                  Solve A*X = B
      CALL LSAQH (A, NCODA, B, X)
!                                  Print results
      CALL WRCRN ('X', X, 1, N, 1)
!
      END
Output
                                  X
              1                2                3                4                5
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)  ( 3.000, 2.000)
LSLQH
Solves a complex Hermitian positive definite system of linear equations in band Hermitian storage mode
without iterative refinement.
Required Arguments
A — Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian coefficient
matrix in band Hermitian storage mode. (Input)
NCODA — Number of upper or lower codiagonals of A. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
CALL LSLQH (A, NCODA, B, X [, …])
Specific:
The specific interface names are S_LSLQH and D_LSLQH.
FORTRAN 77 Interface
Single:
CALL LSLQH (N, A, LDA, NCODA, B, X)
Double:
The double precision name is DLSLQH.
Description
Routine LSLQH solves a system of linear algebraic equations having a complex Hermitian positive definite
band coefficient matrix. It first uses the routine LFCQH to compute an R^H R Cholesky factorization of the coefficient matrix and to estimate the condition number of the matrix. R is an upper triangular band matrix. The
solution of the linear system is then found using the routine LFSQH.
LSLQH fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A either is very close to a singular matrix or is a matrix that is not positive definite.
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that LSAQH be used.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LQH/DL2LQH. The reference is:
CALL L2LQH (N, A, LDA, NCODA, B, X, FACT, WK)
The additional arguments are as follows:
FACT — (NCODA + 1) × N complex work array containing the R^H R factorization of A in band Hermitian storage form on output. If A is not needed, A and FACT can share the same storage
locations.
WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
   3      3      The input matrix is too ill-conditioned. The solution might not be accurate.
   3      4      The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
   4      2      The input matrix is not positive definite.
   4      4      The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
3. Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2LQH the leading dimension of FACT is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSLQH. Additional memory allocation for FACT and option value restoration are
done automatically in LSLQH. Users directly calling L2LQH can allocate additional space for
FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies.
There is no requirement that users change existing applications that use LSLQH or L2LQH.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSLQH temporarily replaces IVAL(2) by IVAL(1). The routine L2CQH computes the condition
number if IVAL(2) = 2. Otherwise L2CQH skips this computation. LSLQH restores the option.
Default values for the option are IVAL(*) = 1, 2.
Example
A system of five linear equations is solved. The coefficient matrix has complex Hermitian positive definite
band form with one codiagonal and the right-hand-side vector b has five elements.
USE LSLQH_INT
USE WRCRN_INT
!                                  Declare variables
      INTEGER    N, NCODA, LDA
      PARAMETER  (N=5, NCODA=1, LDA=NCODA+1)
      COMPLEX    A(LDA,N), B(N), X(N)
!                                  Set values for A in band Hermitian form, and B
!
!                                  A = ( 0.0+0.0i -1.0+1.0i  1.0+2.0i  0.0+4.0i  1.0+1.0i )
!                                      ( 2.0+0.0i  4.0+0.0i 10.0+0.0i  6.0+0.0i  9.0+0.0i )
!
!                                  B = ( 1.0+5.0i  12.0-6.0i  1.0-16.0i  -3.0-3.0i  25.0+16.0i )
!
      DATA A/(0.0,0.0), (2.0,0.0), (-1.0,1.0), (4.0, 0.0), (1.0,2.0),&
           (10.0,0.0), (0.0,4.0), (6.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B/(1.0,5.0), (12.0,-6.0), (1.0,-16.0), (-3.0,-3.0),&
           (25.0,16.0)/
!                                  Solve A*X = B
      CALL LSLQH (A, NCODA, B, X)
!                                  Print results
      CALL WRCRN ('X', X, 1, N, 1)
!
      END
Output
                                  X
              1                2                3                4                5
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)  ( 3.000, 2.000)
LSLQB
Computes the R^H DR Cholesky factorization of a complex Hermitian positive-definite matrix A in codiagonal
band Hermitian storage mode. Solves a system Ax = b.
Required Arguments
A — Array containing the N by N positive-definite band coefficient matrix and the right hand side in codiagonal band Hermitian storage mode. (Input/Output)
The number of array columns must be at least 2 * NCODA + 3. The number of columns is not an input
to this subprogram.
NCODA — Number of upper codiagonals of matrix A. (Input)
Must satisfy NCODA ≥ 0 and NCODA < N.
U — Array of flags that indicate any singularities of A, namely loss of positive-definiteness of a leading
minor. (Output)
A value U(I) = 0. means that the leading minor of dimension I is not positive-definite. Otherwise,
U(I) = 1.
Optional Arguments
N — Order of the matrix. (Input)
Must satisfy N > 0.
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Must satisfy LDA ≥ N + NCODA.
Default: LDA = size (A,1).
IJOB — Flag to direct the desired factorization or solving step. (Input)
Default: IJOB = 1.
IJOB    Meaning
1       Factor the matrix A and solve the system Ax = b, where the real part of b is stored in column
        2 * NCODA + 2 and the imaginary part of b is stored in column 2 * NCODA + 3 of array A. The real
        and imaginary parts of b are overwritten by the real and imaginary parts of x.
2       Solve step only. Use the real part of b as column 2 * NCODA + 2 and the imaginary part of b as
        column 2 * NCODA + 3 of A. (The factorization step has already been done.) The real and imaginary
        parts of b are overwritten by the real and imaginary parts of x.
3       Factor the matrix A but do not solve a system.
4,5,6   Same meaning as with the value IJOB - 3. For efficiency, no error checking is done on values
        LDA, N, NCODA, and U(*).
FORTRAN 90 Interface
Generic:
CALL LSLQB (A, NCODA, U [, …])
Specific:
The specific interface names are S_LSLQB and D_LSLQB.
LSLQB
Chapter 1: Linear Systems
402
FORTRAN 77 Interface
Single:
CALL LSLQB (N, A, LDA, NCODA, IJOB, U)
Double:
The double precision name is DLSLQB.
Description
Routine LSLQB factors and solves the Hermitian positive definite banded linear system Ax = b. The matrix is
factored so that A = R^H DR, where R is unit upper triangular and D is diagonal and real. The reciprocals of
the diagonal entries of D are computed and saved to make the solving step more efficient. Errors will occur if
D has a nonpositive diagonal element. Such events occur only if A is very close to a singular matrix or is not
positive definite.
LSLQB is efficient for problems with a small band width. The particular cases NCODA = 0, 1 are done with
special loops within the code. These cases will give good performance. See Hanson (1989) for more on the
algorithm. When solving tridiagonal systems, NCODA = 1, the cyclic reduction code LSLCQ should be considered as an alternative. The expectation is that LSLCQ will outperform LSLQB on vector or parallel computers.
It may be inferior on scalar computers or even parallel computers with non-optimizing compilers.
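The IJOB argument lets one factorization serve several right-hand sides. The hedged fragment below factors once and then performs a solve-only step after a new b has been copied into columns 2 * NCODA + 2 and 2 * NCODA + 3 of A; the keyword form IJOB= assumes the optional-argument conventions used by the other Fortran 90 examples in this chapter.

!                                  Sketch only: A, NCODA, U as in the example below
      CALL LSLQB (A, NCODA, U, IJOB=3)
!                                  Copy a new right-hand side into columns
!                                  2*NCODA+2 (real part) and 2*NCODA+3 (imaginary part)
      CALL LSLQB (A, NCODA, U, IJOB=2)
!                                  Solve step only; x overwrites b in those columns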
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LQB/DL2LQB. The reference is:
CALL L2LQB (N, A, LDA, NCODA, IJOB, U, WK1, WK2)
The additional arguments are as follows:
WK1 — Work vector of length NCODA.
WK2 — Work vector of length NCODA.
2. Informational error
   Type   Code   Description
   4      2      The input matrix is not positive definite.
Example
A system of five linear equations is solved. The coefficient matrix has real positive definite codiagonal Hermitian band form and the right-hand-side vector b has five elements.
USE LSLQB_INT
USE WRRRN_INT
      INTEGER    LDA, N, NCODA
      PARAMETER  (N=5, NCODA=1, LDA=N+NCODA)
!
      INTEGER    I, IJOB, J
      REAL       A(LDA,2*NCODA+3), U(N)
!                                  Set values for A and right hand side
!                                  in codiagonal band Hermitian form:
!
!                                  (  *     *     *     *      *   )
!                                  (  2.0   *     *     1.0    5.0 )
!                            A  =  (  4.0  -1.0   1.0  12.0   -6.0 )
!                                  ( 10.0   1.0   2.0   1.0  -16.0 )
!                                  (  6.0   0.0   4.0  -3.0   -3.0 )
!                                  (  9.0   1.0   1.0  25.0   16.0 )
!
      DATA ((A(I+NCODA,J),I=1,N),J=1,2*NCODA+3)/2.0, 4.0, 10.0, 6.0,&
           9.0, 0.0, -1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 2.0, 4.0, 1.0,&
           1.0, 12.0, 1.0, -3.0, 25.0, 5.0, -6.0, -16.0, -3.0, 16.0/
!
!                                  Factor and solve A*x = b.
!
      IJOB = 1
      CALL LSLQB (A, NCODA, U)
!
!                                  Print results
!
      CALL WRRRN ('REAL(X)', A((NCODA+1):,(2*NCODA+2):), 1, N, 1)
      CALL WRRRN ('IMAG(X)', A((NCODA+1):,(2*NCODA+3):), 1, N, 1)
END
Output
            REAL(X)
     1      2      3      4      5
 2.000  3.000 -1.000  0.000  3.000

            IMAG(X)
     1      2      3      4      5
 1.000  0.000 -1.000 -2.000  2.000
LFCQH
Computes the R^H R factorization of a complex Hermitian positive definite matrix in band Hermitian storage
mode and estimates its L1 condition number.
Required Arguments
A — Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian matrix to be
factored in band Hermitian storage mode. (Input)
NCODA — Number of upper or lower codiagonals of A. (Input)
FACT — Complex NCODA + 1 by N array containing the R^H R factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
RCOND — Scalar containing an estimate of the reciprocal of the L1 condition number of A. (Output)
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFCQH (A, NCODA, FACT, RCOND [, …])
Specific:
The specific interface names are S_LFCQH and D_LFCQH.
FORTRAN 77 Interface
Single:
CALL LFCQH (N, A, LDA, NCODA, FACT, LDFACT, RCOND)
Double:
The double precision name is DLFCQH.
Description
Routine LFCQH computes an R^H R Cholesky factorization and estimates the condition number of a complex
Hermitian positive definite band coefficient matrix. R is an upper triangular band matrix.
The L1 condition number of the matrix A is defined to be κ(A) = ∥A∥_1 ∥A^-1∥_1. Since it is expensive to compute
∥A^-1∥_1, the condition number is only estimated. The estimation algorithm is the same as used by LINPACK
and is described by Cline et al. (1979).
LFCQH
Chapter 1: Linear Systems
405
If the estimated condition number is greater than 1/ɛ (where ɛ is machine precision), a warning error is
issued. This indicates that very small changes in A can cause very large changes in the solution x. Iterative
refinement can sometimes find the solution to such a system.
LFCQH fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A either is very close to a singular matrix or is a matrix which is not positive definite.
The R^H R factors are returned in a form that is compatible with routines LFIQH, LFSQH and LFDQH. To solve
systems of equations with multiple right-hand-side vectors, use LFCQH followed by either LFIQH or LFSQH
called once for each right-hand side. The routine LFDQH can be called to compute the determinant of the coefficient matrix after LFCQH has performed the factorization.
LFCQH is based on the LINPACK routine CPBCO; see Dongarra et al. (1979).
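A hedged sketch of the factor-once, solve-many pattern just described follows; NRHS and the two-dimensional arrays B and X are assumptions introduced for illustration only.

!                                  Sketch only: A, NCODA, FACT, RCOND as for LFCQH;
!                                  B(N,NRHS) and X(N,NRHS) are assumed
      CALL LFCQH (A, NCODA, FACT, RCOND)
      DO 10 J=1, NRHS
         CALL LFSQH (FACT, NCODA, B(:,J), X(:,J))
   10 CONTINUE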
Comments
1. Workspace may be explicitly provided, if desired, by use of L2CQH/DL2CQH. The reference is:
CALL L2CQH (N, A, LDA, NCODA, FACT, LDFACT, RCOND, WK)
The additional argument is:
WK — Complex work vector of length N.
2. Informational errors
   Type   Code   Description
   3      1      The input matrix is algorithmically singular.
   3      4      The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
   4      2      The input matrix is not positive definite.
   4      4      The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
Example
The inverse of a 5 × 5 band Hermitian matrix with one codiagonal is computed. LFCQH is called to factor the
matrix and to check for nonpositive definiteness or ill-conditioning. LFIQH is called to determine the columns of the inverse.
USE LFCQH_INT
USE LFIQH_INT
USE UMACH_INT
USE WRCRN_INT
!                                  Declare variables
      INTEGER    N, NCODA, LDA, LDFACT, NOUT
      PARAMETER  (N=5, NCODA=1, LDA=NCODA+1, LDFACT=LDA)
      REAL       RCOND
      COMPLEX    A(LDA,N), AINV(N,N), FACT(LDFACT,N), RES(N), RJ(N)
!                                  Set values for A in band Hermitian form
!
!                                  A = ( 0.0+0.0i -1.0+1.0i  1.0+2.0i  0.0+4.0i  1.0+1.0i )
!                                      ( 2.0+0.0i  4.0+0.0i 10.0+0.0i  6.0+0.0i  9.0+0.0i )
!
      DATA A/(0.0,0.0), (2.0,0.0), (-1.0,1.0), (4.0, 0.0), (1.0,2.0), &
           (10.0,0.0), (0.0,4.0), (6.0,0.0), (1.0,1.0), (9.0,0.0)/
!                                  Factor the matrix A
      CALL LFCQH (A, NCODA, FACT, RCOND)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = (0.0E0,0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0,0.0E0)
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFIQH
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFIQH (A, NCODA, FACT, RJ, AINV(:,J), RES)
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
!                                  Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
      CALL WRCRN ('AINV', AINV)
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
      END
Output
RCOND = 0.067
L1 Condition number = 14.961
                                             AINV
                   1                   2                   3                   4                   5
1  ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2  ( 0.2166, 0.2166)  ( 0.4332, 0.0000)  (-0.0599,-0.1198)  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3  (-0.0899, 0.0300)  (-0.0599, 0.1198)  ( 0.1797, 0.0000)  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4  (-0.0207,-0.0622)  (-0.0829,-0.0415)  ( 0.0000, 0.1244)  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5  ( 0.0092, 0.0046)  ( 0.0138,-0.0046)  (-0.0138,-0.0138)  (-0.0288, 0.0288)  ( 0.1175, 0.0000)
LFTQH
Computes the R^H R factorization of a complex Hermitian positive definite matrix in band Hermitian storage
mode.
Required Arguments
A — Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian matrix to be
factored in band Hermitian storage mode. (Input)
NCODA — Number of upper or lower codiagonals of A. (Input)
FACT — Complex NCODA + 1 by N array containing the R^H R factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFTQH (A, NCODA, FACT [, …])
Specific:
The specific interface names are S_LFTQH and D_LFTQH.
FORTRAN 77 Interface
Single:
CALL LFTQH (N, A, LDA, NCODA, FACT, LDFACT)
Double:
The double precision name is DLFTQH.
Description
Routine LFTQH computes an R^H R Cholesky factorization of a complex Hermitian positive definite band coefficient matrix. R is an upper triangular band matrix.
LFTQH fails if any submatrix of R is not positive definite or if R has a zero diagonal element. These errors
occur only if A either is very close to a singular matrix or is a matrix which is not positive definite.
The R^H R factors are returned in a form that is compatible with routines LFIQH, LFSQH and LFDQH. To solve
systems of equations with multiple right-hand-side vectors, use LFTQH followed by either LFIQH or LFSQH
called once for each right-hand side. The routine LFDQH can be called to compute the determinant of the coefficient matrix after LFTQH has performed the factorization.
LFTQH
Chapter 1: Linear Systems
408
LFTQH is based on the LINPACK routine SPBFA; see Dongarra et al. (1979).
Comments
Informational errors
Type   Code   Description
3      4      The input matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
4      2      The input matrix is not positive definite.
4      4      The input matrix is not Hermitian. It has a diagonal entry with an imaginary part.
Example
The inverse of a 5 × 5 band Hermitian matrix with one codiagonal is computed. LFTQH is called to factor the
matrix and to check for nonpositive definiteness. LFSQH is called to determine the columns of the inverse.
USE LFTQH_INT
USE LFSQH_INT
USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=2, LDFACT=2, N=5, NCODA=1)
      COMPLEX    A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N)
!                                  Set values for A in band Hermitian form
!
!                                  A = ( 0.0+0.0i -1.0+1.0i  1.0+2.0i  0.0+4.0i  1.0+1.0i )
!                                      ( 2.0+0.0i  4.0+0.0i 10.0+0.0i  6.0+0.0i  9.0+0.0i )
!
      DATA A/(0.0,0.0), (2.0,0.0), (-1.0,1.0), (4.0, 0.0), (1.0,2.0),&
           (10.0,0.0), (0.0,4.0), (6.0,0.0), (1.0,1.0), (9.0,0.0)/
!                                  Factor the matrix A
      CALL LFTQH (A, NCODA, FACT)
!                                  Set up the columns of the identity
!                                  matrix one at a time in RJ
      RJ = (0.0E0,0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0,0.0E0)
!                                  RJ is the J-th column of the identity
!                                  matrix so the following LFSQH
!                                  reference places the J-th column of
!                                  the inverse of A in the J-th column
!                                  of AINV
         CALL LFSQH (FACT, NCODA, RJ, AINV(:,J))
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
!                                  Print the results
      CALL WRCRN ('AINV', AINV)
!
      END
LFTQH
Chapter 1: Linear Systems
409
Output
                                             AINV
                   1                   2                   3                   4                   5
1  ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2  ( 0.2166, 0.2166)  ( 0.4332, 0.0000)  (-0.0599,-0.1198)  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3  (-0.0899, 0.0300)  (-0.0599, 0.1198)  ( 0.1797, 0.0000)  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4  (-0.0207,-0.0622)  (-0.0829,-0.0415)  ( 0.0000, 0.1244)  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5  ( 0.0092, 0.0046)  ( 0.0138,-0.0046)  (-0.0138,-0.0138)  (-0.0288, 0.0288)  ( 0.1175, 0.0000)
LFSQH
Solves a complex Hermitian positive definite system of linear equations given the factorization of the coefficient matrix in band Hermitian storage mode.
Required Arguments
FACT — Complex NCODA + 1 by N array containing the R^H R factorization of the Hermitian positive definite band matrix A. (Input)
FACT is obtained as output from routine LFCQH/DLFCQH or LFTQH/DLFTQH.
NCODA — Number of upper or lower codiagonals of A. (Input)
B — Complex vector of length N containing the right-hand-side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFSQH (FACT, NCODA, B, X [, …])
Specific:
The specific interface names are S_LFSQH and D_LFSQH.
FORTRAN 77 Interface
Single:
CALL LFSQH (N, FACT, LDFACT, NCODA, B, X)
Double:
The double precision name is DLFSQH.
Description
Routine LFSQH computes the solution for a system of linear algebraic equations having a complex Hermitian
positive definite band coefficient matrix. To compute the solution, the coefficient matrix must first undergo
an R^H R factorization. This may be done by calling either IMSL routine LFCQH or LFTQH. R is an upper triangular band matrix.
The solution to Ax = b is found by solving the triangular systems R^H y = b and Rx = y.
LFSQH and LFIQH both solve a linear system given its R^H R factorization. LFIQH generally takes more time
and produces a more accurate answer than LFSQH. Each iteration of the iterative refinement algorithm used
by LFIQH calls LFSQH.
LFSQH
Chapter 1: Linear Systems
411
LFSQH is based on the LINPACK routine CPBSL; see Dongarra et al. (1979).
Comments
Informational error
Type   Code   Description
4      1      The factored matrix has a diagonal element close to zero.
Example
A set of linear systems is solved successively. LFTQH is called to factor the coefficient matrix. LFSQH is called
to compute the three solutions for the three right-hand sides. In this case the coefficient matrix is assumed to
be well-conditioned and correctly scaled. Otherwise, it would be better to call LFCQH to perform the factorization, and LFIQH to compute the solutions.
USE LFSQH_INT
USE LFTQH_INT
USE WRCRN_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=2, LDFACT=2, N=5, NCODA=1)
      COMPLEX    A(LDA,N), B(N,3), FACT(LDFACT,N), X(N,3)
!                                  Set values for A in band Hermitian form, and B
!
!                                  A = ( 0.0+0.0i -1.0+1.0i  1.0+2.0i  0.0+4.0i  1.0+1.0i )
!                                      ( 2.0+0.0i  4.0+0.0i 10.0+0.0i  6.0+0.0i  9.0+0.0i )
!
!                                  B = (   3.0+3.0i    4.0+0.0i    29.0-9.0i  )
!                                      (   5.0-5.0i   15.0-10.0i  -36.0-17.0i )
!                                      (   5.0+4.0i  -12.0-56.0i  -15.0-24.0i )
!                                      (   9.0+7.0i  -12.0+10.0i  -23.0-15.0i )
!                                      ( -22.0+1.0i    3.0-1.0i   -23.0-28.0i )
!
      DATA A/(0.0,0.0), (2.0,0.0), (-1.0,1.0), (4.0, 0.0), (1.0,2.0),&
           (10.0,0.0), (0.0,4.0), (6.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B/(3.0,3.0), (5.0,-5.0), (5.0,4.0), (9.0,7.0), (-22.0,1.0),&
           (4.0,0.0), (15.0,-10.0), (-12.0,-56.0), (-12.0,10.0),&
           (3.0,-1.0), (29.0,-9.0), (-36.0,-17.0), (-15.0,-24.0),&
           (-23.0,-15.0), (-23.0,-28.0)/
!                                  Factor the matrix A
      CALL LFTQH (A, NCODA, FACT)
!                                  Compute the solutions
      DO 10 I=1, 3
         CALL LFSQH (FACT, NCODA, B(:,I), X(:,I))
   10 CONTINUE
!                                  Print solutions
      CALL WRCRN ('X', X)
      END
Output
                                X
                 1                 2                 3
1  (  1.00,  0.00)  (  3.00, -1.00)  ( 11.00, -1.00)
2  (  1.00, -2.00)  (  2.00,  0.00)  ( -7.00,  0.00)
3  (  2.00,  0.00)  ( -1.00, -6.00)  ( -2.00, -3.00)
4  (  2.00,  3.00)  (  2.00,  1.00)  ( -2.00, -3.00)
5  ( -3.00,  0.00)  (  0.00,  0.00)  ( -2.00, -3.00)
LFIQH
Uses iterative refinement to improve the solution of a complex Hermitian positive definite system of linear
equations in band Hermitian storage mode.
Required Arguments
A — Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian coefficient
matrix in band Hermitian storage mode. (Input)
NCODA — Number of upper or lower codiagonals of A. (Input)
FACT — Complex NCODA + 1 by N array containing the R^H R factorization of the matrix A as output from
routine LFCQH/DLFCQH or LFTQH/DLFTQH. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
RES — Complex vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFIQH (A, NCODA, FACT, B, X, RES [, …])
Specific:
The specific interface names are S_LFIQH and D_LFIQH.
FORTRAN 77 Interface
Single:
CALL LFIQH (N, A, LDA, NCODA, FACT, LDFACT, B, X, RES)
Double:
The double precision name is DLFIQH.
Description
Routine LFIQH computes the solution for a system of linear algebraic equations having a complex Hermitian
positive definite band coefficient matrix. To compute the solution, the coefficient matrix must first undergo
an R^H R factorization. This may be done by calling either IMSL routine LFCQH or LFTQH. R is an upper triangular band matrix.
The solution to Ax = b is found by solving the triangular systems R^H y = b and Rx = y.
LFIQH
Chapter 1: Linear Systems
414
LFSQH and LFIQH both solve a linear system given its R^H R factorization. LFIQH generally takes more time
and produces a more accurate answer than LFSQH. Each iteration of the iterative refinement algorithm used
by LFIQH calls LFSQH.
Comments
Informational error
Type   Code   Description
4      1      The factored matrix has a diagonal element close to zero.
Example
A set of linear systems is solved successively. The right-hand side vector is perturbed after solving the system
each of the first two times by adding (1 + i)/2 to the second element.
USE IMSL_LIBRARIES
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=2, LDFACT=2, N=5, NCODA=1)
      REAL       RCOND
      COMPLEX    A(LDA,N), B(N), FACT(LDFACT,N), RES(N,3), X(N,3)
!                                  Set values for A in band Hermitian form, and B
!
!                                  A = ( 0.0+0.0i -1.0+1.0i  1.0+2.0i  0.0+4.0i  1.0+1.0i )
!                                      ( 2.0+0.0i  4.0+0.0i 10.0+0.0i  6.0+0.0i  9.0+0.0i )
!
!                                  B = ( 3.0+3.0i  5.0-5.0i  5.0+4.0i  9.0+7.0i  -22.0+1.0i )
!
      DATA A/(0.0,0.0), (2.0,0.0), (-1.0,1.0), (4.0, 0.0), (1.0,2.0),&
           (10.0,0.0), (0.0,4.0), (6.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B/(3.0,3.0), (5.0,-5.0), (5.0,4.0), (9.0,7.0), (-22.0,1.0)/
!
!                                  Factor the matrix A
      CALL LFCQH (A, NCODA, FACT, RCOND=RCOND)
!
!                                  Print the estimated condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT, 99999) RCOND, 1.0E0/RCOND
!
!                                  Compute the solutions
      DO 10 I=1, 3
         CALL LFIQH (A, NCODA, FACT, B, X(:,I), RES(:,I))
         B(2) = B(2) + (0.5E0, 0.5E0)
   10 CONTINUE
!
!                                  Print solutions
      CALL WRCRN ('X', X)
      CALL WRCRN ('RES', RES)
99999 FORMAT (' RCOND = ', F5.3, /, ' L1 Condition number = ', F6.3)
      END
Output
                                X
                 1                 2                 3
1  (  1.00,  0.00)  (  3.00, -1.00)  ( 11.00, -1.00)
2  (  1.00, -2.00)  (  2.00,  0.00)  ( -7.00,  0.00)
3  (  2.00,  0.00)  ( -1.00, -6.00)  ( -2.00, -3.00)
4  (  2.00,  3.00)  (  2.00,  1.00)  ( -2.00, -3.00)
5  ( -3.00,  0.00)  (  0.00,  0.00)  ( -2.00, -3.00)
LFDQH
Computes the determinant of a complex Hermitian positive definite matrix given the R^H R Cholesky factorization in band Hermitian storage mode.
Required Arguments
FACT — Complex NCODA + 1 by N array containing the R^H R factorization of the Hermitian positive definite band matrix A. (Input)
FACT is obtained as output from routine LFCQH/DLFCQH or LFTQH/DLFTQH.
NCODA — Number of upper or lower codiagonals of A. (Input)
DET1 — Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ ∣DET1∣ < 10.0 or DET1 = 0.0.
DET2 — Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10^DET2.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LFDQH (FACT, NCODA, DET1, DET2 [, …])
Specific:
The specific interface names are S_LFDQH and D_LFDQH.
FORTRAN 77 Interface
Single:
CALL LFDQH (N, FACT, LDFACT, NCODA, DET1, DET2)
Double:
The double precision name is DLFDQH.
Description
Routine LFDQH computes the determinant of a complex Hermitian positive definite band coefficient matrix.
To compute the determinant, the coefficient matrix must first undergo an R^H R factorization. This may be
done by calling either LFCQH or LFTQH. The formula det A = det R^H det R = (det R)^2 is used to compute the
determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,

det R = R_11 R_22 … R_NN
LFDQH is based on the LINPACK routine CPBDI; see Dongarra et al. (1979).
LFDQH
Chapter 1: Linear Systems
417
Example
The determinant is computed for a 5 × 5 complex Hermitian positive definite band matrix with one
codiagonal.
USE LFDQH_INT
USE LFTQH_INT
USE UMACH_INT
!                                  Declare variables
      INTEGER    LDA, LDFACT, N, NCODA, NOUT
      PARAMETER  (LDA=2, N=5, LDFACT=2, NCODA=1)
      REAL       DET1, DET2
      COMPLEX    A(LDA,N), FACT(LDFACT,N)
!                                  Set values for A in band Hermitian form
!
!                                  A = ( 0.0+0.0i -1.0+1.0i  1.0+2.0i  0.0+4.0i  1.0+1.0i )
!                                      ( 2.0+0.0i  4.0+0.0i 10.0+0.0i  6.0+0.0i  9.0+0.0i )
!
      DATA A/(0.0,0.0), (2.0,0.0), (-1.0,1.0), (4.0, 0.0), (1.0,2.0),&
           (10.0,0.0), (0.0,4.0), (6.0,0.0), (1.0,1.0), (9.0,0.0)/
!                                  Factor the matrix
      CALL LFTQH (A, NCODA, FACT)
!                                  Compute the determinant
      CALL LFDQH (FACT, NCODA, DET1, DET2)
!                                  Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
      END
Output
The determinant of A is  1.736 * 10**3.
LSLXG
Solves a sparse system of linear algebraic equations by Gaussian elimination.
Required Arguments
A — Vector of length NZ containing the nonzero coefficients of the linear system. (Input)
IROW — Vector of length NZ containing the row numbers of the corresponding elements in A. (Input)
JCOL — Vector of length NZ containing the column numbers of the corresponding elements in A. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (B,1).
NZ — The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system Ax = b is solved.
IPATH = 2 means the system A^T x = b is solved.
Default: IPATH = 1.
IPARAM — Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM.
Default: IPARAM(1) = 0.
See Comment 3.
RPARAM — Parameter vector of length 5. (Input/Output)
See Comment 3.
FORTRAN 90 Interface
Generic:
CALL LSLXG (A, IROW, JCOL, B, X [, …])
Specific:
The specific interface names are S_LSLXG and D_LSLXG.
FORTRAN 77 Interface
Single:
CALL LSLXG (N, NZ, A, IROW, JCOL, B, IPATH, IPARAM, RPARAM, X)
Double:
The double precision name is DLSLXG.
Description
Consider the linear equation
Ax = b
LSLXG
Chapter 1: Linear Systems
419
where A is a n × n sparse matrix. The sparse coordinate format for the matrix A requires one real and two
integer vectors. The real array a contains all the nonzeros in A. Let the number of nonzeros be nz. The two
integer arrays irow and jcol, each of length nz, contain the row and column numbers for these entries in A.
That is
A(irow(i), jcol(i)) = a(i),   i = 1, …, nz
with all other entries in A zero.
The routine LSLXG solves a system of linear algebraic equations having a real sparse coefficient matrix. It
first uses the routine LFTXG to perform an LU factorization of the coefficient matrix. The solution of the linear
system is then found using LFSXG.
The routine LFTXG by default uses a symmetric Markowitz strategy (Crowe et al. 1990) to choose pivots that
most likely would reduce fill-ins while maintaining numerical stability. Different strategies are also provided
as options for row oriented or column oriented problems. The algorithm can be expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz strategy (Duff et
al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
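For instance, iterative refinement (IPARAM(6) in Comment 3) can be requested by initializing the parameter vectors with L4LXG and overriding the one entry of interest; this sketch makes the same assumptions as the example at the end of this section.

!                                  Sketch only: declarations as in the example below
      CALL L4LXG (IPARAM, RPARAM)
      IPARAM(6) = 1
!                                  Nonzero IPARAM(6) requests iterative refinement
      CALL LSLXG (A, IROW, JCOL, B, X, IPARAM=IPARAM)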
Comments
1. Workspace may be explicitly provided, if desired, by use of L2LXG/DL2LXG. The reference is:
CALL L2LXG (N, NZ, A, IROW, JCOL, B, IPATH, IPARAM, RPARAM, X, WK, LWK, IWK, LIWK)
The additional arguments are as follows:
WK — Real work vector of length LWK.
LWK — The length of WK, LWK should be at least 2N + MAXNZ.
IWK — Integer work vector of length LIWK.
LIWK — The length of IWK, LIWK should be at least 17N + 4 * MAXNZ.
MAXNZ is the maximal number of nonzero elements at any stage of the Gaussian elimination. In the
absence of other information, setting MAXNZ equal to 3 * NZ is recommended. Higher or lower values
may be used depending on fill-in. See also IPARAM(5) in Comment 3.
2. Informational errors
   Type   Code   Description
   3      1      The coefficient matrix is numerically singular.
   3      2      The growth factor is too large to continue.
   3      3      The matrix is too ill-conditioned for iterative refinement.
LSLXG
Chapter 1: Linear Systems
420
3. If the default parameters are desired for LSLXG, then set IPARAM(1) to zero and call the routine LSLXG.
Otherwise, if any nondefault parameters are desired for IPARAM or RPARAM, then the following steps
should be taken before calling LSLXG.
CALL L4LXG (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to L4LXG will set IPARAM and RPARAM to their default values, so only nondefault values
need to be set above.
IPARAM — Integer vector of length 6.
IPARAM(1) = Initialization flag.
IPARAM(2) = The pivoting strategy.
   IPARAM(2)   Action
   1           Markowitz row search
   2           Markowitz column search
   3           Symmetric Markowitz search
   Default: 3.
IPARAM(3) = The number of rows which have least numbers of nonzero elements that will be
searched for a pivotal element.
Default: 3.
IPARAM(4) = The maximal number of nonzero elements in A at any stage of the Gaussian elimination. (Output)
IPARAM(5) = The workspace limit.
   IPARAM(5)   Action
   0           Default limit. For single precision, 19N + 5 * MAXNZ. For double precision,
               21N + 6 * MAXNZ. See Comment 1 for the definition of MAXNZ.
   integer     This integer value replaces the default workspace limit. When L2LXG is called,
               the values of LWK and LIWK are used instead of IPARAM(5).
   Default: 0.
IPARAM(6) = Iterative refinement is done when this is nonzero.
Default: 0.
RPARAM — Real vector of length 5.
RPARAM(1) = The upper limit on the growth factor. The computation stops when the growth factor
exceeds the limit.
Default: 10^16.
RPARAM(2) = The stability factor. The absolute value of the pivotal element must be bigger than the
largest element in absolute value in its row divided by RPARAM(2).
Default: 10.0.
LSLXG
Chapter 1: Linear Systems
421
RPARAM(3) = Drop-tolerance. Any element in the lower triangular factor L will be removed if its
absolute value becomes smaller than the drop-tolerance at any stage of the Gaussian elimination.
Default: 0.0.
RPARAM(4) = The growth factor. It is calculated as the largest element in absolute value in A at any
stage of the Gaussian elimination divided by the largest element in absolute value in the original A matrix. (Output)
A large value of the growth factor indicates that an appreciable error in the computed solution
is possible.
RPARAM(5) = The value of the smallest pivotal element in absolute value. (Output)
If double precision is required, then DL4LXG is called and RPARAM is declared double precision.
Example
As an example, consider the 6 × 6 linear system:

         ( 10    0    0    0    0    0 )
         (  0   10   -3   -1    0    0 )
         (  0    0   15    0    0    0 )
   A  =  ( -2    0    0   10   -1    0 )
         ( -1    0    0   -5    1   -3 )
         ( -1   -2    0    0    0    6 )

Let x^T = (1, 2, 3, 4, 5, 6) so that Ax = (10, 7, 45, 33, -34, 31)^T. The number of nonzeros in A is nz = 15. The
sparse coordinate form for A is given by:

   irow   6   2   3   2   4   4   5   5   5   5   1   6   6   2   4
   jcol   6   2   3   3   4   5   1   6   4   5   1   1   2   4   1
   a      6  10  15  -3  10  -1  -1  -3  -5   1  10  -1  -2  -1  -2
USE LSLXG_INT
USE WRRRN_INT
USE L4LXG_INT
      INTEGER    N, NZ
      PARAMETER  (N=6, NZ=15)
!
      INTEGER    IPARAM(6), IROW(NZ), JCOL(NZ)
      REAL       A(NZ), B(N), RPARAM(5), X(N)
!
      DATA A/6., 10., 15., -3., 10., -1., -1., -3., -5., 1., 10., -1.,&
           -2., -1., -2./
      DATA B/10., 7., 45., 33., -34., 31./
      DATA IROW/6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4/
      DATA JCOL/6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1/
!
!                                  Change a default parameter
      CALL L4LXG (IPARAM, RPARAM)
      IPARAM(5) = 203
!                                  Solve for X
      CALL LSLXG (A, IROW, JCOL, B, X, IPARAM=IPARAM)
!
      CALL WRRRN (' x ', X, 1, N, 1)
END
Output
                      x
     1      2      3      4      5      6
 1.000  2.000  3.000  4.000  5.000  6.000
LFTXG
Computes the LU factorization of a real general sparse matrix.
Required Arguments
A — Vector of length NZ containing the nonzero coefficients of the linear system. (Input)
IROW — Vector of length NZ containing the row numbers of the corresponding elements in A. (Input)
JCOL — Vector of length NZ containing the column numbers of the corresponding elements in A. (Input)
NL — The number of nonzero coefficients in the triangular matrix L excluding the diagonal elements.
(Output)
NFAC — On input, the dimension of vector FACT. (Input/Output)
On output, the number of nonzero coefficients in the triangular matrix L and U.
FACT — Vector of length NFAC containing the nonzero elements of L (excluding the diagonals) in the first
NL locations and the nonzero elements of U in NL + 1 to NFAC locations. (Output)
IRFAC — Vector of length NFAC containing the row numbers of the corresponding elements in FACT.
(Output)
JCFAC — Vector of length NFAC containing the column numbers of the corresponding elements in FACT.
(Output)
IPVT — Vector of length N containing the row pivoting information for the LU factorization. (Output)
JPVT — Vector of length N containing the column pivoting information for the LU factorization. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (IPVT,1).
NZ — The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IPARAM — Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM.
Default: IPARAM(1) = 0.
See Comment 3.
RPARAM — Parameter vector of length 5. (Input/Output)
See Comment 3.
FORTRAN 90 Interface
Generic:
CALL LFTXG (A, IROW, JCOL, NL, NFAC, FACT, IRFAC, JCFAC, IPVT, JPVT [, …])
Specific:
The specific interface names are S_LFTXG and D_LFTXG.
FORTRAN 77 Interface
Single:
CALL LFTXG (N, NZ, A, IROW, JCOL, IPARAM, RPARAM, NFAC, NL, FACT, IRFAC, JCFAC,
IPVT, JPVT)
Double:
The double precision name is DLFTXG.
LFTXG
Chapter 1: Linear Systems
424
Description
Consider the linear equation
Ax = b
where A is a n × n sparse matrix. The sparse coordinate format for the matrix A requires one real and two
integer vectors. The real array a contains all the nonzeros in A. Let the number of nonzeros be nz. The two
integer arrays irow and jcol, each of length nz, contain the row and column numbers for these entries in A.
That is
A(irow(i), jcol(i)) = a(i),   i = 1, …, nz
with all other entries in A zero.
The routine LFTXG performs an LU factorization of the coefficient matrix A. It by default uses a symmetric
Markowitz strategy (Crowe et al. 1990) to choose pivots that most likely would reduce fill-ins while maintaining numerical stability. Different strategies are also provided as options for row oriented or column oriented
problems. The algorithm can be expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz strategy (Duff et
al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained using LFSXG by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
Comments
1. Workspace may be explicitly provided, if desired, by use of L2TXG/DL2TXG. The reference is:
CALL L2TXG (N, NZ, A, IROW, JCOL, IPARAM, RPARAM, NFAC, NL, FACT, IRFAC, JCFAC, IPVT,
JPVT, WK, LWK, IWK, LIWK)
The additional arguments are as follows:
WK — Real work vector of length LWK.
LWK — The length of WK, LWK should be at least MAXNZ.
IWK — Integer work vector of length LIWK.
LIWK — The length of IWK, LIWK should be at least 15N + 4 * MAXNZ.
The workspace limit is determined by MAXNZ, where
MAXNZ = MIN0(LWK, INT(0.25(LIWK-15N)))
2. Informational errors
   Type   Code   Description
   3      1      The coefficient matrix is numerically singular.
   3      2      The growth factor is too large to continue.
LFTXG
Chapter 1: Linear Systems
425
3. If the default parameters are desired for LFTXG, then set IPARAM(1) to zero and call the routine LFTXG.
Otherwise, if any nondefault parameters are desired for IPARAM or RPARAM, then the following steps
should be taken before calling LFTXG.
CALL L4LXG (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to L4LXG will set IPARAM and RPARAM to their default values, so only nondefault values
need to be set above.
The arguments are as follows:
IPARAM — Integer vector of length 6.
IPARAM(1) = Initialization flag.
IPARAM(2) = The pivoting strategy.
   IPARAM(2)   Action
   1           Markowitz row search
   2           Markowitz column search
   3           Symmetric Markowitz search
   Default: 3.
IPARAM(3) = The number of rows which have least numbers of nonzero elements that will be
searched for a pivotal element.
Default: 3.
IPARAM(4) = The maximal number of nonzero elements in A at any stage of the Gaussian elimination. (Output)
IPARAM(5) = The workspace limit.
   IPARAM(5)   Action
   0           Default limit, see Comment 1.
   integer     This integer value replaces the default workspace limit. When L2TXG is called,
               the values of LWK and LIWK are used instead of IPARAM(5).
IPARAM(6) = Not used in LFTXG.
RPARAM — Real vector of length 5.
RPARAM(1) = The upper limit on the growth factor. The computation stops when the growth factor
exceeds the limit.
Default: 10.
RPARAM(2) = The stability factor. The absolute value of the pivotal element must be bigger than the
largest element in absolute value in its row divided by RPARAM(2).
Default: 10.0.
LFTXG
Chapter 1: Linear Systems
426
RPARAM(3) = Drop-tolerance. Any element in the lower triangular factor L will be removed if its
absolute value becomes smaller than the drop-tolerance at any stage of the Gaussian elimination.
Default: 0.0.
RPARAM(4) = The growth factor. It is calculated as the largest element in absolute value in A at any
stage of the Gaussian elimination divided by the largest element in absolute value in the original A matrix. (Output)
A large value of the growth factor indicates that an appreciable error in the computed solution
is possible.
RPARAM(5) = The value of the smallest pivotal element in absolute value. (Output)
If double precision is required, then DL4LXG is called and RPARAM is declared double precision.
Example
As an example, consider the 6 × 6 matrix of a linear system:
         ( 10    0    0    0    0    0 )
         (  0   10   -3   -1    0    0 )
         (  0    0   15    0    0    0 )
   A  =  ( -2    0    0   10   -1    0 )
         ( -1    0    0   -5    1   -3 )
         ( -1   -2    0    0    0    6 )

The sparse coordinate form for A is given by:

   irow   6   2   3   2   4   4   5   5   5   5   1   6   6   2   4
   jcol   6   2   3   3   4   5   1   6   4   5   1   1   2   4   1
   a      6  10  15  -3  10  -1  -1  -3  -5   1  10  -1  -2  -1  -2
USE LFTXG_INT
USE WRRRN_INT
USE WRIRN_INT
      INTEGER    N, NZ
      PARAMETER  (N=6, NZ=15)
      INTEGER    IROW(NZ), JCOL(NZ), NFAC, NL,&
                 IRFAC(3*NZ), JCFAC(3*NZ), IPVT(N), JPVT(N)
      REAL       A(NZ), FACT(3*NZ)
!
      DATA A/6., 10., 15., -3., 10., -1., -1., -3., -5., 1., 10., -1.,&
           -2., -1., -2./
      DATA IROW/6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4/
      DATA JCOL/6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1/
!
      NFAC = 3*NZ
!                                  Use default options
      CALL LFTXG (A, IROW, JCOL, NL, NFAC, FACT, IRFAC, JCFAC, IPVT, JPVT)
!
      CALL WRRRN (' fact ', FACT, 1, NFAC, 1)
      CALL WRIRN (' irfac ', IRFAC, 1, NFAC, 1)
      CALL WRIRN (' jcfac ', JCFAC, 1, NFAC, 1)
      CALL WRIRN (' p ', IPVT, 1, N, 1)
      CALL WRIRN (' q ', JPVT, 1, N, 1)
!
END
Output
                                     fact
     1      2      3      4      5      6      7      8      9     10
 -0.10  -5.00  -0.20  -0.10  -0.10  -1.00  -0.20   4.90  -5.10   1.00

    11     12     13     14     15     16
 -1.00  30.00   6.00  -2.00  10.00  15.00

                          irfac
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  3   4   4   5   5   6   6   6   5   5   4   4   3   3   2   1

                          jcfac
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  2   3   1   4   2   5   2   6   6   5   6   4   4   3   2   1

          p
  1   2   3   4   5   6
  3   1   6   2   5   4

          q
  1   2   3   4   5   6
  3   1   2   6   5   4
LFSXG
Solves a sparse system of linear equations given the LU factorization of the coefficient matrix.
Required Arguments
NFAC — The number of nonzero coefficients in FACT as output from subroutine LFTXG/DLFTXG. (Input)
NL — The number of nonzero coefficients in the triangular matrix L excluding the diagonal elements as
output from subroutine LFTXG/DLFTXG. (Input)
FACT — Vector of length NFAC containing the nonzero elements of L (excluding the diagonals) in the first
NL locations and the nonzero elements of U in NL + 1 to NFAC locations as output from subroutine
LFTXG/DLFTXG. (Input)
IRFAC — Vector of length NFAC containing the row numbers of the corresponding elements in FACT as
output from subroutine LFTXG/DLFTXG. (Input)
JCFAC — Vector of length NFAC containing the column numbers of the corresponding elements in FACT as
output from subroutine LFTXG/DLFTXG. (Input)
IPVT — Vector of length N containing the row pivoting information for the LU factorization as output from
subroutine LFTXG/DLFTXG. (Input)
JPVT — Vector of length N containing the column pivoting information for the LU factorization as output
from subroutine LFTXG/DLFTXG. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (B,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system Ax = B is solved.
IPATH = 2 means the system A^T x = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFSXG (NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, X [, …])
Specific:
The specific interface names are S_LFSXG and D_LFSXG.
FORTRAN 77 Interface
Single:
CALL LFSXG (N, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, IPATH, X)
Double:
The double precision name is DLFSXG.
Description
Consider the linear equation
Ax = b
where A is an n × n sparse matrix. The sparse coordinate format for the matrix A requires one real and two
integer vectors. The real array a contains all the nonzeros in A. Let the number of nonzeros be nz. The two
integer arrays irow and jcol, each of length nz, contain the row and column numbers for these entries in A.
That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
with all other entries in A zero. The routine LFSXG computes the solution of the linear equation given its LU
factorization. The factorization is performed by calling LFTXG. The solution of the linear system is then
found by forward and backward substitution. The algorithm can be expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz strategy (Duff et
al. 1986), and L and U are lower and upper triangular matrices, respectively. Finally, the solution x is obtained
by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
For more details, see Crowe et al. (1990).
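The transpose system A^T x = b can be solved with the same factorization by selecting IPATH = 2. The fragment below is a minimal sketch, not a complete program: it assumes that NFAC, NL, FACT, IRFAC, JCFAC, IPVT and JPVT already hold the output of a previous call to LFTXG, that B contains the right-hand side of the transpose system, and that the optional IPATH argument is passed as a keyword, as in the other Fortran 90 examples of this chapter.

!                                 Sketch: solve trans(A)*x = b by reusing an
!                                 existing LFTXG factorization (IPATH = 2).
!                                 NFAC, NL, FACT, IRFAC, JCFAC, IPVT and JPVT
!                                 are assumed to be LFTXG output; B holds the
!                                 right-hand side of the transpose system.
      CALL LFSXG (NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, X, &
                  IPATH=2)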
Example
As an example, consider the 6 × 6 linear system:
        [ 10    0    0    0    0    0 ]
        [  0   10   -3   -1    0    0 ]
    A = [  0    0   15    0    0    0 ]
        [ -2    0    0   10   -1    0 ]
        [ -1    0    0   -5    1   -3 ]
        [ -1   -2    0    0    0    6 ]

Let
x1^T = (1, 2, 3, 4, 5, 6)
so that Ax1 = (10, 7, 45, 33, −34, 31)^T, and
x2^T = (6, 5, 4, 3, 2, 1)
so that Ax2 = (60, 35, 60, 16, −22, −10)^T. The sparse coordinate form for A is given by:
    irow   6   2   3   2   4   4   5   5   5   5   1   6   6   2   4
    jcol   6   2   3   3   4   5   1   6   4   5   1   1   2   4   1
    a      6  10  15  -3  10  -1  -1  -3  -5   1  10  -1  -2  -1  -2
      USE LFSXG_INT
      USE WRRRL_INT
      USE LFTXG_INT

      INTEGER    N, NZ
      PARAMETER  (N=6, NZ=15)
      INTEGER    IPATH, IROW(NZ), JCOL(NZ), NFAC,&
                 NL, IRFAC(3*NZ), JCFAC(3*NZ), IPVT(N), JPVT(N)
      REAL       X(N), A(NZ), B(N,2), FACT(3*NZ)
      CHARACTER  TITLE(2)*2, RLABEL(1)*4, CLABEL(1)*6
      DATA RLABEL(1)/'NONE'/, CLABEL(1)/'NUMBER'/
!
      DATA A/6., 10., 15., -3., 10., -1., -1., -3., -5., 1., 10., -1.,&
             -2., -1., -2./
      DATA B/10., 7., 45., 33., -34., 31.,&
             60., 35., 60., 16., -22., -10./
      DATA IROW/6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4/
      DATA JCOL/6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1/
      DATA TITLE/'x1', 'x2'/
!
      NFAC = 3*NZ
!                                 Perform LU factorization
      CALL LFTXG (A, IROW, JCOL, NL, NFAC, FACT, IRFAC, JCFAC, IPVT, JPVT)
!
      DO 10 I = 1, 2
!                                 Solve A * X(i) = B(i)
         CALL LFSXG (NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B(:,I), X)
!
         CALL WRRRL (TITLE(I), X, RLABEL, CLABEL, 1, N, 1)
   10 CONTINUE
      END
Output
                  x1
  1     2     3     4     5     6
1.0   2.0   3.0   4.0   5.0   6.0

                  x2
  1     2     3     4     5     6
6.0   5.0   4.0   3.0   2.0   1.0
LSLZG
Solves a complex sparse system of linear equations by Gaussian elimination.
Required Arguments
A — Complex vector of length NZ containing the nonzero coefficients of the linear system. (Input)
IROW — Vector of length NZ containing the row numbers of the corresponding elements in A. (Input)
JCOL — Vector of length NZ containing the column numbers of the corresponding elements in A. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (B,1).
NZ — The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system Ax = b is solved.
IPATH = 2 means the system A^H x = b is solved.
Default: IPATH = 1.
IPARAM — Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 3.
Default: IPARAM = 0.
RPARAM — Parameter vector of length 5. (Input/Output)
See Comment 3.
FORTRAN 90 Interface
Generic:
CALL LSLZG (A, IROW, JCOL, B, X [, …])
Specific:
The specific interface names are S_LSLZG and D_LSLZG.
FORTRAN 77 Interface
Single:
CALL LSLZG (N, NZ, A, IROW, JCOL, B, IPATH, IPARAM, RPARAM, X)
Double:
The double precision name is DLSLZG.
Description
Consider the linear equation
Ax = b
where A is an n × n complex sparse matrix. The sparse coordinate format for the matrix A requires one complex and two integer vectors. The complex array a contains all the nonzeros in A. Let the number of nonzeros
be nz. The two integer arrays irow and jcol, each of length nz, contain the row and column numbers for
these entries in A. That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
with all other entries in A zero.
The subroutine LSLZG solves a system of linear algebraic equations having a complex sparse coefficient
matrix. It first uses the routine LFTZG to perform an LU factorization of the coefficient matrix. The solution of
the linear system is then found using LFSZG. The routine LFTZG by default uses a symmetric Markowitz strategy (Crowe et al. 1990) to choose pivots that are likely to reduce fill-in while maintaining numerical
stability. Different strategies are also provided as options for row-oriented or column-oriented problems. The
algorithm can be expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz strategy (Duff et
al. 1986), and L and U are lower and upper triangular matrices, respectively. Finally, the solution x is obtained
by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2LZG/DL2LZG. The reference is:
CALL L2LZG (N, NZ, A, IROW, JCOL, B, IPATH, IPARAM, RPARAM, X, WK, LWK, IWK, LIWK)
The additional arguments are as follows:
WK — Complex work vector of length LWK.
LWK — The length of WK, LWK should be at least 2N+ MAXNZ.
IWK — Integer work vector of length LIWK.
LIWK — The length of IWK, LIWK should be at least 17N + 4 * MAXNZ.
The workspace limit is determined by MAXNZ, where
MAXNZ = MIN0(LWK-2N, INT(0.25(LIWK-17N)))
2.
Informational errors

Type   Code   Description
 3      1     The coefficient matrix is numerically singular.
 3      2     The growth factor is too large to continue.
 3      3     The matrix is too ill-conditioned for iterative refinement.
3.
If the default parameters are desired for LSLZG, then set IPARAM(1) to zero and call the routine LSLZG.
Otherwise, if any nondefault parameters are desired for IPARAM or RPARAM, then the following steps
should be taken before calling LSLZG (a sketch of this sequence follows the parameter descriptions below):
CALL L4LZG (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to L4LZG will set IPARAM and RPARAM to their default values, so only nondefault values
need to be set above. The arguments are as follows:
IPARAM — Integer vector of length 6.
IPARAM(1) = Initialization flag.
IPARAM(2) = The pivoting strategy.
IPARAM(2)   Action
    1       Markowitz row search
    2       Markowitz column search
    3       Symmetric Markowitz search
Default: 3.
IPARAM(3) = The number of rows having the fewest nonzero elements that will be searched for a
pivotal element.
Default: 3.
IPARAM(4) = The maximal number of nonzero elements in A at any stage of the Gaussian elimination. (Output)
IPARAM(5) = The workspace limit.
IPARAM(5)   Action
    0       Default limit, see Comment 1.
  integer   This integer value replaces the default workspace limit.
When L2LZG is called, the values of LWK and LIWK are used instead of IPARAM(5).
Default: 0.
IPARAM(6) = Iterative refinement is done when this is nonzero.
Default: 0.
RPARAM — Real vector of length 5.
RPARAM(1) = The upper limit on the growth factor. The computation stops when the growth factor
exceeds the limit.
Default: 10.
RPARAM(2) = The stability factor. The absolute value of the pivotal element must be bigger than the
largest element in absolute value in its row divided by RPARAM(2).
Default: 10.0.
RPARAM(3) = Drop-tolerance. Any element in A will be removed if its absolute value becomes
smaller than the drop-tolerance at any stage of the Gaussian elimination.
Default: 0.0.
RPARAM(4) = The growth factor. It is calculated as the largest element in absolute value in A at any
stage of the Gaussian elimination divided by the largest element in absolute value in the original A matrix. (Output)
A large value of the growth factor indicates that an appreciable error in the computed solution
is possible.
RPARAM(5) = The value of the smallest pivotal element in absolute value. (Output)
If double precision is required, then DL4LZG is called and RPARAM is declared double precision.
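The following fragment is a minimal sketch of the sequence described in Comment 3. The particular nondefault choices (Markowitz row search and iterative refinement) are illustrative only; A, IROW, JCOL, B and X are assumed to be declared and filled as in the example below, and IPARAM and RPARAM are assumed to be passed as keywords of the generic interface.

      INTEGER    IPARAM(6)
      REAL       RPARAM(5)
!                                 Load IPARAM and RPARAM with default values
      CALL L4LZG (IPARAM, RPARAM)
!                                 Override only the nondefault choices
!                                 (illustrative values): IPARAM(2) = 1 selects
!                                 the Markowitz row search, and a nonzero
!                                 IPARAM(6) requests iterative refinement.
      IPARAM(2) = 1
      IPARAM(6) = 1
!                                 Solve with the modified parameters
      CALL LSLZG (A, IROW, JCOL, B, X, IPARAM=IPARAM, RPARAM=RPARAM)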
Example
As an example, consider the 6 × 6 linear system:
        [ 10+7i      0        0        0        0        0    ]
        [   0       3+2i     -3      -1+2i      0        0    ]
    A = [   0        0       4+2i      0        0        0    ]
        [ -2-4i      0        0       1+6i    -1+3i      0    ]
        [ -5+4i      0        0       -5      12+2i    -7+7i  ]
        [ -1+12i   -2+8i      0        0        0       3+7i  ]
Let
x^T = (1 + i, 2 + 2i, 3 + 3i, 4 + 4i, 5 + 5i, 6 + 6i)
so that
Ax = (3 + 17i, −19 + 5i, 6 + 18i, −38 + 32i, −63 + 49i, −57 + 83i)^T
The number of nonzeros in A is nz = 15. The sparse coordinate form for A is given by:
    irow     6       2       2       4       3       1       5       4
    jcol     6       2       3       5       3       1       1       4
    a      (3,7)   (3,2)  (-3,0)  (-1,3)   (4,2)  (10,7)  (-5,4)   (1,6)

    irow      6       5       5       6        4       2       5
    jcol      1       4       5       2        1       4       6
    a     (-1,12)  (-5,0)  (12,2)  (-2,8)  (-2,-4)  (-1,2)  (-7,7)

      USE LSLZG_INT
      USE WRCRN_INT

      INTEGER    N, NZ
      PARAMETER  (N=6, NZ=15)
!
      INTEGER    IROW(NZ), JCOL(NZ)
      COMPLEX    A(NZ), B(N), X(N)
!
      DATA A/(3.0,7.0), (3.0,2.0), (-3.0,0.0), (-1.0,3.0), (4.0,2.0),&
             (10.0,7.0), (-5.0,4.0), (1.0,6.0), (-1.0,12.0), (-5.0,0.0),&
             (12.0,2.0), (-2.0,8.0), (-2.0,-4.0), (-1.0,2.0), (-7.0,7.0)/
      DATA B/(3.0,17.0), (-19.0,5.0), (6.0,18.0), (-38.0,32.0),&
             (-63.0,49.0), (-57.0,83.0)/
      DATA IROW/6, 2, 2, 4, 3, 1, 5, 4, 6, 5, 5, 6, 4, 2, 5/
      DATA JCOL/6, 2, 3, 5, 3, 1, 1, 4, 1, 4, 5, 2, 1, 4, 6/
!                                 Use default options
      CALL LSLZG (A, IROW, JCOL, B, X)
!
      CALL WRCRN ('X', X)
      END
Output
                  X
1  (  1.000,  1.000)
2  (  2.000,  2.000)
3  (  3.000,  3.000)
4  (  4.000,  4.000)
5  (  5.000,  5.000)
6  (  6.000,  6.000)
LFTZG
Computes the LU factorization of a complex general sparse matrix.
Required Arguments
A — Complex vector of length NZ containing the nonzero coefficients of the linear system. (Input)
IROW — Vector of length NZ containing the row numbers of the corresponding elements in A. (Input)
JCOL — Vector of length NZ containing the column numbers of the corresponding elements in A. (Input)
NFAC — On input, the dimension of vector FACT. (Input/Output)
On output, the number of nonzero coefficients in the triangular matrices L and U.
NL — The number of nonzero coefficients in the triangular matrix L excluding the diagonal elements.
(Output)
FACT — Complex vector of length NFAC containing the nonzero elements of L (excluding the diagonals) in
the first NL locations and the nonzero elements of U in NL + 1 to NFAC locations. (Output)
IRFAC — Vector of length NFAC containing the row numbers of the corresponding elements in FACT.
(Output)
JCFAC — Vector of length NFAC containing the column numbers of the corresponding elements in FACT.
(Output)
IPVT — Vector of length N containing the row pivoting information for the LU factorization. (Output)
JPVT — Vector of length N containing the column pivoting information for the LU factorization. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (IPVT,1).
NZ — The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IPARAM — Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 3.
Default: IPARAM = 0.
RPARAM — Parameter vector of length 5. (Input/Output)
See Comment 3.
FORTRAN 90 Interface
Generic:
CALL LFTZG (A, IROW, JCOL, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT [, …])
Specific:
The specific interface names are S_LFTZG and D_LFTZG.
FORTRAN 77 Interface
Single:
CALL LFTZG (N, NZ, A, IROW, JCOL, IPARAM, RPARAM, NFAC, NL, FACT, IRFAC, JCFAC,
IPVT, JPVT)
Double:
The double precision name is DLFTZG.
Description
Consider the linear equation
Ax = b
where A is a complex n × n sparse matrix. The sparse coordinate format for the matrix A requires one complex and two integer vectors. The complex array a contains all the nonzeros in A. Let the number of nonzeros
be nz. The two integer arrays irow and jcol, each of length nz, contain the row and column indices for
these entries in A. That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
with all other entries in A zero.
The routine LFTZG performs an LU factorization of the coefficient matrix A. By default it uses a symmetric
Markowitz strategy (Crowe et al. 1990) to choose pivots that are likely to reduce fill-in while maintaining numerical stability. Different strategies are also provided as options for row-oriented or column-oriented
problems. The algorithm can be expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz strategy (Duff et
al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained using LFSZG by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2TZG/DL2TZG. The reference is:
CALL L2TZG (N, NZ, A, IROW, JCOL, IPARAM, RPARAM, NFAC, NL, FACT, IRFAC, JCFAC, IPVT,
JPVT, WK, LWK, IWK, LIWK)
The additional arguments are as follows:
WK — Complex work vector of length LWK.
LWK — The length of WK, LWK should be at least MAXNZ.
IWK — Integer work vector of length LIWK.
LIWK — The length of IWK, LIWK should be at least 15N + 4 * MAXNZ.
The workspace limit is determined by MAXNZ, where
MAXNZ = MIN0(LWK, INT(0.25(LIWK-15N)))
2.
Informational errors
Type   Code   Description
 3      1     The coefficient matrix is numerically singular.
 3      2     The growth factor is too large to continue.
3.
If the default parameters are desired for LFTZG, then set IPARAM(1) to zero and call the routine LFTZG.
Otherwise, if any nondefault parameters are desired for IPARAM or RPARAM, then the following steps
should be taken before calling LFTZG:
CALL L4LZG (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to L4LZG will set IPARAM and RPARAM to their default values so only nondefault values
need to be set above. The arguments are as follows:
IPARAM — Integer vector of length 6.
IPARAM(1) = Initialization flag.
IPARAM(2) = The pivoting strategy.
IPARAM(2)   Action
    1       Markowitz row search
    2       Markowitz column search
    3       Symmetric Markowitz search
IPARAM(3) = The number of rows having the fewest nonzero elements that will be searched for a
pivotal element.
Default: 3.
IPARAM(4) = The maximal number of nonzero elements in A at any stage of the Gaussian elimination. (Output)
IPARAM(5) = The workspace limit.
IPARAM(5)   Action
    0       Default limit, see Comment 1.
  integer   This integer value replaces the default workspace limit.

When L2TZG is called, the values of LWK and LIWK are used instead of IPARAM(5).
Default: 0.
IPARAM(6) = Not used in LFTZG.
RPARAM — Real vector of length 5.
RPARAM(1) = The upper limit on the growth factor. The computation stops when the growth factor
exceeds the limit.
Default: 10.
RPARAM(2) = The stability factor. The absolute value of the pivotal element must be bigger than the
largest element in absolute value in its row divided by RPARAM(2).
Default: 10.0.
RPARAM(3) = Drop-tolerance. Any element in the lower triangular factor L will be removed if its
absolute value becomes smaller than the drop-tolerance at any stage of the Gaussian
elimination.
Default: 0.0.
RPARAM(4) = The growth factor. It is calculated as the largest element in absolute value in A at any
stage of the Gaussian elimination divided by the largest element in absolute value in the original A matrix. (Output)
A large value of the growth factor indicates that an appreciable error in the computed solution
is possible.
RPARAM(5) = The value of the smallest pivotal element in absolute value. (Output)
If double precision is required, then DL4LZG is called and RPARAM is declared double precision.
Example
As an example, the following 6 × 6 matrix is factorized, and the outcome is printed:
        [ 10+7i      0        0        0        0        0    ]
        [   0       3+2i     -3      -1+2i      0        0    ]
    A = [   0        0       4+2i      0        0        0    ]
        [ -2-4i      0        0       1+6i    -1+3i      0    ]
        [ -5+4i      0        0       -5      12+2i    -7+7i  ]
        [ -1+12i   -2+8i      0        0        0       3+7i  ]
The sparse coordinate form for A is given by:
    irow     6       2       2       4       3       1       5       4
    jcol     6       2       3       5       3       1       1       4
    a      (3,7)   (3,2)  (-3,0)  (-1,3)   (4,2)  (10,7)  (-5,4)   (1,6)

    irow      6       5       5       6        4       2       5
    jcol      1       4       5       2        1       4       6
    a     (-1,12)  (-5,0)  (12,2)  (-2,8)  (-2,-4)  (-1,2)  (-7,7)

      USE LFTZG_INT
      USE WRCRN_INT
      USE WRIRN_INT

      INTEGER    N, NFAC, NZ
      PARAMETER  (N=6, NZ=15)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    IPVT(N), IRFAC(45), IROW(NZ), JCFAC(45),&
                 JCOL(NZ), JPVT(N), NL
      COMPLEX    A(NZ), FACT(45)
!
      DATA A/(3.0,7.0), (3.0,2.0), (-3.0,0.0), (-1.0,3.0), (4.0,2.0),&
             (10.0,7.0), (-5.0,4.0), (1.0,6.0), (-1.0,12.0), (-5.0,0.0),&
             (12.0,2.0), (-2.0,8.0), (-2.0,-4.0), (-1.0,2.0), (-7.0,7.0)/
      DATA IROW/6, 2, 2, 4, 3, 1, 5, 4, 6, 5, 5, 6, 4, 2, 5/
      DATA JCOL/6, 2, 3, 5, 3, 1, 1, 4, 1, 4, 5, 2, 1, 4, 6/
      DATA NFAC/45/
!                                 Use default options
      CALL LFTZG (A, IROW, JCOL, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT)
!
      CALL WRCRN ('fact', FACT, 1, NFAC, 1)
      CALL WRIRN (' irfac ', IRFAC, 1, NFAC, 1)
      CALL WRIRN (' jcfac ', JCFAC, 1, NFAC, 1)
      CALL WRIRN (' p ', IPVT, 1, N, 1)
      CALL WRIRN (' q ', JPVT, 1, N, 1)
!
      END
Output
            fact
  1   (  0.50,  0.85)
  2   (  0.15, -0.41)
  3   ( -0.60,  0.30)
  4   (  2.23, -1.97)
  5   ( -0.15,  0.50)
  6   ( -0.04,  0.26)
  7   ( -0.32, -0.17)
  8   ( -0.92,  7.46)
  9   ( -6.71, -6.42)
 10   ( 12.00,  2.00)
 11   ( -1.00,  2.00)
 12   ( -3.32,  0.21)
 13   (  3.00,  7.00)
 14   ( -2.00,  8.00)
 15   ( 10.00,  7.00)
 16   (  4.00,  2.00)

                       irfac
   1    2    3    4    5    6    7    8    9   10
   3    4    4    5    5    6    6    6    5    5

  11   12   13   14   15   16
   4    4    3    3    2    1

                       jcfac
   1    2    3    4    5    6    7    8    9   10
   2    3    1    4    2    5    2    6    6    5

  11   12   13   14   15   16
   6    4    4    3    2    1

            p
   1    2    3    4    5    6
   3    1    6    2    5    4

            q
   1    2    3    4    5    6
   3    1    2    6    5    4
LFSZG
Solves a complex sparse system of linear equations given the LU factorization of the coefficient matrix.
Required Arguments
NFAC — The number of nonzero coefficients in FACT as output from subroutine LFTZG/DLFTZG. (Input)
NL — The number of nonzero coefficients in the triangular matrix L excluding the diagonal elements as
output from subroutine LFTZG/DLFTZG. (Input)
FACT — Complex vector of length NFAC containing the nonzero elements of L (excluding the diagonals) in
the first NL locations and the nonzero elements of U in NL+ 1 to NFAC locations as output from subroutine LFTZG/DLFTZG. (Input)
IRFAC — Vector of length NFAC containing the row numbers of the corresponding elements in FACT as
output from subroutine LFTZG/DLFTZG. (Input)
JCFAC — Vector of length NFAC containing the column numbers of the corresponding elements in FACT as
output from subroutine LFTZG/DLFTZG. (Input)
IPVT — Vector of length N containing the row pivoting information for the LU factorization as output from
subroutine LFTZG/DLFTZG. (Input)
JPVT — Vector of length N containing the column pivoting information for the LU factorization as output
from subroutine LFTZG/DLFTZG. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (B,1).
IPATH — Path indicator. (Input)
IPATH = 1 means the system Ax = b is solved.
IPATH = 2 means the system A^H x = b is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFSZG (NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, X [, …])
Specific:
The specific interface names are S_LFSZG and D_LFSZG.
FORTRAN 77 Interface
Single:
CALL LFSZG (N, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, IPATH, X)
Double:
The double precision name is DLFSZG.
Description
Consider the linear equation
Ax = b
where A is a complex n × n sparse matrix. The sparse coordinate format for the matrix A requires one complex and two integer vectors. The complex array a contains all the nonzeros in A. Let the number of nonzeros
be nz. The two integer arrays irow and jcol, each of length nz, contain the row and column numbers for
these entries in A. That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
with all other entries in A zero.
The routine LFSZG computes the solution of the linear equation given its LU factorization. The factorization
is performed by calling LFTZG. The solution of the linear system is then found by forward and backward
substitution. The algorithm can be expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz strategy (Duff et
al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
For more details, see Crowe et al. (1990).
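The conjugate transpose system A^H x = b is selected with IPATH = 2. The fragment below is a minimal sketch using the FORTRAN 77 interface shown above; it assumes that N, NFAC, NL, FACT, IRFAC, JCFAC, IPVT and JPVT already hold the output of LFTZG and that B contains the right-hand side of the conjugate transpose system.

!                                 Sketch: solve ctrans(A)*x = b by reusing an
!                                 existing LFTZG factorization.  The
!                                 factorization arrays are assumed to be
!                                 LFTZG output; B is the right-hand side.
      IPATH = 2
      CALL LFSZG (N, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, &
                  IPATH, X)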
Example
As an example, consider the 6 × 6 linear system:
        [ 10+7i      0        0        0        0        0    ]
        [   0       3+2i     -3      -1+2i      0        0    ]
    A = [   0        0       4+2i      0        0        0    ]
        [ -2-4i      0        0       1+6i    -1+3i      0    ]
        [ -5+4i      0        0       -5      12+2i    -7+7i  ]
        [ -1+12i   -2+8i      0        0        0       3+7i  ]
Let
x1^T = (1 + i, 2 + 2i, 3 + 3i, 4 + 4i, 5 + 5i, 6 + 6i)
so that
Ax1 = (3 + 17i, −19 + 5i, 6 + 18i, −38 + 32i, −63 + 49i, −57 + 83i)^T
and
x2^T = (6 + 6i, 5 + 5i, 4 + 4i, 3 + 3i, 2 + 2i, 1 + i)
so that
Ax2 = (18 + 102i, −16 + 16i, 8 + 24i, −11 − 11i, −63 + 7i, −132 + 106i)^T
The sparse coordinate form for A is given by:
    irow     6       2       2       4       3       1       5       4
    jcol     6       2       3       5       3       1       1       4
    a      (3,7)   (3,2)  (-3,0)  (-1,3)   (4,2)  (10,7)  (-5,4)   (1,6)

    irow      6       5       5       6        4       2       5
    jcol      1       4       5       2        1       4       6
    a     (-1,12)  (-5,0)  (12,2)  (-2,8)  (-2,-4)  (-1,2)  (-7,7)

      USE LFSZG_INT
      USE WRCRN_INT
      USE LFTZG_INT

      INTEGER    N, NZ
      PARAMETER  (N=6, NZ=15)
!
      INTEGER    IPATH, IPVT(N), IRFAC(3*NZ), IROW(NZ),&
                 JCFAC(3*NZ), JCOL(NZ), JPVT(N), NFAC, NL
      COMPLEX    A(NZ), B(N,2), FACT(3*NZ), X(N)
      CHARACTER  TITLE(2)*2
!
      DATA A/(3.0,7.0), (3.0,2.0), (-3.0,0.0), (-1.0,3.0), (4.0,2.0),&
             (10.0,7.0), (-5.0,4.0), (1.0,6.0), (-1.0,12.0), (-5.0,0.0),&
             (12.0,2.0), (-2.0,8.0), (-2.0,-4.0), (-1.0,2.0), (-7.0,7.0)/
      DATA B/(3.0,17.0), (-19.0,5.0), (6.0,18.0), (-38.0,32.0),&
             (-63.0,49.0), (-57.0,83.0), (18.0,102.0), (-16.0,16.0),&
             (8.0,24.0), (-11.0,-11.0), (-63.0,7.0), (-132.0,106.0)/
      DATA IROW/6, 2, 2, 4, 3, 1, 5, 4, 6, 5, 5, 6, 4, 2, 5/
      DATA JCOL/6, 2, 3, 5, 3, 1, 1, 4, 1, 4, 5, 2, 1, 4, 6/
      DATA TITLE/'x1','x2'/
!
      NFAC = 3*NZ
!                                 Perform LU factorization
      CALL LFTZG (A, IROW, JCOL, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT)
!
      IPATH = 1
      DO 10 I = 1,2
!                                 Solve A * X(i) = B(i)
         CALL LFSZG (NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT,&
                     B(:,I), X)
         CALL WRCRN (TITLE(I), X)
   10 CONTINUE
!
      END
Output
                 x1
1  (  1.000,  1.000)
2  (  2.000,  2.000)
3  (  3.000,  3.000)
4  (  4.000,  4.000)
5  (  5.000,  5.000)
6  (  6.000,  6.000)

                 x2
1  (  6.000,  6.000)
2  (  5.000,  5.000)
3  (  4.000,  4.000)
4  (  3.000,  3.000)
5  (  2.000,  2.000)
6  (  1.000,  1.000)
LSLXD
Solves a sparse system of symmetric positive definite linear algebraic equations by Gaussian elimination.
Required Arguments
A — Vector of length NZ containing the nonzero coefficients in the lower triangle of the linear system.
(Input)
The sparse matrix has nonzeroes only in entries (IROW (i), JCOL(i)) for i = 1 to NZ, and at this location
the sparse matrix has value A(i).
IROW — Vector of length NZ containing the row numbers of the corresponding elements in the lower triangle of A. (Input)
Note IROW(i) ≥ JCOL(i), since we are only indexing the lower triangle.
JCOL — Vector of length NZ containing the column numbers of the corresponding elements in the lower
triangle of A. (Input)
B — Vector of length N containing the right-hand side of the linear system. (Input)
X — Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (B,1).
NZ — The number of nonzero coefficients in the lower triangle of the linear system. (Input)
Default: NZ = size (A,1).
ITWKSP — The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero.
Default: ITWKSP = 0.
FORTRAN 90 Interface
Generic:
CALL LSLXD (A, IROW, JCOL, B, X [, …])
Specific:
The specific interface names are S_LSLXD and D_LSLXD.
FORTRAN 77 Interface
Single:
CALL LSLXD (N, NZ, A, IROW, JCOL, B, ITWKSP, X)
Double:
The double precision name is DLSLXD.
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and symmetric. The sparse coordinate format for the matrix A requires
one real and two integer vectors. The real array a contains all the nonzeros in the lower triangle of A including
the diagonal. Let the number of nonzeros be nz. The two integer arrays irow and jcol, each of length nz,
contain the row and column indices for these entries in A. That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
irow(i) ≥ jcol(i), i = 1, …, nz
with all other entries in the lower triangle of A zero.
The routine LSLXD solves a system of linear algebraic equations having a real, sparse and positive definite
coefficient matrix. It first uses the routine LSCXD to compute a symbolic factorization of a permutation of the
coefficient matrix. It then calls LNFXD to perform the numerical factorization. The solution of the linear system is then found using LFSXD.
The routine LSCXD computes a minimum degree ordering or uses a user-supplied ordering to set up the
sparse data structure for the Cholesky factor, L. Then the routine LNFXD produces the numerical entries in L
so that we have
P A P^T = L L^T
Here P is the permutation matrix determined by the ordering.
The numerical computations can be carried out in one of two ways. The first method performs the factorization using a multifrontal technique. This option requires more storage but in certain cases will be faster. The
multifrontal method is based on the routines in Liu (1987). For detailed description of this method, see Liu
(1990), also Duff and Reid (1983, 1984), Ashcraft (1987), Ashcraft et al. (1987), and Liu (1986, 1989). The second method is fully described in George and Liu (1981). This is just the standard factorization method based
on the sparse compressed storage scheme.
Finally, the solution x is obtained by the following calculations:
1) Ly1 = Pb
2) L^T y2 = y1
3) x = P^T y2
The routine LFSXD accepts b and the permutation vector which determines P. It then returns x.
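The outline below is a minimal sketch of this three-step sequence when the routines are called directly rather than through LSLXD; it is not a complete program. The arrays are assumed to be declared and filled as in the LFSXD example later in this chapter, and the optional arguments of LSCXD are passed as keywords as in that example.

!                                 Sketch: the pipeline behind LSLXD.
!                                 Symbolic factorization and ordering
      MAXSUB = 3*NZ
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER, &
                  MAXSUB=MAXSUB, IPER=IPER, ISPACE=ISPACE)
!                                 Numerical Cholesky factorization
      CALL LNFXD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, &
                  IPER, INVPER, ISPACE, DIAGNL, RLNZ, RPARAM)
!                                 Solve A*x = b using the factorization
      CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL, &
                  IPER, B, X)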
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2LXD/DL2LXD. The reference is:
CALL L2LXD (N, NZ, A, IROW, JCOL, B, X, IPER, IPARAM, RPARAM, WK, LWK, IWK, LIWK)
The additional arguments are as follows:
IPER — Vector of length N containing the ordering.
IPARAM — Integer vector of length 4. See Comment 3.
RPARAM — Real vector of length 2. See Comment 3.
WK — Real work vector of length LWK.
LWK — The length of WK, LWK should be at least 2N + 6NZ.
IWK — Integer work vector of length LIWK.
LIWK — The length of IWK, LIWK should be at least 15N + 15NZ + 9.
Note that the parameter ITWKSP is not an argument to this routine.
2.
Informational errors

Type   Code   Description
 4      1     The coefficient matrix is not positive definite.
 4      2     A column without nonzero elements has been found in the coefficient matrix.
3.
If the default parameters are desired for L2LXD, then set IPARAM(1) to zero and call the routine L2LXD.
Otherwise, if any nondefault parameters are desired for IPARAM or RPARAM, then the following steps
should be taken before calling L2LXD (a sketch of this sequence follows the parameter descriptions below):
CALL L4LXD (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to L4LXD will set IPARAM and RPARAM to their default values, so only nondefault values
need to be set above. The arguments are as follows:
IPARAM — Integer vector of length 4.
IPARAM(1) = Initialization flag.
IPARAM(2) = The numerical factorization method.
IPARAM(2)   Action
    0       Multifrontal
    1       Sparse column
Default: 0.
IPARAM(3) = The ordering option.
IPARAM(3)   Action
    0       Minimum degree ordering
    1       User's ordering specified in IPER
Default: 0.
IPARAM(4) = The total number of nonzeros in the factorization matrix.
RPARAM — Real vector of length 2.
RPARAM(1) = The value of the largest diagonal element in the Cholesky factorization.
RPARAM(2) = The value of the smallest diagonal element in the Cholesky factorization.
If double precision is required, then DL4LXD is called and RPARAM is declared double precision.
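The fragment below is a minimal sketch of the sequence described in Comment 3 using the workspace version L2LXD. It assumes that N and NZ are PARAMETER constants and that A, IROW, JCOL and B are declared and filled as in the example below; the choice of the sparse column (non-multifrontal) factorization is illustrative only.

      INTEGER    IPARAM(4), IPER(N), IWK(15*N+15*NZ+9), LWK, LIWK
      REAL       RPARAM(2), WK(2*N+6*NZ), X(N)
!
      LWK  = 2*N + 6*NZ
      LIWK = 15*N + 15*NZ + 9
!                                 Load the default parameters, then override
!                                 only the nondefault choice: IPARAM(2) = 1
!                                 selects the sparse column factorization.
      CALL L4LXD (IPARAM, RPARAM)
      IPARAM(2) = 1
      CALL L2LXD (N, NZ, A, IROW, JCOL, B, X, IPER, IPARAM, RPARAM, &
                  WK, LWK, IWK, LIWK)
!                                 On return, RPARAM(1) and RPARAM(2) hold the
!                                 largest and smallest diagonal elements of
!                                 the Cholesky factor.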
Example
As an example, consider the 5 × 5 linear system:
        [ 10    0    1    0    2 ]
        [  0   20    0    0    3 ]
    A = [  1    0   30    4    0 ]
        [  0    0    4   40    5 ]
        [  2    3    0    5   50 ]
Let x^T = (1, 2, 3, 4, 5) so that Ax = (23, 55, 107, 197, 278)^T. The number of nonzeros in the lower triangle of A is
nz = 10. The sparse coordinate form for the lower triangle of A is given by:
    irow    1    2    3    3    4    4    5    5    5    5
    jcol    1    2    1    3    3    4    1    2    4    5
    a      10   20    1   30    4   40    2    3    5   50

or equivalently by

    irow    4    5    5    5    1    2    3    3    4    5
    jcol    4    1    2    4    1    2    1    3    3    5
    a      40    2    3    5   10   20    1   30    4   50

      USE LSLXD_INT
      USE WRRRN_INT
      INTEGER    N, NZ
      PARAMETER  (N=5, NZ=10)
!
      INTEGER    IROW(NZ), JCOL(NZ)
      REAL       A(NZ), B(N), X(N)
!
      DATA A/10., 20., 1., 30., 4., 40., 2., 3., 5., 50./
      DATA B/23., 55., 107., 197., 278./
      DATA IROW/1, 2, 3, 3, 4, 4, 5, 5, 5, 5/
      DATA JCOL/1, 2, 1, 3, 3, 4, 1, 2, 4, 5/
!                                 Solve A * X = B
      CALL LSLXD (A, IROW, JCOL, B, X)
!                                 Print results
      CALL WRRRN (' x ', X, 1, N, 1)
      END
Output
                      x
    1       2       3       4       5
1.000   2.000   3.000   4.000   5.000
LSCXD
Performs the symbolic Cholesky factorization for a sparse symmetric matrix using a minimum degree ordering or a user-specified ordering, and sets up the data structure for the numerical Cholesky factorization.
Required Arguments
IROW — Vector of length NZ containing the row subscripts of the nonzeros in the lower triangular part of
the matrix including the nonzeros on the diagonal. (Input)
JCOL — Vector of length NZ containing the column subscripts of the nonzeros in the lower triangular part
of the matrix including the nonzeros on the diagonal. (Input)
(IROW (K), JCOL(K)) gives the row and column indices of the k-th nonzero element of the matrix stored
in coordinate form. Note, IROW(K) ≥ JCOL(K).
NZSUB — Vector of length MAXSUB containing the row subscripts for the off-diagonal nonzeros in the
Cholesky factor in compressed format. (Output)
INZSUB — Vector of length N + 1 containing pointers for NZSUB. The row subscripts for the off-diagonal
nonzeros in column J are stored in NZSUB from location INZSUB(J) to
INZSUB(J) + ILNZ(J + 1) − ILNZ(J) − 1. (Output)
MAXNZ — Total number of off-diagonal nonzeros in the Cholesky factor. (Output)
ILNZ — Vector of length N + 1 containing pointers to the Cholesky factor. The off-diagonal nonzeros in column J of the factor are stored from location ILNZ (J) to ILNZ(J + 1) − 1. (Output)
(ILNZ, NZSUB, INZSUB) sets up the data structure for the off-diagonal nonzeros of the Cholesky factor
in column ordered form using compressed subscript format.
INVPER — Vector of length N containing the inverse permutation. (Output)
INVPER (K) = I indicates that the original row K is the new row I.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (INVPER,1).
NZ — Total number of the nonzeros in the lower triangular part of the symmetric matrix, including the
nonzeros on the diagonal. (Input)
Default: NZ = size (IROW,1).
IJOB — Integer parameter selecting an ordering to permute the matrix symmetrically. (Input)
IJOB = 0 selects the user ordering specified in IPER and reorders it so that the multifrontal method
can be used in the numerical factorization.
IJOB = 1 selects the user ordering specified in IPER.
IJOB = 2 selects a minimum degree ordering.
IJOB = 3 selects a minimum degree ordering suitable for the multifrontal method in the numerical factorization.
Default: IJOB = 3.
ITWKSP — The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero.
Default: ITWKSP = 0.
MAXSUB — Number of subscripts contained in array NZSUB. (Input/Output)
On input, MAXSUB gives the size of the array NZSUB. Note that when default workspace (ITWKSP = 0)
is used, set MAXSUB = 3 * NZ. Otherwise (ITWKSP > 0), set MAXSUB = (ITWKSP - 10 * N - 7) / 4. On
output, MAXSUB gives the number of subscripts used by the compressed subscript format.
Default: MAXSUB = 3 * NZ.
IPER — Vector of length N containing the ordering specified by IJOB. (Input/Output)
IPER (I) = K indicates that the original row K is the new row I.
ISPACE — The storage space needed for stack of frontal matrices. (Output)
FORTRAN 90 Interface
Generic:
Because the Fortran compiler cannot determine the precision desired from the required
arguments, there is no generic Fortran 90 Interface for this routine. The specific Fortran 90
Interfaces are:
Single:
CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER [, …])
Or
CALL S_LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER [, …])
Double:
CALL DLSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER [, …])
Or
CALL D_LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER [, …])
FORTRAN 77 Interface
Single:
CALL LSCXD (N, NZ, IROW, JCOL, IJOB, ITWKSP, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ,
IPER, INVPER, ISPACE)
Double:
The double precision name is DLSCXD.
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and symmetric. The sparse coordinate format for the matrix A requires
one real and two integer vectors. The real array a contains all the nonzeros in the lower triangle of A including
the diagonal. Let the number of nonzeros be nz. The two integer arrays irow and jcol, each of length nz,
contain the row and column indices for these entries in A. That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
irow(i) ≥ jcol(i), i = 1, …, nz
with all other entries in the lower triangle of A zero.
The routine LSCXD computes a minimum degree ordering or uses a user-supplied ordering to set up the
sparse data structure for the Cholesky factor, L. Then the routine LNFXD produces the numerical entries in L
so that we have
P A P^T = L L^T
Here, P is the permutation matrix determined by the ordering.
The numerical computations can be carried out in one of two ways. The first method performs the factorization using a multifrontal technique. This option requires more storage but in certain cases will be faster. The
multifrontal method is based on the routines in Liu (1987). For detailed description of this method, see Liu
(1990), also Duff and Reid (1983, 1984), Ashcraft (1987), Ashcraft et al. (1987), and Liu (1986, 1989). The second method is fully described in George and Liu (1981). This is just the standard factorization method based
on the sparse compressed storage scheme.
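When a specific elimination order is required, a user ordering can be supplied in IPER instead of the minimum degree ordering. The fragment below is a minimal sketch; the identity permutation used for IPER is illustrative only, and the optional arguments IJOB, MAXSUB and IPER are passed as keywords as in the example that follows.

!                                 Sketch: symbolic factorization with a
!                                 user-supplied ordering (IJOB = 1).  The
!                                 identity permutation is illustrative only.
      DO 10 I = 1, N
         IPER(I) = I
   10 CONTINUE
      MAXSUB = 3*NZ
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER, &
                  IJOB=1, MAXSUB=MAXSUB, IPER=IPER)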
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2CXD. The reference is:
CALL L2CXD (N, NZ, IROW, JCOL, IJOB, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, IPER, INVPER,
ISPACE, LIWK, IWK)
The additional arguments are as follows:
LIWK — The length of IWK, LIWK should be at least 10N + 12NZ + 7. Note that the argument
MAXSUB should be set to (LIWK - 10N - 7) / 4.
IWK — Integer work vector of length LIWK.
Note that the parameter ITWKSP is not an argument to this routine.
2.
Informational errors

Type   Code   Description
 4      1     The matrix is structurally singular.
Example
As an example, the following matrix is symbolically factorized, and the result is printed:
        [ 10    0    1    0    2 ]
        [  0   20    0    0    3 ]
    A = [  1    0   30    4    0 ]
        [  0    0    4   40    5 ]
        [  2    3    0    5   50 ]
The number of nonzeros in the lower triangle of A is nz = 10. The sparse coordinate form for the lower triangle of A is given by:
    irow    1    2    3    3    4    4    5    5    5    5
    jcol    1    2    1    3    3    4    1    2    4    5

or equivalently by

    irow    4    5    5    5    1    2    3    3    4    5
    jcol    4    1    2    4    1    2    1    3    3    5
      USE LSCXD_INT
      USE WRIRN_INT
      INTEGER    N, NZ
      PARAMETER  (N=5, NZ=10)
      INTEGER    ILNZ(N+1), INVPER(N), INZSUB(N+1), IPER(N),&
                 IROW(NZ), ISPACE, JCOL(NZ), MAXNZ, MAXSUB,&
                 NZSUB(3*NZ)
!
      DATA IROW/1, 2, 3, 3, 4, 4, 5, 5, 5, 5/
      DATA JCOL/1, 2, 1, 3, 3, 4, 1, 2, 4, 5/
!
      MAXSUB = 3 * NZ
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER,&
                  MAXSUB=MAXSUB, IPER=IPER)
!                                 Print results
      CALL WRIRN (' iper ', IPER, 1, N, 1)
      CALL WRIRN (' invper ', INVPER, 1, N, 1)
      CALL WRIRN (' nzsub ', NZSUB, 1, MAXSUB, 1)
      CALL WRIRN (' inzsub ', INZSUB, 1, N+1, 1)
      CALL WRIRN (' ilnz ', ILNZ, 1, N+1, 1)
      END
Output
               iper
   1    2    3    4    5
   2    1    5    4    3

              invper
   1    2    3    4    5
   2    1    5    4    3

            nzsub
   1    2    3    4
   3    5    4    5

               inzsub
   1    2    3    4    5    6
   1    1    3    4    4    4

                ilnz
   1    2    3    4    5    6
   1    2    4    6    7    7
LNFXD
Computes the numerical Cholesky factorization of a sparse symmetric matrix A.
Required Arguments
A — Vector of length NZ containing the nonzero coefficients of the lower triangle of the linear system.
(Input)
IROW — Vector of length NZ containing the row numbers of the corresponding elements in the lower triangle of A. (Input)
JCOL — Vector of length NZ containing the column numbers of the corresponding elements in the lower
triangle of A. (Input)
MAXSUB — Number of subscripts contained in array NZSUB as output from subroutine LSCXD/DLSCXD.
(Input)
NZSUB — Vector of length MAXSUB containing the row subscripts for the nonzeros in the Cholesky factor
in compressed format as output from subroutine LSCXD/DLSCXD. (Input)
INZSUB — Vector of length N + 1 containing pointers for NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts for the nonzeros in column J are stored from location INZSUB (J) to
INZSUB(J + 1) - 1.
MAXNZ — Length of RLNZ as output from subroutine LSCXD/DLSCXD. (Input)
ILNZ — Vector of length N + 1 containing pointers to the Cholesky factor as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts for the nonzeros in column J of the factor are stored from location ILNZ(J) to
ILNZ(J + 1) - 1. (ILNZ, NZSUB, INZSUB) sets up the compressed data structure in column ordered
form for the Cholesky factor.
IPER — Vector of length N containing the permutation as output from subroutine LSCXD/DLSCXD.
(Input)
INVPER — Vector of length N containing the inverse permutation as output from subroutine
LSCXD/DLSCXD. (Input)
ISPACE — The storage space needed for the stack of frontal matrices as output from subroutine
LSCXD/DLSCXD. (Input)
DIAGNL — Vector of length N containing the diagonal of the factor. (Output)
RLNZ — Vector of length MAXNZ containing the strictly lower triangle nonzeros of the Cholesky factor.
(Output)
RPARAM — Parameter vector containing factorization information. (Output)
RPARAM(1) = smallest diagonal element.
RPARAM(2) = largest diagonal element.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (IPER,1).
NZ — The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IJOB — Integer parameter selecting factorization method. (Input)
IJOB = 1 yields factorization in sparse column format.
IJOB = 2 yields factorization using multifrontal method.
Default: IJOB = 1.
ITWKSP — The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero.
Default: ITWKSP = 0.
FORTRAN 90 Interface
Generic:
CALL LNFXD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, IPER, INVPER,
ISPACE, DIAGNL, RLNZ, RPARAM [, …])
Specific:
The specific interface names are S_LNFXD and D_LNFXD.
FORTRAN 77 Interface
Single:
CALL LNFXD (N, NZ, A, IROW, JCOL, IJOB, MAXSUB, NZSUB, INZSUB, MAXNZ,
ILNZ, IPER, INVPER, ISPACE, ITWKSP, DIAGNL, RLNZ, RPARAM)
Double:
The double precision name is DLNFXD.
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and symmetric. The sparse coordinate format for the matrix A requires
one real and two integer vectors. The real array a contains all the nonzeros in the lower triangle of A including
the diagonal. Let the number of nonzeros be nz. The two integer arrays irow and jcol, each of length nz,
contain the row and column indices for these entries in A. That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
irow(i) ≥ jcol(i), i = 1, …, nz
with all other entries in the lower triangle of A zero. The routine LNFXD produces the Cholesky factorization
of P A P^T given the symbolic factorization of A which is computed by LSCXD. That is, this routine computes L
which satisfies
P A P^T = L L^T
The diagonal of L is stored in DIAGNL and the strictly lower triangular part of L is stored in compressed subscript form in R = RLNZ as follows. The nonzeros in the j-th column of L are stored in locations
R(i), …, R(i + k) where i = ILNZ(j) and k = ILNZ(j + 1) − ILNZ(j) − 1. The row subscripts are stored in the
vector NZSUB from locations INZSUB(j) to INZSUB(j) + k.
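As an illustration of this storage scheme, the following fragment is a sketch (it is not part of the routine's interface) that expands column J of the factor into a dense work array COL of length N. DIAGNL, RLNZ, ILNZ, NZSUB and INZSUB are assumed to be the output of LSCXD and LNFXD, and COL, I, J and K are assumed to be declared appropriately.

!                                 Sketch: expand column J of the Cholesky
!                                 factor from compressed subscript storage.
      COL(1:N) = 0.0
      COL(J)   = DIAGNL(J)
      K = INZSUB(J)
      DO I = ILNZ(J), ILNZ(J+1) - 1
         COL(NZSUB(K)) = RLNZ(I)
         K = K + 1
      END DO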
The numerical computations can be carried out in one of two ways. The first method (when IJOB = 2) performs the factorization using a multifrontal technique. This option requires more storage but in certain cases
will be faster. The multifrontal method is based on the routines in Liu (1987). For detailed description of this
method, see Liu (1990), also Duff and Reid (1983, 1984), Ashcraft (1987), Ashcraft et al. (1987), and Liu
(1986, 1989). The second method (when IJOB = 1) is fully described in George and Liu (1981). This is just the
standard factorization method based on the sparse compressed storage scheme.
Comments
1.
Workspace may be explicitly provided by use of L2FXD/DL2FXD. The reference is:
CALL L2FXD (N, NZ, A, IROW, JCOL, IJOB, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, IPER,
INVPER, ISPACE, DIAGNL, RLNZ, RPARAM, WK, LWK, IWK, LIWK)
The additional arguments are as follows:
WK — Real work vector of length LWK.
LWK — The length of WK, LWK should be at least N + 3NZ.
IWK — Integer work vector of length LIWK.
LIWK — The length of IWK, LIWK should be at least 2N.
Note that the parameter ITWKSP is not an argument to this routine.
2.
Informational errors
Type   Code   Description
 4      1     The coefficient matrix is not positive definite.
 4      2     A column without nonzero elements has been found in the coefficient matrix.
Example
As an example, consider the 5 × 5 linear system:
        [ 10    0    1    0    2 ]
        [  0   20    0    0    3 ]
    A = [  1    0   30    4    0 ]
        [  0    0    4   40    5 ]
        [  2    3    0    5   50 ]
The number of nonzeros in the lower triangle of A is nz = 10. The sparse coordinate form for the lower triangle of A is given by:
    irow    1    2    3    3    4    4    5    5    5    5
    jcol    1    2    1    3    3    4    1    2    4    5
    a      10   20    1   30    4   40    2    3    5   50

or equivalently by

    irow    4    5    5    5    1    2    3    3    4    5
    jcol    4    1    2    4    1    2    1    3    3    5
    a      40    2    3    5   10   20    1   30    4   50

We first call LSCXD to produce the symbolic information needed to pass on to LNFXD. Then call LNFXD to factor this matrix. The results are displayed below.
      USE LNFXD_INT
      USE LSCXD_INT
      USE WRRRN_INT
      INTEGER    N, NZ, NRLNZ
      PARAMETER  (N=5, NZ=10, NRLNZ=10)
!
      INTEGER    IJOB, ILNZ(N+1), INVPER(N), INZSUB(N+1), IPER(N),&
                 IROW(NZ), ISPACE, JCOL(NZ), MAXNZ, MAXSUB,&
                 NZSUB(3*NZ)
      REAL       A(NZ), DIAGNL(N), RLNZ(NRLNZ), RPARAM(2), R(N,N)
!
      DATA A/10., 20., 1., 30., 4., 40., 2., 3., 5., 50./
      DATA IROW/1, 2, 3, 3, 4, 4, 5, 5, 5, 5/
      DATA JCOL/1, 2, 1, 3, 3, 4, 1, 2, 4, 5/
!                                 Select minimum degree ordering
!                                 for multifrontal method
      IJOB = 3
!                                 Use default workspace; request the ordering
!                                 and stack size needed by LNFXD
      MAXSUB = 3*NZ
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER, &
                  MAXSUB=MAXSUB, IPER=IPER, ISPACE=ISPACE)
!                                 Check if NRLNZ is large enough
      IF (NRLNZ .GE. MAXNZ) THEN
!                                 Choose multifrontal method
         IJOB = 2
         CALL LNFXD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, &
                     ILNZ, IPER, INVPER, ISPACE, DIAGNL, RLNZ, RPARAM, &
                     IJOB=IJOB)
!                                 Print results
         CALL WRRRN (' diagnl ', DIAGNL, NRA=1, NCA=N, LDA=1)
         CALL WRRRN (' rlnz ', RLNZ, NRA=1, NCA=MAXNZ, LDA=1)
      END IF
!                                 Construct L matrix
!                                 (initialize R so structural zeros print as 0)
      R = 0.0
      DO I=1,N
!                                 Diagonal
         R(I,I) = DIAGNL(I)
         IF (ILNZ(I) .GT. MAXNZ) GO TO 50
!                                 Find elements of RLNZ for this column
         ISTRT = ILNZ(I)
         ISTOP = ILNZ(I+1) - 1
!                                 Get starting index for NZSUB
         K = INZSUB(I)
         DO J=ISTRT, ISTOP
!                                 NZSUB(K) is the row for this element of RLNZ
            R(NZSUB(K),I) = RLNZ(J)
            K = K + 1
         END DO
      END DO
   50 CONTINUE
      CALL WRRRN ('L', R, NRA=N, NCA=N)
      END
Output
                    diagnl
     1       2       3       4       5
 4.472   3.162   7.011   6.284   5.430

                          rlnz
      1        2        3        4        5        6
 0.6708   0.6325   0.3162   0.7132  -0.0285   0.6398

                        L
         1       2       3       4       5
 1   4.472   0.000   0.000   0.000   0.000
 2   0.000   3.162   0.000   0.000   0.000
 3   0.671   0.632   7.011   0.000   0.000
 4   0.000   0.000   0.713   6.284   0.000
 5   0.000   0.316  -0.029   0.640   5.430
LFSXD
Solves a real sparse symmetric positive definite system of linear equations, given the Cholesky factorization
of the coefficient matrix.
Required Arguments
N — Number of equations. (Input)
MAXSUB — Number of subscripts contained in array NZSUB as output from subroutine LSCXD/DLSCXD.
(Input)
NZSUB — Vector of length MAXSUB containing the row subscripts for the off-diagonal nonzeros in the factor as output from subroutine LSCXD/DLSCXD. (Input)
INZSUB — Vector of length N + 1 containing pointers for NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts of column J are stored from location INZSUB(J) to
INZSUB(J + 1) - 1.
MAXNZ — Total number of off-diagonal nonzeros in the Cholesky factor as output from subroutine
LSCXD/DLSCXD. (Input)
RLNZ — Vector of length MAXNZ containing the off-diagonal nonzeros in the factor in column ordered format as output from subroutine LNFXD/DLNFXD. (Input)
ILNZ — Vector of length N + 1 containing pointers to RLNZ as output from subroutine LSCXD/DLSCXD.
The nonzeros in column J of the factor are stored from location ILNZ(J) to ILNZ(J + 1) - 1. (Input)
The values (RLNZ, ILNZ, NZSUB, INZSUB) give the off-diagonal nonzeros of the factor in a compressed
subscript data format.
DIAGNL — Vector of length N containing the diagonals of the Cholesky factor as output from subroutine
LNFXD/DLNFXD. (Input)
IPER — Vector of length N containing the ordering as output from subroutine LSCXD/DLSCXD. (Input)
IPER(I) = K indicates that the original row K is the new row I.
B — Vector of length N containing the right-hand side. (Input)
X — Vector of length N containing the solution. (Output)
FORTRAN 90 Interface
Generic:
CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL, IPER, B, X)
Specific:
The specific interface names are S_LFSXD and D_LFSXD.
FORTRAN 77 Interface
Single:
CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL, IPER, B, X)
Double:
The double precision name is DLFSXD.
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and symmetric. The sparse coordinate format for the matrix A requires
one real and two integer vectors. The real array a contains all the nonzeros in the lower triangle of A including
the diagonal. Let the number of nonzeros be nz. The two integer arrays irow and jcol, each of length nz,
contain the row and column indices for these entries in A. That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
irow(i) ≥ jcol(i), i = 1, …, nz
with all other entries in the lower triangle of A zero.
The routine LFSXD computes the solution of the linear system given its Cholesky factorization. The factorization is performed by calling LSCXD followed by LNFXD. The routine LSCXD computes a minimum degree
ordering or uses a user-supplied ordering to set up the sparse data structure for the Cholesky factor, L. Then
the routine LNFXD produces the numerical entries in L so that we have
P A P^T = L L^T
Here P is the permutation matrix determined by the ordering.
The numerical computations can be carried out in one of two ways. The first method performs the factorization using a multifrontal technique. This option requires more storage but in certain cases will be faster. The
multifrontal method is based on the routines in Liu (1987). For detailed description of this method, see Liu
(1990), also Duff and Reid (1983, 1984), Ashcraft (1987), Ashcraft et al. (1987), and Liu (1986, 1989). The second method is fully described in George and Liu (1981). This is just the standard factorization method based
on the sparse compressed storage scheme.
Finally, the solution x is obtained by the following calculations:
1) Ly1 = Pb
2) L^T y2 = y1
3) x = P^T y2
Comments
Informational error
Type   Code   Description
 4      1     The input matrix is numerically singular.
Example
As an example, consider the 5 × 5 linear system:
        [ 10    0    1    0    2 ]
        [  0   20    0    0    3 ]
    A = [  1    0   30    4    0 ]
        [  0    0    4   40    5 ]
        [  2    3    0    5   50 ]
Let
x1^T = (1, 2, 3, 4, 5)
so that Ax1 = (23, 55, 107, 197, 278)^T, and
x2^T = (5, 4, 3, 2, 1)
so that Ax2 = (55, 83, 103, 97, 82)^T. The number of nonzeros in the lower triangle of A is nz = 10. The sparse
coordinate form for the lower triangle of A is given by:
    irow    1    2    3    3    4    4    5    5    5    5
    jcol    1    2    1    3    3    4    1    2    4    5
    a      10   20    1   30    4   40    2    3    5   50

or equivalently by

    irow    4    5    5    5    1    2    3    3    4    5
    jcol    4    1    2    4    1    2    1    3    3    5
    a      40    2    3    5   10   20    1   30    4   50

      USE LFSXD_INT
      USE LNFXD_INT
      USE LSCXD_INT
      USE WRRRN_INT
      INTEGER    N, NZ, NRLNZ
      PARAMETER  (N=5, NZ=10, NRLNZ=10)
!
      INTEGER    IJOB, ILNZ(N+1), INVPER(N), INZSUB(N+1), IPER(N),&
                 IROW(NZ), ISPACE, ITWKSP, JCOL(NZ), MAXNZ, MAXSUB,&
                 NZSUB(3*NZ)
      REAL       A(NZ), B1(N), B2(N), DIAGNL(N), RLNZ(NRLNZ), RPARAM(2),&
                 X(N)
!
      DATA A/10., 20., 1., 30., 4., 40., 2., 3., 5., 50./
      DATA B1/23., 55., 107., 197., 278./
      DATA B2/55., 83., 103., 97., 82./
      DATA IROW/1, 2, 3, 3, 4, 4, 5, 5, 5, 5/
      DATA JCOL/1, 2, 1, 3, 3, 4, 1, 2, 4, 5/
!                                 Select minimum degree ordering
!                                 for multifrontal method
      IJOB = 3
!                                 Use default workspace
      ITWKSP = 0
      MAXSUB = 3*NZ
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER, &
                  MAXSUB=MAXSUB, IPER=IPER, ISPACE=ISPACE)
!                                 Check if NRLNZ is large enough
      IF (NRLNZ .GE. MAXNZ) THEN
!                                 Choose multifrontal method
         IJOB = 2
         CALL LNFXD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ,&
                     IPER, INVPER, ISPACE, DIAGNL, RLNZ, RPARAM, IJOB=IJOB)
!                                 Solve A * X1 = B1
         CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL,&
                     IPER, B1, X)
!                                 Print X1
         CALL WRRRN (' x1 ', X, 1, N, 1)
!                                 Solve A * X2 = B2
         CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, &
                     DIAGNL, IPER, B2, X)
!                                 Print X2
         CALL WRRRN (' x2 ', X, 1, N, 1)
      END IF
!
      END
Output
                     x1
    1       2       3       4       5
1.000   2.000   3.000   4.000   5.000

                     x2
    1       2       3       4       5
5.000   4.000   3.000   2.000   1.000
LSLZD
Solves a complex sparse Hermitian positive definite system of linear equations by Gaussian elimination.
Required Arguments
A — Complex vector of length NZ containing the nonzero coefficients in the lower triangle of the linear
system. (Input)
The sparse matrix has nonzeroes only in entries (IROW(i), JCOL(i)) for i = 1 to NZ, and at this location
the sparse matrix has value A(i).
IROW — Vector of length NZ containing the row numbers of the corresponding elements in the lower triangle of A. (Input)
Note IROW(i) ≥ JCOL(i), since we are only indexing the lower triangle.
JCOL — Vector of length NZ containing the column numbers of the corresponding elements in the lower
triangle of A. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N — Number of equations. (Input)
Default: N = size (B,1).
NZ — The number of nonzero coefficients in the lower triangle of the linear system. (Input)
Default: NZ = size (A,1).
ITWKSP — The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero.
Default: ITWKSP = 0.
FORTRAN 90 Interface
Generic:
CALL LSLZD (A, IROW, JCOL, B, X [, …])
Specific:
The specific interface names are S_LSLZD and D_LSLZD.
FORTRAN 77 Interface
Single:
CALL LSLZD (N, NZ, A, IROW, JCOL, B, ITWKSP, X)
Double:
The double precision name is DLSLZD.
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and Hermitian. The sparse coordinate format for the matrix A requires
one complex and two integer vectors. The complex array a contains all the nonzeros in the lower triangle of
A including the diagonal. Let the number of nonzeros be nz. The two integer arrays irow and jcol, each of
length nz, contain the row and column indices for these entries in A. That is
A(irow(i), jcol(i)) = a(i), i = 1, …, nz
irow(i) ≥ jcol(i), i = 1, …, nz
with all other entries in the lower triangle of A zero.
The routine LSLZD solves a system of linear algebraic equations having a complex, sparse, Hermitian and
positive definite coefficient matrix. It first uses the routine LSCXD to compute a symbolic factorization of a
permutation of the coefficient matrix. It then calls LNFZD to perform the numerical factorization. The solution of the linear system is then found using LFSZD.
The routine LSCXD computes a minimum degree ordering or uses a user-supplied ordering to set up the
sparse data structure for the Cholesky factor, L. Then the routine LNFZD produces the numerical entries in L
so that we have
P A P^T = L L^H
Here P is the permutation matrix determined by the ordering.
The numerical computations can be carried out in one of two ways. The first method performs the factorization using a multifrontal technique. This option requires more storage but in certain cases will be faster. The
multifrontal method is based on the routines in Liu (1987). For detailed description of this method, see Liu
(1990), also Duff and Reid (1983, 1984), Ashcraft (1987), Ashcraft et al. (1987), and Liu (1986, 1989). The second method is fully described in George and Liu (1981). This is just the standard factorization method based
on the sparse compressed storage scheme.
Finally, the solution x is obtained by the following calculations:
1)
Ly1 = Pb
2) L^H y2 = y1
3) x = P^T y2
The routine LFSZD accepts b and the permutation vector which determines P. It then returns x.
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2LZD/DL2LZD. The reference is:
CALL L2LZD (N, NZ, A, IROW, JCOL, B, X, IPER, IPARAM, RPARAM, WK, LWK, IWK, LIWK)
The additional arguments are as follows:
IPER — Vector of length N containing the ordering.
IPARAM — Integer vector of length 4. See Comment 3.
RPARAM — Real vector of length 2. See Comment 3.
WK — Complex work vector of length LWK.
LWK — The length of WK, LWK should be at least 2N + 6NZ.
IWK — Integer work vector of length LIWK.
LIWK — The length of IWK, LIWK should be at least 15N + 15NZ + 9.
Note that the parameter ITWKSP is not an argument for this routine.
2.
Informational errors

Type   Code   Description
 4      1     The coefficient matrix is not positive definite.
 4      2     A column without nonzero elements has been found in the coefficient matrix.
3.
If the default parameters are desired for L2LZD, then set IPARAM(1) to zero and call the routine L2LZD.
Otherwise, if any nondefault parameters are desired for IPARAM or RPARAM, then the following steps
should be taken before calling L2LZD.
CALL L4LZD (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to L4LZD will set IPARAM and RPARAM to their default values, so only nondefault values
need to be set above. The arguments are as follows:
IPARAM — Integer vector of length 4.
IPARAM(1) = Initialization flag.
IPARAM(2) = The numerical factorization method.
IPARAM(2)   Action
    0       Multifrontal
    1       Sparse column
Default: 0.
IPARAM(3) = The ordering option.
IPARAM(3)   Action
    0       Minimum degree ordering
    1       User's ordering specified in IPER
Default: 0.
IPARAM(4) = The total number of nonzeros in the factorization matrix.
RPARAM — Real vector of length 2.
RPARAM(1) = The absolute value of the largest diagonal element in the Cholesky factorization.
RPARAM(2) = The absolute value of the smallest diagonal element in the Cholesky factorization.
If double precision is required, then DL4LZD is called and RPARAM is declared double precision.
Example
As an example, consider the 3 × 3 linear system:
    A = [   2      -1+i      0   ]
        [ -1-i       4     1+2i  ]
        [   0      1-2i      10  ]
Let x^T = (1 + i, 2 + 2i, 3 + 3i) so that Ax = (−2 + 2i, 5 + 15i, 36 + 28i)^T. The number of nonzeros in the lower triangle of A is nz = 5. The sparse coordinate form for the lower triangle of A is given by:
    irow     1       2        3        2        3
    jcol     1       2        3        1        2
    a      (2,0)   (4,0)   (10,0)  (-1,-1)   (1,-2)

or equivalently by

    irow     1        2       2        3        3
    jcol     1        1       2        2        3
    a      (2,0)  (-1,-1)   (4,0)   (1,-2)   (10,0)
      USE LSLZD_INT
      USE WRCRN_INT
      INTEGER    N, NZ
      PARAMETER  (N=3, NZ=5)
!
      INTEGER    IROW(NZ), JCOL(NZ)
      COMPLEX    A(NZ), B(N), X(N)
!
      DATA A/(2.0,0.0), (4.0,0.0), (10.0,0.0), (-1.0,-1.0), (1.0,-2.0)/
      DATA B/(-2.0,2.0), (5.0,15.0), (36.0,28.0)/
      DATA IROW/1, 2, 3, 2, 3/
      DATA JCOL/1, 2, 3, 1, 2/
!                                 Solve A * X = B
      CALL LSLZD (A, IROW, JCOL, B, X)
!                                 Print results
      CALL WRCRN (' x ', X, 1, N, 1)
      END
Output
                  x
1  (  1.000,  1.000)
2  (  2.000,  2.000)
3  (  3.000,  3.000)
LNFZD
Computes the numerical Cholesky factorization of a sparse Hermitian matrix A.
Required Arguments
A — Complex vector of length NZ containing the nonzero coefficients of the lower triangle of the linear
system. (Input)
IROW — Vector of length NZ containing the row numbers of the corresponding elements in the lower triangle of A. (Input)
JCOL — Vector of length NZ containing the column numbers of the corresponding elements in the lower
triangle of A. (Input)
MAXSUB — Number of subscripts contained in array NZSUB as output from subroutine LSCXD/DLSCXD.
(Input)
NZSUB — Vector of length MAXSUB containing the row subscripts for the nonzeros in the Cholesky factor
in compressed format as output from subroutine LSCXD/DLSCXD. (Input)
INZSUB — Vector of length N + 1 containing pointers for NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts for the nonzeros in column J are stored from location INZSUB(J) to
INZSUB(J + 1) - 1.
MAXNZ — Length of RLNZ as output from subroutine LSCXD/DLSCXD. (Input)
ILNZ — Vector of length N + 1 containing pointers to the Cholesky factor as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts for the nonzeros in column J of the factor are stored from location ILNZ(J) to
ILNZ(J + 1) − 1. (ILNZ , NZSUB, INZSUB) sets up the compressed data structure in column ordered
form for the Cholesky factor.
IPER — Vector of length N containing the permutation as output from subroutine LSCXD/DLSCXD.
(Input)
INVPER — Vector of length N containing the inverse permutation as output from subroutine
LSCXD/DLSCXD. (Input)
ISPACE — The storage space needed for the stack of frontal matrices as output from subroutine
LSCXD/DLSCXD. (Input)
DIAGNL — Complex vector of length N containing the diagonal of the factor. (Output)
RLNZ — Complex vector of length MAXNZ containing the strictly lower triangle nonzeros of the Cholesky
factor. (Output)
RPARAM — Parameter vector containing factorization information. (Output)
RPARAM (1) = smallest diagonal element in absolute value.
RPARAM (2) = largest diagonal element in absolute value.
Optional Arguments
N — Number of equations. (Input)
Default: N = size (IPER,1).
NZ — The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IJOB — Integer parameter selecting factorization method. (Input)
IJOB = 1 yields factorization in sparse column format.
IJOB = 2 yields factorization using multifrontal method.
Default: IJOB = 1.
ITWKSP — The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero. See Comment 1 for the default.
Default: ITWKSP = 0.
FORTRAN 90 Interface
Generic:
CALL LNFZD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, IPER, INVPER,
ISPACE, DIAGNL, RLNZ, RPARAM [, …])
Specific:
The specific interface names are S_LNFZD and D_LNFZD.
FORTRAN 77 Interface
Single:
CALL LNFZD (N, NZ, A, IROW, JCOL, IJOB, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, IPER,
INVPER, ISPACE, ITWKSP, DIAGNL, RLNZ, RPARAM)
Double:
The double precision name is DLNFZD.
Description
Consider the linear equation
    Ax = b
where A is sparse, positive definite and Hermitian. The sparse coordinate format for the matrix A requires
one complex and two integer vectors. The complex array a contains all the nonzeros in the lower triangle of A
including the diagonal. Let the number of nonzeros be nz. The two integer arrays irow and jcol, each of
length nz, contain the row and column indices for these entries in A. That is
    A(irow(i), jcol(i)) = a(i),   i = 1, …, nz
    irow(i) ≥ jcol(i),            i = 1, …, nz
with all other entries in the lower triangle of A zero.
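For example, the lower triangle of the 3 × 3 Hermitian matrix used in the example below is stored in this coordinate form as follows (a sketch using the same data as the example):

      INTEGER    NZ
      PARAMETER  (NZ=5)
      INTEGER    IROW(NZ), JCOL(NZ)
      COMPLEX    A(NZ)
!                                  Diagonal 2, 4, 10 and the lower-triangle
!                                  entries A(2,1) = -1-i and A(3,2) = 1-2i
      DATA A/(2.0,0.0), (4.0,0.0), (10.0,0.0), (-1.0,-1.0), (1.0,-2.0)/
      DATA IROW/1, 2, 3, 2, 3/
      DATA JCOL/1, 2, 3, 1, 2/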
The routine LNFZD produces the Cholesky factorization of P APT given the symbolic factorization of A which
is computed by LSCXD. That is, this routine computes L which satisfies
    P A Pᵀ = L Lᴴ
The diagonal of L is stored in DIAGNL and the strictly lower triangular part of L is stored in compressed subscript form in R = RLNZ as follows. The nonzeros in the jth column of L are stored in locations R(i), …, R(i + k)
where i = ILNZ(j) and k = ILNZ(j + 1)− ILNZ(j) − 1. The row subscripts are stored in the vector NZSUB from
locations INZSUB(j) to INZSUB(j) + k.
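As an illustrative sketch (not part of the documented interface), the following loop expands column J of L from this compressed form into a dense work column COL of length N; COL and the loop variables are introduced here only for illustration:

!                                  Expand column J of the Cholesky factor
!                                  into the dense work array COL(1:N)
      COL = (0.0, 0.0)
      COL(J) = DIAGNL(J)
      K = ILNZ(J+1) - ILNZ(J) - 1
      DO 10 I=0, K
         COL(NZSUB(INZSUB(J)+I)) = RLNZ(ILNZ(J)+I)
   10 CONTINUE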
The numerical computations can be carried out in one of two ways. The first method
(when IJOB = 2) performs the factorization using a multifrontal technique. This option requires more storage
but in certain cases will be faster. The multifrontal method is based on the routines in Liu (1987). For a detailed description of this method, see Liu (1990); see also Duff and Reid (1983, 1984), Ashcraft (1987), Ashcraft et al. (1987), and Liu (1986, 1989). The second method (when IJOB = 1) is fully described in George and Liu (1981).
This is just the standard factorization method based on the sparse compressed storage scheme.
Comments
1.
Workspace may be explicitly provided by use of L2FZD/DL2FZD. The reference is:
CALL L2FZD (N, NZ, A, IROW, JCOL, IJOB, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, IPER,
INVPER, ISPACE, DIAGNL, RLNZ, RPARAM, WK, LWK, IWK, LIWK)
The additional arguments are as follows:
WK — Complex work vector of length LWK.
LWK — The length of WK, LWK should be at least N + 3NZ.
IWK — Integer work vector of length LIWK.
LIWK — The length of IWK, LIWK should be at least 2N.
Note that the parameter ITWKSP is not an argument to this routine.
2.	Informational errors

	Type   Code   Description
	4      1      The coefficient matrix is not positive definite.
	4      2      A column without nonzero elements has been found in the coefficient matrix.
Example
As an example, consider the 3 × 3 linear system:
        [   2      -1+i      0   ]
    A = [ -1-i       4      1+2i ]
        [   0       1-2i    10   ]
The number of nonzeros in the lower triangle of A is nz = 5. The sparse coordinate form for the lower triangle of A is given by:
    irow    1      2      3      2      3
    jcol    1      2      3      1      2
    a       2      4     10    -1-i   1-2i

or equivalently by

    irow    2      3      1      2      3
    jcol    1      2      1      2      3
    a     -1-i   1-2i     2      4     10
We first call LSCXD to produce the symbolic information needed to pass on to LNFZD. Then call LNFZD to factor this matrix. The results are displayed below.
      USE LNFZD_INT
      USE LSCXD_INT
      USE WRCRN_INT
      INTEGER    N, NZ, NRLNZ
      PARAMETER  (N=3, NZ=5, NRLNZ=5)
!
      INTEGER    IJOB, ILNZ(N+1), INVPER(N), INZSUB(N+1), IPER(N),&
                 IROW(NZ), ISPACE, JCOL(NZ), MAXNZ, MAXSUB,&
                 NZSUB(3*NZ)
      REAL       RPARAM(2)
      COMPLEX    A(NZ), DIAGNL(N), RLNZ(NRLNZ)
!
      DATA A/(2.0,0.0), (4.0,0.0), (10.0,0.0), (-1.0,-1.0), (1.0,-2.0)/
      DATA IROW/1, 2, 3, 2, 3/
      DATA JCOL/1, 2, 3, 1, 2/
!                                  Select minimum degree ordering
!                                  for multifrontal method
      IJOB = 3
      MAXSUB = 3*NZ
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER, &
                  IJOB=IJOB, MAXSUB=MAXSUB)
!                                  Check if NRLNZ is large enough
      IF (NRLNZ .GE. MAXNZ) THEN
!                                  Choose multifrontal method
         IJOB = 2
         CALL LNFZD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, &
                     ILNZ, IPER, INVPER, ISPACE, DIAGNL, RLNZ, RPARAM, &
                     IJOB=IJOB)
!                                  Print results
         CALL WRCRN (' diagnl ', DIAGNL, 1, N, 1)
         CALL WRCRN (' rlnz ', RLNZ, 1, MAXNZ, 1)
      END IF
!
      END
Output

              diagnl
        1                2                3
( 1.414, 0.000)  ( 1.732, 0.000)  ( 2.887, 0.000)

              rlnz
        1                2
(-0.707,-0.707)  ( 0.577,-1.155)
LFSZD
Solves a complex sparse Hermitian positive definite system of linear equations, given the Cholesky factorization of the coefficient matrix.
Required Arguments
N — Number of equations. (Input)
MAXSUB — Number of subscripts contained in array NZSUB as output from subroutine LSCXD/DLSCXD.
(Input)
NZSUB — Vector of length MAXSUB containing the row subscripts for the off-diagonal nonzeros in the factor as output from subroutine LSCXD/DLSCXD. (Input)
INZSUB — Vector of length N + 1 containing pointers for NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts of column J are stored from location INZSUB(J) to INZSUB(J + 1) − 1.
MAXNZ — Total number of off-diagonal nonzeros in the Cholesky factor as output from subroutine
LSCXD/DLSCXD. (Input)
RLNZ — Complex vector of length MAXNZ containing the off-diagonal nonzeros in the factor in column
ordered format as output from subroutine LNFZD/DLNFZD. (Input)
ILNZ — Vector of length N +1 containing pointers to RLNZ as output from subroutine LSCXD/DLSCXD.
The nonzeros in column J of the factor are stored from location ILNZ(J) to ILNZ(J + 1) − 1. (Input)
The values (RLNZ, ILNZ, NZSUB, INZSUB) give the off-diagonal nonzeros of the factor in a compressed
subscript data format.
DIAGNL — Complex vector of length N containing the diagonals of the Cholesky factor as output from
subroutine LNFZD/DLNFZD. (Input)
IPER — Vector of length N containing the ordering as output from subroutine LSCXD/DLSCXD. (Input)
IPER(I) = K indicates that the original row K is the new row I.
B — Complex vector of length N containing the right-hand side. (Input)
X — Complex vector of length N containing the solution. (Output)
FORTRAN 90 Interface
Generic:
CALL LFSZD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL, IPER, B, X)
Specific:
The specific interface names are S_LFSZD and D_LFSZD.
FORTRAN 77 Interface
Single:
CALL LFSZD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL, IPER, B, X)
Double:
The double precision name is DLFSZD.
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and Hermitian. The sparse coordinate format for the matrix A requires
one complex and two integer vectors. The complex array a contains all the nonzeros in the lower triangle of A
including the diagonal. Let the number of nonzeros be nz. The two integer arrays irow and jcol, each of
length nz, contain the row and column indices for these entries in A. That is
    A(irow(i), jcol(i)) = a(i),   i = 1, …, nz
    irow(i) ≥ jcol(i),            i = 1, …, nz
with all other entries in the lower triangle of A zero.
The routine LFSZD computes the solution of the linear system given its Cholesky factorization. The factorization is performed by calling LSCXD followed by LNFZD. The routine LSCXD computes a minimum degree
ordering or uses a user-supplied ordering to set up the sparse data structure for the Cholesky factor, L. Then
the routine LNFZD produces the numerical entries in L so that we have
    P A Pᵀ = L Lᴴ
Here P is the permutation matrix determined by the ordering.
The numerical computations can be carried out in one of two ways. The first method performs the factorization using a multifrontal technique. This option requires more storage but in certain cases will be faster. The
multifrontal method is based on the routines in Liu (1987). For a detailed description of this method, see Liu (1990); see also Duff and Reid (1983, 1984), Ashcraft (1987), Ashcraft et al. (1987), and Liu (1986, 1989). The second method is fully described in George and Liu (1981). This is just the standard factorization method based
on the sparse compressed storage scheme. Finally, the solution x is obtained by the following calculations:
    1)  L y1 = Pb
    2)  Lᴴ y2 = y1
    3)  x = Pᵀ y2
Comments
Informational error

	Type   Code   Description
	4      1      The input matrix is numerically singular.
Example
As an example, consider the 3 × 3 linear system:
        [   2      -1+i      0   ]
    A = [ -1-i       4      1+2i ]
        [   0       1-2i    10   ]
Let

    x1ᵀ = (1 + i, 2 + 2i, 3 + 3i)

so that Ax1 = (−2 + 2i, 5 + 15i, 36 + 28i)ᵀ, and

    x2ᵀ = (3 + 3i, 2 + 2i, 1 + i)

so that Ax2 = (2 + 6i, 7 + 5i, 16 + 8i)ᵀ. The number of nonzeros in the lower triangle of A is nz = 5. The sparse coordinate form for the lower triangle of A is given by:
    irow    1      2      3      2      3
    jcol    1      2      3      1      2
    a       2      4     10    -1-i   1-2i

or equivalently by

    irow    2      3      1      2      3
    jcol    1      2      1      2      3
    a     -1-i   1-2i     2      4     10
      USE IMSL_LIBRARIES
      INTEGER    N, NZ, NRLNZ
      PARAMETER  (N=3, NZ=5, NRLNZ=5)
!
      INTEGER    IJOB, ILNZ(N+1), INVPER(N), INZSUB(N+1), IPER(N),&
                 IROW(NZ), ISPACE, JCOL(NZ), MAXNZ, MAXSUB,&
                 NZSUB(3*NZ)
      COMPLEX    A(NZ), B1(N), B2(N), DIAGNL(N), RLNZ(NRLNZ), X(N)
      REAL       RPARAM(2)
!
      DATA A/(2.0,0.0), (4.0,0.0), (10.0,0.0), (-1.0,-1.0), (1.0,-2.0)/
      DATA B1/(-2.0,2.0), (5.0,15.0), (36.0,28.0)/
      DATA B2/(2.0,6.0), (7.0,5.0), (16.0,8.0)/
      DATA IROW/1, 2, 3, 2, 3/
      DATA JCOL/1, 2, 3, 1, 2/
!                                  Select minimum degree ordering
!                                  for multifrontal method
      IJOB = 3
!                                  Use default workspace
      MAXSUB = 3*NZ
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER, &
                  IJOB=IJOB, MAXSUB=MAXSUB, IPER=IPER, ISPACE=ISPACE)
!                                  Check if NRLNZ is large enough
      IF (NRLNZ .GE. MAXNZ) THEN
!                                  Choose multifrontal method
         IJOB = 2
         CALL LNFZD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB,&
                     MAXNZ, ILNZ, IPER, INVPER, ISPACE, DIAGNL,&
                     RLNZ, RPARAM, IJOB=IJOB)
!                                  Solve A * X1 = B1
         CALL LFSZD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL,&
                     IPER, B1, X)
!                                  Print X1
         CALL WRCRN (' x1 ', X, 1, N, 1)
!                                  Solve A * X2 = B2
         CALL LFSZD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL,&
                     IPER, B2, X)
!                                  Print X2
         CALL WRCRN (' x2 ', X, 1, N, 1)
      END IF
!
      END
Output

                 x1
        1                2                3
( 1.000, 1.000)  ( 2.000, 2.000)  ( 3.000, 3.000)

                 x2
        1                2                3
( 3.000, 3.000)  ( 2.000, 2.000)  ( 1.000, 1.000)
LSLTO
Solves a real Toeplitz linear system.
Required Arguments
A — Real vector of length 2N − 1 containing the first row of the coefficient matrix followed by its first column beginning with the second element. (Input)
See Comment 2.
B — Real vector of length N containing the right-hand side of the linear system. (Input)
X — Real vector of length N containing the solution of the linear system. (Output)
If B is not needed then B and X may share the same storage locations.
Optional Arguments
N — Order of the matrix represented by A. (Input)
Default: N = (size (A,1) + 1)/2
IPATH — Integer flag. (Input)
IPATH = 1 means the system Ax = B is solved.
IPATH = 2 means the system Aᵀx = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLTO (A, B, X [, …])
Specific:
The specific interface names are S_LSLTO and D_LSLTO.
FORTRAN 77 Interface
Single:
CALL LSLTO (N, A, B, IPATH, X)
Double:
The double precision name is DLSLTO.
Description
Toeplitz matrices have entries that are constant along each diagonal, for example,
        [ p0    p1    p2    p3 ]
    A = [ p-1   p0    p1    p2 ]
        [ p-2   p-1   p0    p1 ]
        [ p-3   p-2   p-1   p0 ]
The routine LSLTO is based on the routine TSLS in the TOEPLITZ package, see Arushanian et al. (1983). It is
based on an algorithm of Trench (1964). This algorithm is also described by Golub and van Loan (1983),
pages 125−133.
Comments
1.	Workspace may be explicitly provided, if desired, by use of L2LTO/DL2LTO. The reference is:
	CALL L2LTO (N, A, B, IPATH, X, WK)
	The additional argument is:
	WK — Work vector of length 2N − 2.
2.	Because of the special structure of Toeplitz matrices, the first row and the first column of a Toeplitz matrix completely characterize the matrix. Hence, only the elements A(1, 1), …, A(1, N), A(2, 1), …, A(N, 1) need to be stored.
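For illustration, a short sketch of packing a full N × N Toeplitz matrix T (T is hypothetical and not an argument of LSLTO) into the length 2N − 1 vector A expected by this routine:

!                                  First row of T, then its first column
!                                  beginning with the second element
      DO 10 J=1, N
         A(J) = T(1,J)
   10 CONTINUE
      DO 20 I=2, N
         A(N+I-1) = T(I,1)
   20 CONTINUE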
Example
A system of four linear equations is solved. Note that only the first row and column of the matrix A are
entered.
      USE LSLTO_INT
      USE WRRRN_INT
!                                  Declare variables
      INTEGER    N
      PARAMETER  (N=4)
      REAL       A(2*N-1), B(N), X(N)
!                                  Set values for A, and B
!
!                                  A = (  2  -3  -1   6 )
!                                      (  1   2  -3  -1 )
!                                      (  4   1   2  -3 )
!                                      (  3   4   1   2 )
!
!                                  B = ( 16  -29  -7   5 )
!
      DATA A/2.0, -3.0, -1.0, 6.0, 1.0, 4.0, 3.0/
      DATA B/16.0, -29.0, -7.0, 5.0/
!                                  Solve AX = B
      CALL LSLTO (A, B, X)
!                                  Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output

           X
    1        2        3        4
-2.000   -1.000    7.000    4.000
LSLTC
Solves a complex Toeplitz linear system.
Required Arguments
A — Complex vector of length 2N − 1 containing the first row of the coefficient matrix followed by its first
column beginning with the second element. (Input)
See Comment 2.
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution of the linear system. (Output)
Optional Arguments
N — Order of the matrix represented by A. (Input)
Default: N = size (A,1).
IPATH — Integer flag. (Input)
IPATH = 1 means the system Ax = B is solved.
IPATH = 2 means the system Aᵀx = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLTC (A, B, X [, …])
Specific:
The specific interface names are S_LSLTC and D_LSLTC.
FORTRAN 77 Interface
Single:
CALL LSLTC (N, A, B, IPATH, X)
Double:
The double precision name is DLSLTC.
Description
Toeplitz matrices have entries which are constant along each diagonal, for example,
        [ p0    p1    p2    p3 ]
    A = [ p-1   p0    p1    p2 ]
        [ p-2   p-1   p0    p1 ]
        [ p-3   p-2   p-1   p0 ]
The routine LSLTC is based on the routine TSLC in the TOEPLITZ package, see Arushanian et al. (1983). It is
based on an algorithm of Trench (1964). This algorithm is also described by Golub and van Loan (1983),
pages 125−133.
Comments
1.	Workspace may be explicitly provided, if desired, by use of L2LTC/DL2LTC. The reference is:
	CALL L2LTC (N, A, B, IPATH, X, WK)
	The additional argument is:
	WK — Complex work vector of length 2N − 2.
2.	Because of the special structure of Toeplitz matrices, the first row and the first column of a Toeplitz matrix completely characterize the matrix. Hence, only the elements A(1, 1), …, A(1, N), A(2, 1), …, A(N, 1) need to be stored.
Example
A system of four complex linear equations is solved. Note that only the first row and column of the matrix A
are entered.
      USE LSLTC_INT
      USE WRCRN_INT
!                                  Declare variables
      PARAMETER  (N=4)
      COMPLEX    A(2*N-1), B(N), X(N)
!                                  Set values for A and B
!
!                                  A = ( 2+2i    -3    1+4i   6-2i )
!                                      (   i   2+2i     -3    1+4i )
!                                      ( 4+2i     i    2+2i    -3  )
!                                      ( 3-4i   4+2i     i    2+2i )
!
!                                  B = ( 6+65i  -29-16i  7+i  -10+i )
!
      DATA A/(2.0,2.0), (-3.0,0.0), (1.0,4.0), (6.0,-2.0), (0.0,1.0),&
             (4.0,2.0), (3.0,-4.0)/
      DATA B/(6.0,65.0), (-29.0,-16.0), (7.0,1.0), (-10.0,1.0)/
!                                  Solve AX = B
      CALL LSLTC (A, B, X)
!                                  Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END
Output

                            X
        1                 2                 3                 4
(-2.000, 0.000)   (-1.000,-5.000)   ( 7.000, 2.000)   ( 0.000, 4.000)
LSLCC
Solves a complex circulant linear system.
Required Arguments
A — Complex vector of length N containing the first row of the coefficient matrix. (Input)
B — Complex vector of length N containing the right-hand side of the linear system. (Input)
X — Complex vector of length N containing the solution of the linear system. (Output)
Optional Arguments
N — Order of the matrix represented by A. (Input)
Default: N = size (A,1).
IPATH — Integer flag. (Input)
IPATH = 1 means the system Ax = B is solved.
IPATH = 2 means the system Aᵀx = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LSLCC (A, B, X [, …])
Specific:
The specific interface names are S_LSLCC and D_LSLCC.
FORTRAN 77 Interface
Single:
CALL LSLCC (N, A, B, IPATH, X)
Double:
The double precision name is DLSLCC.
Description
Circulant matrices have the property that each row is obtained by shifting the row above it one place to the
right. Entries that are shifted off at the right re-enter at the left. For example,
        [ p1   p2   p3   p4 ]
    A = [ p4   p1   p2   p3 ]
        [ p3   p4   p1   p2 ]
        [ p2   p3   p4   p1 ]

If qk = p−k and the subscripts on p and q are interpreted modulo N, then

    (Ax)j = Σi=1,…,N pi−j xi = Σi=1,…,N qj−i xi = (q ∗ x)j

where q ∗ x is the convolution of q and x. By the convolution theorem, if q ∗ x = b, then

    q̂ ⊗ x̂ = b̂

where q̂ is the discrete Fourier transform of q as computed by the IMSL routine FFTCF and ⊗ denotes elementwise multiplication. By division,

    x̂ = b̂ ∅ q̂

where ∅ denotes elementwise division. The vector x is recovered from x̂ through the use of IMSL routine FFTCB. To solve Aᵀx = b, use the vector p instead of q in the above algorithm.
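The following self-contained sketch illustrates this algorithm on the data of the example below; it replaces FFTCF/FFTCB with a naive O(N²) discrete Fourier transform so that it depends on no library calls, and it illustrates only the mathematics above, not the LSLCC implementation.

      INTEGER    N
      PARAMETER  (N=4)
      INTEGER    I, J, K
      REAL       PI
      COMPLEX    P(N), B(N), Q(N), QHAT(N), BHAT(N), XHAT(N), X(N), W
!                                  First row of A and right-hand side b,
!                                  taken from the example below
      DATA P/(2.0,2.0), (-3.0,0.0), (1.0,4.0), (6.0,-2.0)/
      DATA B/(6.0,65.0), (-41.0,-10.0), (-8.0,-30.0), (63.0,-3.0)/
!
      PI = 4.0*ATAN(1.0)
!                                  q(k) = p(-k), subscripts modulo N;
!                                  q is the first column of A
      DO 10 K=1, N
         Q(K) = P(MOD(N-K+1,N)+1)
   10 CONTINUE
!                                  Forward transforms of q and b
      DO 30 I=1, N
         QHAT(I) = (0.0,0.0)
         BHAT(I) = (0.0,0.0)
         DO 20 J=1, N
            W = EXP(CMPLX(0.0,-2.0*PI*REAL((I-1)*(J-1))/REAL(N)))
            QHAT(I) = QHAT(I) + Q(J)*W
            BHAT(I) = BHAT(I) + B(J)*W
   20    CONTINUE
   30 CONTINUE
!                                  Elementwise division, then the inverse
!                                  transform recovers x
      DO 40 I=1, N
         XHAT(I) = BHAT(I)/QHAT(I)
   40 CONTINUE
      DO 60 I=1, N
         X(I) = (0.0,0.0)
         DO 50 J=1, N
            W = EXP(CMPLX(0.0,2.0*PI*REAL((I-1)*(J-1))/REAL(N)))
            X(I) = X(I) + XHAT(J)*W
   50    CONTINUE
         X(I) = X(I)/REAL(N)
   60 CONTINUE
!                                  X should equal (-2, -1-5i, 7+2i, 4i),
!                                  the solution printed by the example
      PRINT *, X
      END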
Comments
1.	Workspace may be explicitly provided, if desired, by use of L2LCC/DL2LCC. The reference is:
	CALL L2LCC (N, A, B, IPATH, X, ACOPY, WK)
	The additional arguments are as follows:
	ACOPY — Complex work vector of length N. If A is not needed, then A and ACOPY may be the same.
	WK — Work vector of length 6N + 15.
2.	Informational error

	Type   Code   Description
	4      2      The input matrix is singular.

3.	Because of the special structure of circulant matrices, the first row of a circulant matrix completely characterizes the matrix. Hence, only the elements A(1, 1), …, A(1, N) need to be stored.
Example
A system of four linear equations is solved. Note that only the first row of the matrix A is entered.
      USE LSLCC_INT
      USE WRCRN_INT
!                                  Declare variables
      INTEGER    N
      PARAMETER  (N=4)
      COMPLEX    A(N), B(N), X(N)
!                                  Set values for A, and B
!
!                                  A = ( 2+2i   -3+0i    1+4i    6-2i  )
!                                  B = ( 6+65i  -41-10i  -8-30i  63-3i )
!
      DATA A/(2.0,2.0), (-3.0,0.0), (1.0,4.0), (6.0,-2.0)/
      DATA B/(6.0,65.0), (-41.0,-10.0), (-8.0,-30.0), (63.0,-3.0)/
!                                  Solve AX = B
!                                  (IPATH = 1)
      CALL LSLCC (A, B, X)
!                                  Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END
Output

                            X
        1                 2                 3                 4
(-2.000, 0.000)   (-1.000,-5.000)   ( 7.000, 2.000)   ( 0.000, 4.000)
PCGRC
Solves a real symmetric definite linear system using a preconditioned conjugate gradient method with
reverse communication.
Required Arguments
IDO — Flag indicating task to be done. (Input/Output)
On the initial call IDO must be 0. If the routine returns with IDO = 1, then set Z = AP, where A is the
matrix, and call PCGRC again. If the routine returns with IDO = 2, then set Z to the solution of the system MZ = R, where M is the preconditioning matrix, and call PCGRC again. If the routine returns with
IDO = 3, then the iteration has converged and X contains the solution.
X — Array of length N containing the solution. (Input/Output)
On input, X contains the initial guess of the solution. On output, X contains the solution to the system.
P — Array of length N. (Output)
Its use is described under IDO.
R — Array of length N. (Input/Output)
On initial input, it contains the right-hand side of the linear system. On output, it contains the residual.
Z — Array of length N. (Input)
When IDO = 1, it contains AP, where A is the linear system. When IDO = 2, it contains the solution of
MZ = R, where M is the preconditioning matrix. When IDO = 0, it is ignored. Its use is described under
IDO.
Optional Arguments
N — Order of the linear system. (Input)
Default: N = size (X,1).
RELERR — Relative error desired. (Input)
Default: RELERR = 1.e-5 for single precision and 1.d-10 for double precision.
ITMAX — Maximum number of iterations allowed. (Input)
Default: ITMAX = N.
FORTRAN 90 Interface
Generic:
CALL PCGRC (IDO, X, P, R, Z [, …])
Specific:
The specific interface names are S_PCGRC and D_PCGRC.
FORTRAN 77 Interface
Single:
CALL PCGRC (IDO, N, X, P, R, Z, RELERR, ITMAX)
Double:
The double precision name is DPCGRC.
Description
Routine PCGRC solves the symmetric definite linear system Ax = b using the preconditioned conjugate gradient method. This method is described in detail by Golub and Van Loan (1983, Chapter 10), and in Hageman
and Young (1981, Chapter 7).
The preconditioning matrix, M, is a matrix that approximates A, and for which the linear system Mz = r is easy
to solve. These two properties are in conflict; balancing them is a topic of much current research.
The number of iterations needed depends on the matrix and the error tolerance RELERR. As a rough guide, ITMAX = √N is often sufficient when N ≫ 1. See the references for further information.
Let M be the preconditioning matrix, let b, p, r, x and z be vectors and let τ be the desired relative error. Then
the algorithm used is as follows.
    λ = −1
    p0 = x0
    r1 = b − Ap0
    For k = 1, …, itmax
        zk = M⁻¹rk
        If k = 1 then
            βk = 1
            pk = zk
        Else
            βk = zkᵀrk / zk−1ᵀrk−1
            pk = zk + βk pk−1
        End if
        zk = Apk
        αk = zk−1ᵀrk−1 / zkᵀpk
        xk = xk + αk pk
        rk = rk − αk zk
        If (∥zk∥2 ≤ τ(1 − λ)∥xk∥2) Then
            Recompute λ
            If (∥zk∥2 ≤ τ(1 − λ)∥xk∥2) Exit
        End if
    End loop
Here λ is an estimate of λmax(G), the largest eigenvalue of the iteration matrix G = I − M-1 A. The stopping
criterion is based on the result (Hageman and Young, 1981, pages 148−151)
    ∥xk − x∥M / ∥x∥M ≤ [1 / (1 − λmax(G))] · (∥zk∥M / ∥xk∥M)

where

    ∥x∥M² = xᵀMx

It is known that

    λmax(T1) ≤ λmax(T2) ≤ ⋯ ≤ λmax(G)

where the Tn are the symmetric, tridiagonal matrices

         [ μ1   ω2                ]
    Tn = [ ω2   μ2   ω3           ]
         [      ω3   μ3   ω4      ]
         [           ⋱    ⋱    ⋱  ]

with μ1 = 1/α1, μk = 1/αk + βk/αk−1 for k > 1, and ωk = √βk / αk−1.
The largest eigenvalue of Tk is found using the routine EVASB. Usually this eigenvalue computation is
needed for only a few of the iterations.
Comments
1.	Workspace may be explicitly provided, if desired, by use of P2GRC/DP2GRC. The reference is:
CALL P2GRC (IDO, N, X, P, R, Z, RELERR, ITMAX, TRI, WK, IWK)
The additional arguments are as follows:
TRI — Workspace of length 2 * ITMAX containing a tridiagonal matrix (in band symmetric form)
whose largest eigenvalue is approximately the same as the largest eigenvalue of the iteration
matrix. The workspace arrays TRI, WK and IWK should not be changed between the initial call
with IDO = 0 and PCGRC/DPCGRC returning with IDO = 3.
WK — Workspace of length 5 * ITMAX.
IWK — Workspace of length ITMAX.
2.	Informational errors

	Type   Code   Description
	4      1      The preconditioning matrix is singular.
	4      2      The preconditioning matrix is not definite.
	4      3      The linear system is not definite.
	4      4      The linear system is singular.
	4      5      No convergence after ITMAX iterations.
Examples
Example 1
In this example, the solution to a linear system is found. The coefficient matrix A is stored as a full matrix.
The preconditioning matrix is the diagonal of A. This is called the Jacobi preconditioner. It is also used by the
IMSL routine JCGRC.
      USE PCGRC_INT
      USE MURRV_INT
      USE WRRRN_INT
      USE SCOPY_INT
      INTEGER    LDA, N
      PARAMETER  (N=3, LDA=N)
!
      INTEGER    IDO, ITMAX, J
      REAL       A(LDA,N), B(N), P(N), R(N), X(N), Z(N)
!
!                                  A = (  1  -3   2 )
!                                      ( -3  10  -5 )
!                                      (  2  -5   6 )
!
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!
!                                  B = ( 27.0  -78.0  64.0 )
!
      DATA B/27.0, -78.0, 64.0/
!                                  Set R to right side
      CALL SCOPY (N, B, 1, R, 1)
!                                  Initial guess for X is B
      CALL SCOPY (N, B, 1, X, 1)
!
      ITMAX = 100
      IDO   = 0
   10 CALL PCGRC (IDO, X, P, R, Z, ITMAX=ITMAX)
      IF (IDO .EQ. 1) THEN
!                                  Set z = Ap
         CALL MURRV (A, P, Z)
         GO TO 10
      ELSE IF (IDO .EQ. 2) THEN
!                                  Use diagonal of A as the
!                                  preconditioning matrix M
!                                  and set z = inv(M)*r
         DO 20 J=1, N
            Z(J) = R(J)/A(J,J)
   20    CONTINUE
         GO TO 10
      END IF
!                                  Print the solution
      CALL WRRRN ('Solution', X)
!
      END
Output

Solution
1    1.001
2   -4.000
3    7.000
Example 2
In this example, a more complicated preconditioner is used to find the solution of a linear system which
occurs in a finite-difference solution of Laplace’s equation on a 4 × 4 grid. The matrix is
        [  4  -1   0  -1   0   0   0   0   0 ]
        [ -1   4  -1   0  -1   0   0   0   0 ]
        [  0  -1   4  -1   0  -1   0   0   0 ]
        [ -1   0  -1   4  -1   0  -1   0   0 ]
    A = [  0  -1   0  -1   4  -1   0  -1   0 ]
        [  0   0  -1   0  -1   4  -1   0  -1 ]
        [  0   0   0  -1   0  -1   4  -1   0 ]
        [  0   0   0   0  -1   0  -1   4  -1 ]
        [  0   0   0   0   0  -1   0  -1   4 ]

The preconditioning matrix M is the symmetric tridiagonal part of A,

        [  4  -1                             ]
        [ -1   4  -1                         ]
        [     -1   4  -1                     ]
        [         -1   4  -1                 ]
    M = [             -1   4  -1             ]
        [                 -1   4  -1         ]
        [                     -1   4  -1     ]
        [                         -1   4  -1 ]
        [                             -1   4 ]

Note that M, called PRECND in the program, is factored once.
      USE IMSL_LIBRARIES
      INTEGER    LDA, LDPRE, N, NCODA, NCOPRE
      PARAMETER  (N=9, NCODA=3, NCOPRE=1, LDA=2*NCODA+1,&
                 LDPRE=NCOPRE+1)
!
      INTEGER    IDO, ITMAX
      REAL       A(LDA,N), P(N), PRECND(LDPRE,N), PREFAC(LDPRE,N),&
                 R(N), RCOND, RELERR, X(N), Z(N)
!                                  Set A in band form
      DATA A/3*0.0, 4.0, -1.0, 0.0, -1.0, 2*0.0, -1.0, 4.0, -1.0, 0.0,&
          -1.0, 2*0.0, -1.0, 4.0, -1.0, 0.0, -1.0, -1.0, 0.0, -1.0,&
          4.0, -1.0, 0.0, -1.0, -1.0, 0.0, -1.0, 4.0, -1.0, 0.0,&
          -1.0, -1.0, 0.0, -1.0, 4.0, -1.0, 0.0, -1.0, -1.0, 0.0,&
          -1.0, 4.0, -1.0, 2*0.0, -1.0, 0.0, -1.0, 4.0, -1.0, 2*0.0,&
          -1.0, 0.0, -1.0, 4.0, 3*0.0/
!                                  Set PRECND in band symmetric form
      DATA PRECND/0.0, 4.0, -1.0, 4.0, -1.0, 4.0, -1.0, 4.0, -1.0, 4.0,&
          -1.0, 4.0, -1.0, 4.0, -1.0, 4.0, -1.0, 4.0/
!                                  Right side is (1, ..., 1)
      R = 1.0E0
!                                  Initial guess for X is 0
      X = 0.0E0
!                                  Factor the preconditioning matrix
      CALL LFCQS (PRECND, NCOPRE, PREFAC, RCOND)
!
      ITMAX  = 100
      RELERR = 1.0E-4
      IDO    = 0
   10 CALL PCGRC (IDO, X, P, R, Z, RELERR=RELERR, ITMAX=ITMAX)
      IF (IDO .EQ. 1) THEN
!                                  Set z = Ap
         CALL MURBV (A, NCODA, NCODA, P, Z)
         GO TO 10
      ELSE IF (IDO .EQ. 2) THEN
!                                  Solve PRECND*z = r for z
         CALL LSLQS (PREFAC, NCOPRE, R, Z)
         GO TO 10
      END IF
!                                  Print the solution
      CALL WRRRN ('Solution', X)
!
      END
Output

Solution
1   0.955
2   1.241
3   1.349
4   1.578
5   1.660
6   1.578
7   1.349
8   1.241
9   0.955
JCGRC
Solves a real symmetric definite linear system using the Jacobi-preconditioned conjugate gradient method
with reverse communication.
Required Arguments
IDO — Flag indicating task to be done. (Input/Output)
On the initial call IDO must be 0. If the routine returns with IDO = 1, then set
Z = A * P, where A is the matrix, and call JCGRC again. If the routine returns with IDO = 2, then the iteration has converged and X contains the solution.
DIAGNL — Vector of length N containing the diagonal of the matrix. (Input)
Its elements must be all strictly positive or all strictly negative.
X — Array of length N containing the solution. (Input/Output)
On input, X contains the initial guess of the solution. On output, X contains the solution to the system.
P — Array of length N. (Output)
Its use is described under IDO.
R — Array of length N. (Input/Output)
On initial input, it contains the right-hand side of the linear system. On output, it contains the residual.
Z — Array of length N. (Input)
When IDO = 1, it contains AP, where A is the linear system. When IDO = 0, it is ignored. Its use is
described under IDO.
Optional Arguments
N — Order of the linear system. (Input)
Default: N = size (X,1).
RELERR — Relative error desired. (Input)
Default: RELERR = 1.e-5 for single precision and 1.d-10 for double precision.
ITMAX — Maximum number of iterations allowed. (Input)
Default: ITMAX = 100.
FORTRAN 90 Interface
Generic:
CALL JCGRC (IDO, DIAGNL, X, P, R, Z [, …])
Specific:
The specific interface names are S_JCGRC and D_JCGRC.
FORTRAN 77 Interface
Single:
CALL JCGRC (IDO, N, DIAGNL, X, P, R, Z, RELERR, ITMAX)
Double:
The double precision name is DJCGRC.
Description
Routine JCGRC solves the symmetric definite linear system Ax = b using the Jacobi conjugate gradient
method. This method is described in detail by Golub and Van Loan (1983, Chapter 10), and in Hageman and
Young (1981, Chapter 7).
This routine is a special case of the routine PCGRC, with the diagonal of the matrix A used as the preconditioning matrix. For details of the algorithm see PCGRC.
The number of iterations needed depends on the matrix and the error tolerance RELERR. As a rough guide,
ITMAX = N is often sufficient when N ≫ 1. See the references for further information.
Comments
1.	Workspace may be explicitly provided, if desired, by use of J2GRC/DJ2GRC. The reference is:
CALL J2GRC (IDO, N, DIAGNL, X, P, R, Z, RELERR, ITMAX, TRI, WK, IWK)
The additional arguments are as follows:
TRI — Workspace of length 2 * ITMAX containing a tridiagonal matrix (in band symmetric form)
whose largest eigenvalue is approximately the same as the largest eigenvalue of the iteration
matrix. The workspace arrays TRI, WK and IWK should not be changed between the initial call
with IDO = 0 and JCGRC/DJCGRC returning with IDO = 2.
WK — Workspace of length 5 * ITMAX.
IWK — Workspace of length ITMAX.
2.	Informational errors

	Type   Code   Description
	4      1      The diagonal contains a zero.
	4      2      The diagonal elements have different signs.
	4      3      No convergence after ITMAX iterations.
	4      4      The linear system is not definite.
	4      5      The linear system is singular.
Example
In this example, the solution to a linear system is found. The coefficient matrix A is stored as a full matrix.
      USE IMSL_LIBRARIES
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
!
      INTEGER    IDO, ITMAX
      REAL       A(LDA,N), B(N), DIAGNL(N), P(N), R(N), X(N), &
                 Z(N)
!
!                                  A = (  1  -3   2 )
!                                      ( -3  10  -5 )
!                                      (  2  -5   6 )
!
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!
!                                  B = ( 27.0  -78.0  64.0 )
!
      DATA B/27.0, -78.0, 64.0/
!                                  Set R to right side
      CALL SCOPY (N, B, 1, R, 1)
!                                  Initial guess for X is B
      CALL SCOPY (N, B, 1, X, 1)
!                                  Copy diagonal of A to DIAGNL
      CALL SCOPY (N, A(:, 1), LDA+1, DIAGNL, 1)
!                                  Set parameters
      ITMAX = 100
      IDO   = 0
   10 CALL JCGRC (IDO, DIAGNL, X, P, R, Z, ITMAX=ITMAX)
      IF (IDO .EQ. 1) THEN
!                                  Set z = Ap
         CALL MURRV (A, P, Z)
         GO TO 10
      END IF
!                                  Print the solution
      CALL WRRRN ('Solution', X)
!
      END
Output

Solution
1    1.001
2   -4.000
3    7.000
GMRES
Uses the Generalized Minimal Residual Method with reverse communication to generate an approximate
solution of Ax = b.
Required Arguments
IDO— Flag indicating task to be done. (Input/Output)
On the initial call IDO must be 0. If the routine returns with IDO = 1, then set Z = AP, where A is the
matrix, and call GMRES again. If the routine returns with IDO = 2, then set Z to the solution of the system MZ = P, where M is the preconditioning matrix, and call GMRES again. If the routine returns with
IDO = 3, set Z = AM-1P, and call GMRES again. If the routine returns with IDO = 4, the iteration has converged, and X contains the approximate solution to the linear system.
X — Array of length N containing an approximate solution. (Input/Output)
On input, X contains an initial guess of the solution. On output, X contains the approximate solution.
P — Array of length N. (Output)
Its use is described under IDO.
R — Array of length N. (Input/Output)
On initial input, it contains the right-hand side of the linear system. On output, it contains the residual,
b − Ax.
Z — Array of length N. (Input)
When IDO = 1, it contains AP, where A is the coefficient matrix. When IDO = 2, it contains M-1P. When
IDO = 3, it contains AM-1P. When IDO = 0, it is ignored.
TOL — Stopping tolerance. (Input/Output)
The algorithm attempts to generate a solution x such that ∣b − Ax∣ ≤ TOL*∣b∣. On output, TOL contains the final residual norm.
Optional Arguments
N — Order of the linear system. (Input)
Default: N = size (X,1).
FORTRAN 90 Interface
Generic:
CALL GMRES (IDO, X, P, R, Z, TOL [, …])
Specific:
The specific interface names are S_GMRES and D_GMRES.
FORTRAN 77 Interface
Single:
CALL GMRES (IDO, N, X, P, R, Z, TOL)
Double:
The double precision name is DGMRES.
Description
The routine GMRES implements restarted GMRES with reverse communication to generate an approximate
solution to Ax = b. It is based on GMRESD by Homer Walker.
There are four distinct GMRES implementations, selectable through the parameter vector INFO. The first
Gram-Schmidt implementation, INFO(1) = 1, is essentially the original algorithm by Saad and Schultz
(1986). The second Gram-Schmidt implementation, developed by Homer Walker and Lou Zhou, is simpler
than the first implementation. The least squares problem is constructed in upper-triangular form and the
residual vector updating at the end of a GMRES cycle is cheaper. The first Householder implementation is
algorithm 2.2 of Walker (1988), but with more efficient correction accumulation at the end of each GMRES
cycle. The second Householder implementation is algorithm 3.1 of Walker (1988). The products of Householder transformations are expanded as sums, allowing most work to be formulated as large-scale matrix-vector operations. Although BLAS are used wherever possible, extensive use of Level 2 BLAS in the second
Householder implementation may yield a performance advantage on certain computing environments.
The Gram-Schmidt implementations are less expensive than the Householder, the latter requiring about
twice as much arithmetic beyond the coefficient matrix/vector products. However, the Householder implementations may be more reliable near the limits of residual reduction. See Walker (1988) for details. Issues
such as the cost of coefficient matrix/vector products, availability of effective preconditioners, and features
of particular computing environments may serve to mitigate the extra expense of the Householder
implementations.
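Schematically, a reverse-communication driver has the form sketched below. This is a hedged outline only: MY_MATVEC and MY_PRECOND_SOLVE are hypothetical user routines standing in for the operations requested through IDO, and IDO = 2 or 3 occurs only when right preconditioning is requested through G2RES with INFO(4) = 1 (with the generic call shown, only IDO = 1 and 4 occur).

      IDO = 0
   10 CALL GMRES (IDO, X, P, R, Z, TOL)
      IF (IDO .EQ. 1) THEN
!                                  z = A*p
         CALL MY_MATVEC (P, Z)
         GO TO 10
      ELSE IF (IDO .EQ. 2) THEN
!                                  z = inv(M)*p
         CALL MY_PRECOND_SOLVE (P, Z)
         GO TO 10
      ELSE IF (IDO .EQ. 3) THEN
!                                  z = A*inv(M)*p
         CALL MY_PRECOND_SOLVE (P, Z)
         P = Z
         CALL MY_MATVEC (P, Z)
         GO TO 10
      END IF
!                                  IDO = 4: X contains the approximate
!                                  solution and TOL the residual norm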
Comments
1.	Workspace may be explicitly provided, if desired, by use of G2RES/DG2RES. The reference is:
CALL G2RES (IDO, N, X, P, R, Z, TOL, INFO, USRNPR, USRNRM, WORK)
The additional arguments are as follows:
INFO — Integer vector of length 10 used to change parameters of GMRES. (Input/Output).
For any components INFO(1) ... INFO(7) with value zero on input, the default value is used.
INFO(1) = IMP, the flag indicating the desired implementation.
	IMP   Action
	1     first Gram-Schmidt implementation
	2     second Gram-Schmidt implementation
	3     first Householder implementation
	4     second Householder implementation
	Default: IMP = 1
INFO(2) = KDMAX, the maximum Krylov subspace dimension, i.e., the maximum allowable number of GMRES iterations before restarting. It must satisfy 1 ≤ KDMAX ≤ N.
Default: KDMAX = min(N, 20)
INFO(3) = ITMAX, the maximum number of GMRES iterations allowed.
Default: ITMAX = 1000
INFO(4) = IRP, the flag indicating whether right preconditioning is used.
If IRP = 0, no right preconditioning is performed. If IRP = 1, right preconditioning is performed. If IRP = 0, then IDO = 2 or 3 will not occur.
Default: IRP = 0
INFO(5) = IRESUP, the flag that indicates the desired residual vector updating prior to restarting
or on termination.
	IRESUP   Action
	1        update by linear combination, restarting only
	2        update by linear combination, restarting and termination
	3        update by direct evaluation, restarting only
	4        update by direct evaluation, restarting and termination
Updating by direct evaluation requires an otherwise unnecessary matrix-vector product. The
alternative is to update by forming a linear combination of various available vectors. This may
or may not be cheaper and may be less reliable if the residual vector has been greatly reduced.
If IRESUP = 2 or 4, then the residual vector is returned in WORK(1), ..., WORK(N). This is useful in some applications but costs another unnecessary residual update. It is recommended
that IRESUP = 1 or 2 be used, unless matrix-vector products are inexpensive or great residual
reduction is required. In this case use IRESUP = 3 or 4. The meaning of “inexpensive” varies
with IMP as follows:
	IMP   ≤
	1     (KDMAX + 1)*N flops
	2     N flops
	3     (2*KDMAX + 1)*N flops
	4     (2*KDMAX + 1)*N flops
“Great residual reduction” means that TOL is only a few orders of magnitude larger than
machine epsilon.
Default: IRESUP = 1
INFO(6) = flag for indicating the inner product and norm used in the Gram-Schmidt implementations. If INFO(6) = 0, sdot and snrm2, from BLAS, are used. If INFO(6) = 1, the user must
provide the routines, as specified under arguments USRNPR and USRNRM.
Default: INFO(6) = 0
INFO(7) = IPRINT, the print flag. If IPRINT = 0, no printing is performed. If
IPRINT = 1, print the iteration numbers and residuals.
Default: IPRINT = 0
INFO(8) = the total number of GMRES iterations on output.
INFO(9) = the total number of matrix-vector products in GMRES on output.
INFO(10) = the total number of right preconditioner solves in GMRES on output if IRP = 1.
USRNPR — User-supplied FUNCTION to use as the inner product in the Gram-Schmidt implementation, if INFO(6) = 1. If INFO(6) = 0, the dummy function G8RES/DG8RES may be used. The
usage is
REAL FUNCTION USRNPR (N, SX, INCX, SY, INCY)
N — Length of vectors X and Y. (Input)
SX — Real vector of length MAX(N*IABS(INCX),1). (Input)
INCX — Displacement between elements of SX. (Input)
X(I) is defined to be SX(1+(I-1)*INCX) if INCX is greater than 0, or
SX(1+(I-N)*INCX) if INCX is less than 0.
SY — Real vector of length MAX(N*IABS(INCY),1). (Input)
INCY — Displacement between elements of SY. (Input)
Y(I) is defined to be SY(1+(I-1)*INCY) if INCY is greater than 0, or SY(1+(I-N)*INCY) if
INCY is less than zero.
USRNPR must be declared EXTERNAL in the calling program.
USRNRM — User-supplied FUNCTION to use as the norm ∥X∥ in the Gram-Schmidt implementation,
if INFO(6) = 1. If INFO(6) = 0, the dummy function G9RES/DG9RES may be used. The usage is
REAL FUNCTION USRNRM (N, SX, INCX)
N — Length of vectors X and Y. (Input)
SX — Real vector of length MAX(N*IABS(INCX),1). (Input)
INCX — Displacement between elements of SX. (Input)
X(I) is defined to be SX(1+(I-1)*INCX) if INCX is greater than 0, or SX(1+(I-N)*INCX) if
INCX is less than 0.
USRNRM must be declared EXTERNAL in the calling program.
WORK — Work array whose length is dependent on the chosen implementation.
	IMP   Length of WORK
	1     N*(KDMAX + 2) + KDMAX**2 + 3*KDMAX + 2
	2     N*(KDMAX + 2) + KDMAX**2 + 2*KDMAX + 1
	3     N*(KDMAX + 2) + 3*KDMAX + 2
	4     N*(KDMAX + 2) + KDMAX**2 + 2*KDMAX + 2
Examples
Example 1
This is a simple example of GMRES usage. A solution to a small linear system is found. The coefficient matrix
A is stored as a full matrix, and no preconditioning is used. Typically, preconditioning is required to achieve
convergence in a reasonable number of iterations.
      USE IMSL_LIBRARIES
!                                  Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=3, LDA=N)
!                                  Specifications for local variables
      INTEGER    IDO, NOUT
      REAL       P(N), TOL, X(N), Z(N)
      REAL       A(LDA,N), R(N)
      SAVE       A, R
!                                  Specifications for intrinsics
      INTRINSIC  SQRT
      REAL       SQRT
!
!                                  A = ( 33.0  16.0  72.0 )
!                                      (-24.0 -10.0 -57.0 )
!                                      ( 18.0 -11.0   7.0 )
!
!                                  B = (129.0 -96.0   8.5 )
!
      DATA A/33.0, -24.0, 18.0, 16.0, -10.0, -11.0, 72.0, -57.0, 7.0/
      DATA R/129.0, -96.0, 8.5/
!
      CALL UMACH (2, NOUT)
!                                  Initial guess = (0 ... 0)
      X = 0.0E0
!                                  Set stopping tolerance to
!                                  square root of machine epsilon
      TOL = AMACH(4)
      TOL = SQRT(TOL)
      IDO = 0
   10 CONTINUE
      CALL GMRES (IDO, X, P, R, Z, TOL)
      IF (IDO .EQ. 1) THEN
!                                  Set z = A*p
         CALL MURRV (A, P, Z)
         GO TO 10
      END IF
!
      CALL WRRRN ('Solution', X, 1, N, 1)
      WRITE (NOUT,'(A11, E15.5)') 'Residual = ', TOL
      END
Output

   Solution
    1       2       3
1.000   1.500   1.000
Residual =     0.29746E-05
Example 2
This example solves a linear system with a coefficient matrix stored in coordinate form, the same problem as
in the document example for LSLXG. Jacobi preconditioning is used, i.e. the preconditioning matrix M is the
diagonal matrix with Mii = Aii, for i = 1, …, n.
      USE IMSL_LIBRARIES
      INTEGER    N, NZ
      PARAMETER  (N=6, NZ=15)
!                                  Specifications for local variables
      INTEGER    IDO, INFO(10), NOUT
      REAL       P(N), TOL, WORK(1000), X(N), Z(N)
      REAL       DIAGIN(N), R(N)
!                                  Specifications for intrinsics
      INTRINSIC  SQRT
      REAL       SQRT
!                                  Specifications for subroutines
      EXTERNAL   AMULTP
!                                  Specifications for functions
      EXTERNAL   G8RES, G9RES
!
      DATA DIAGIN/0.1, 0.1, 0.0666667, 0.1, 1.0, 0.16666667/
      DATA R/10.0, 7.0, 45.0, 33.0, -34.0, 31.0/
!
      CALL UMACH (2, NOUT)
!                                  Initial guess = (1 ... 1)
      X = 1.0E0
!                                  Set up the options vector INFO
!                                  to use preconditioning
      INFO = 0
      INFO(4) = 1
!                                  Set stopping tolerance to
!                                  square root of machine epsilon
      TOL = AMACH(4)
      TOL = SQRT(TOL)
      IDO = 0
   10 CONTINUE
      CALL G2RES (IDO, N, X, P, R, Z, TOL, INFO, G8RES, G9RES, WORK)
      IF (IDO .EQ. 1) THEN
!                                  Set z = A*p
         CALL AMULTP (P, Z)
         GO TO 10
      ELSE IF (IDO .EQ. 2) THEN
!                                  Set z = inv(M)*p
!                                  The diagonal of inv(M) is stored
!                                  in DIAGIN
         CALL SHPROD (N, DIAGIN, 1, P, 1, Z, 1)
         GO TO 10
      ELSE IF (IDO .EQ. 3) THEN
!                                  Set z = A*inv(M)*p
         CALL SHPROD (N, DIAGIN, 1, P, 1, Z, 1)
         P = Z
         CALL AMULTP (P, Z)
         GO TO 10
      END IF
!
      CALL WRRRN ('Solution', X)
      WRITE (NOUT,'(A11, E15.5)') 'Residual = ', TOL
      END
!
      SUBROUTINE AMULTP (P, Z)
      USE IMSL_LIBRARIES
      INTEGER    NZ
      PARAMETER  (NZ=15)
!                                  SPECIFICATIONS FOR ARGUMENTS
      REAL       P(*), Z(*)
!                                  SPECIFICATIONS FOR PARAMETERS
      INTEGER    N
      PARAMETER  (N=6)
!                                  SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    I
      INTEGER    IROW(NZ), JCOL(NZ)
      REAL       A(NZ)
      SAVE       A, IROW, JCOL
!                                  Define the matrix A
      DATA A/6.0, 10.0, 15.0, -3.0, 10.0, -1.0, -1.0, -3.0, -5.0, 1.0, &
             10.0, -1.0, -2.0, -1.0, -2.0/
      DATA IROW/6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4/
      DATA JCOL/6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1/
!
      CALL SSET(N, 0.0, Z, 1)
!                                  Accumulate the product A*p in z
      DO 10 I=1, NZ
         Z(IROW(I)) = Z(IROW(I)) + A(I)*P(JCOL(I))
   10 CONTINUE
      RETURN
      END
Output

Solution
1   1.000
2   2.000
3   3.000
4   4.000
5   5.000
6   6.000
Residual =     0.25882E-05
Example 3
The coefficient matrix in this example corresponds to the five-point discretization of the 2-d Poisson equation
with the Dirichlet boundary condition. Assuming the natural ordering of the unknowns, and moving all
boundary terms to the right hand side, we obtain the block tridiagonal matrix
        [  T  -I                ]
        [ -I   T  -I            ]
    A = [      ⋱   ⋱   ⋱        ]
        [          -I   T  -I   ]
        [              -I   T   ]

where

        [  4  -1                ]
        [ -1   4  -1            ]
    T = [      ⋱   ⋱   ⋱        ]
        [          -1   4  -1   ]
        [              -1   4   ]

and I is the identity matrix. Discretizing on a k × k grid implies that T and I are both k × k, and thus the coefficient matrix A is k² × k².
The problem is solved twice, with discretization on a 50 × 50 grid. During both solutions, use the second
Householder implementation to take advantage of the large scale matrix/vector operations done in Level 2
BLAS. Also choose to update the residual vector by direct evaluation since the small tolerance will require
large residual reduction.
The first solution uses no preconditioning. For the second solution, we construct a block diagonal preconditioning matrix
        [ T            ]
    M = [     ⋱        ]
        [            T ]
M is factored once, and these factors are used in the forward solves and back substitutions necessary when
GMRES returns with IDO = 2 or 3.
Timings are obtained for both solutions, and the ratio of the time for the solution with no preconditioning to
the time for the solution with preconditioning is printed. Though the exact results are machine dependent,
we see that the savings realized by faster convergence from using a preconditioner exceed the cost of factoring M and performing repeated forward and back solves.
      USE IMSL_LIBRARIES
      INTEGER    K, N
      PARAMETER  (K=50, N=K*K)
!                                  Specifications for local variables
      INTEGER    IDO, INFO(10), IR(20), IS(20), NOUT
      REAL       A(2*N), B(2*N), C(2*N), G8RES, G9RES, P(2*N), R(N), &
                 TNOPRE, TOL, TPRE, U(2*N), WORK(100000), X(N), &
                 Y(2*N), Z(2*N)
!                                  Specifications for subroutines
      EXTERNAL   AMULTP, G8RES, G9RES
!                                  Specifications for functions
      CALL UMACH (2, NOUT)
!                                  Right hand side and initial guess
!                                  to (1 ... 1)
      R = 1.0E0
      X = 1.0E0
!                                  Use the 2nd Householder
!                                  implementation and update the
!                                  residual by direct evaluation
      INFO = 0
      INFO(1) = 4
      INFO(5) = 3
      TOL     = AMACH(4)
      TOL     = 100.0*TOL
      IDO     = 0
!                                  Time the solution with no
!                                  preconditioning
      TNOPRE = CPSEC()
   10 CONTINUE
      CALL G2RES (IDO, N, X, P, R, Z, TOL, INFO, G8RES, G9RES, WORK)
      IF (IDO .EQ. 1) THEN
!                                  Set z = A*p
         CALL AMULTP (K, P, Z)
         GO TO 10
      END IF
      TNOPRE = CPSEC() - TNOPRE
!
      WRITE (NOUT,'(A32, I4)') 'Iterations, no preconditioner = ', &
            INFO(8)
!                                  Solve again using the diagonal blocks
!                                  of A as the preconditioning matrix M
      R = 1.0E0
      X = 1.0E0
!                                  Define M
      CALL SSET (N-1, -1.0, B, 1)
      CALL SSET (N-1, -1.0, C, 1)
      CALL SSET (N, 4.0, A, 1)
      INFO(4) = 1
      TOL     = AMACH(4)
      TOL     = 100.0*TOL
      IDO     = 0
      TPRE    = CPSEC()
!                                  Compute the LDU factorization of M
      CALL LSLCR (C, A, B, Y, U, IR, IS, IJOB=6)
   20 CONTINUE
      CALL G2RES (IDO, N, X, P, R, Z, TOL, INFO, G8RES, G9RES, WORK)
      IF (IDO .EQ. 1) THEN
!                                  Set z = A*p
         CALL AMULTP (K, P, Z)
         GO TO 20
      ELSE IF (IDO .EQ. 2) THEN
!                                  Set z = inv(M)*p
         CALL SCOPY (N, P, 1, Z, 1)
         CALL LSLCR (C, A, B, Z, U, IR, IS, IJOB=5)
         GO TO 20
      ELSE IF (IDO .EQ. 3) THEN
!                                  Set z = A*inv(M)*p
         CALL LSLCR (C, A, B, P, U, IR, IS, IJOB=5)
         CALL AMULTP (K, P, Z)
         GO TO 20
      END IF
      TPRE = CPSEC() - TPRE
      WRITE (NOUT,'(A35, I4)') 'Iterations, with preconditioning = ',&
            INFO(8)
      WRITE (NOUT,'(A45, F10.5)') '(Precondition time)/(No '// &
            'precondition time) = ', TPRE/TNOPRE
!
      END
!
      SUBROUTINE AMULTP (K, P, Z)
      USE IMSL_LIBRARIES
!                                  Specifications for arguments
      INTEGER    K
      REAL       P(*), Z(*)
!                                  Specifications for local variables
      INTEGER    I, N
!
      N = K*K
!                                  Multiply by diagonal blocks
      CALL SVCAL (N, 4.0, P, 1, Z, 1)
      CALL SAXPY (N-1, -1.0, P(2:(N)), 1, Z, 1)
      CALL SAXPY (N-1, -1.0, P, 1, Z(2:(N)), 1)
!                                  Correct for terms not properly in
!                                  block diagonal
      DO 10 I=K, N - K, K
         Z(I)   = Z(I) + P(I+1)
         Z(I+1) = Z(I+1) + P(I)
   10 CONTINUE
!                                  Do the super and subdiagonal blocks,
!                                  the -I's
      CALL SAXPY (N-K, -1.0, P((K+1):(N)), 1, Z, 1)
      CALL SAXPY (N-K, -1.0, P, 1, Z((K+1):(N)), 1)
!
      RETURN
      END
Output

Iterations, no preconditioner = 329
Iterations, with preconditioning = 192
(Precondition time)/(No precondition time) =    0.66278
ARPACK_SVD
Computes some singular values and left and right singular vectors of a real rectangular M × N matrix A, where A = USVᵀ. There is no restriction on the relative sizes of M and N. The user supplies matrix-vector products y = Ax and y = Aᵀx for the iterative method. This routine calls ARPACK_SYMMETRIC. Descriptions for both ARPACK_SVD and ARPACK_SYMMETRIC are found in Chapter 2, “Eigensystem Analysis”.
LSQRR
Solves a linear least-squares problem without iterative refinement.
Required Arguments
A — NRA by NCA matrix containing the coefficient matrix of the least-squares system to be solved. (Input)
B — Vector of length NRA containing the right-hand side of the least-squares system. (Input)
X — Vector of length NCA containing the solution vector with components corresponding to the columns
not used set to zero. (Output)
RES — Vector of length NRA containing the residual vector B - A * X. (Output)
KBASIS — Scalar containing the number of columns used in the solution.
Optional Arguments
NRA — Number of rows of A. (Input)
Default: NRA = size (A,1).
NCA — Number of columns of A. (Input)
Default: NCA = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
TOL — Scalar containing the nonnegative tolerance used to determine the subset of columns of A to be
included in the solution. (Input)
If TOL is zero, a full complement of min(NRA, NCA) columns is used. See Comments.
Default: TOL = 0.0
FORTRAN 90 Interface
Generic:
CALL LSQRR (A, B, X, RES, KBASIS [, …])
Specific:
The specific interface names are S_LSQRR and D_LSQRR.
FORTRAN 77 Interface
Single:
CALL LSQRR (NRA, NCA, A, LDA, B, TOL, X, RES, KBASIS)
Double:
The double precision name is DLSQRR.
ScaLAPACK Interface
Generic:
CALL LSQRR (A0, B0, X0, RES0, KBASIS [ , …])
Specific:
The specific interface names are S_LSQRR and D_LSQRR.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Routine LSQRR solves the linear least-squares problem. The underlying code is based on either LINPACK,
LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a
detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of
this manual. The routine LQRRR is first used to compute the QR decomposition of A. Pivoting, with all rows
free, is used. Column k is in the basis if

    ∣Rkk∣ ≥ τ∣R11∣

with τ = TOL. The truncated least-squares problem is then solved using IMSL routine LQRSL. Finally, the
components in the solution, with the same index as columns that are not in the basis, are set to zero; and then,
the permutation determined by the pivoting in IMSL routine LQRRR is applied.
Comments
1.	Workspace may be explicitly provided, if desired, by use of L2QRR/DL2QRR. The reference is:
CALL L2QRR (NRA, NCA, A, LDA, B, TOL, X, RES, KBASIS, QR, QRAUX, IPVT, WORK)
The additional arguments are as follows:
QR — Work vector of length NRA * NCA representing an NRA by NCA matrix that contains information from the QR factorization of A. The upper trapezoidal part of QR contains the upper
trapezoidal part of R with its diagonal elements ordered in decreasing magnitude. The strict
lower trapezoidal part of QR contains information to recover the orthogonal matrix Q of the
factorization. If A is not needed, QR can share the same storage locations as A.
QRAUX — Work vector of length NCA containing information about the orthogonal factor of the
QR factorization of A.
IPVT — Integer work vector of length NCA containing the pivoting information for the QR factorization of A.
WORK — Work vector of length 2 * NCA - 1.
2.	Routine LSQRR calculates the QR decomposition with pivoting of a matrix A and tests the diagonal elements against a user-supplied tolerance TOL. The first integer KBASIS = k is determined for which
∣rk+1,k+1∣ ≤ TOL * ∣r11∣
In effect, this condition implies that a set of columns with a condition number approximately bounded
by 1.0/TOL is used. Then, LQRSL performs a truncated fit of the first KBASIS columns of the permuted
A to an input vector B. The coefficient of this fit is unscrambled to correspond to the original columns
of A, and the coefficients corresponding to unused columns are set to zero. It may be helpful to scale
the rows and columns of A so that the error estimates in the elements of the scaled matrix are roughly
equal to TOL.
3.	Integer Options with Chapter 11 Options Manager

	16	This option uses four values to solve memory bank conflict (access inefficiency) problems. In routine L2QRR the leading dimension of QR is increased by IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSQRR. Additional memory allocation for QR and option value restoration are done automatically in LSQRR. Users directly calling L2QRR can allocate additional space for QR and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies. There is no requirement that users change existing applications that use LSQRR or L2QRR. Default values for the option are IVAL(*) = 1, 16, 0, 1.

	17	This option has two values that determine if the L1 condition number is to be computed. Routine LSQRR temporarily replaces IVAL(2) by IVAL(1). The routine L2CRG computes the condition number if IVAL(2) = 2. Otherwise L2CRG skips this computation. LSQRR restores the option. Default values for the option are IVAL(*) = 1, 2.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the coefficient matrix of the least squares system to be solved. (Input)
B0 — Local vector of length MXLDA containing the local portions of the distributed vector B. B contains
the right-hand side of the least squares system. (Input)
X0 — Local vector of length MXLDX containing the local portions of the distributed vector X. X contains
the solution vector with components corresponding to the columns not used set to zero. (Output)
RES0 — Local vector of length MXLDA containing the local portions of the distributed vector RES. RES
contains the residual vector B – A * X. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA, MXLDX, and MXCOL can be obtained through a call to
SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the
ScaLAPACK Example below.
Examples
Example 1
Consider the problem of finding the coefficients ci in

    f(x) = c0 + c1x + c2x²

given data at x = 1, 2, 3 and 4, using the method of least squares. Each row of the matrix A contains the values of 1, x and x² at a data point. The vector b contains the data, chosen such that c0 ≈ 1, c1 ≈ 2 and c2 ≈ 0. The routine LSQRR solves this least-squares problem.
      USE LSQRR_INT
      USE UMACH_INT
      USE WRRRN_INT
!                                  Declare variables
      PARAMETER  (NRA=4, NCA=3, LDA=NRA)
      REAL       A(LDA,NCA), B(NRA), X(NCA), RES(NRA), TOL
!
!                                  Set values for A
!
!                                  A = ( 1   2    4 )
!                                      ( 1   4   16 )
!                                      ( 1   6   36 )
!                                      ( 1   8   64 )
!
      DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!
!                                  Set values for B
!
      DATA B/ 4.999, 9.001, 12.999, 17.001 /
!
!                                  Solve the least squares problem
      TOL = 1.0E-4
      CALL LSQRR (A, B, X, RES, KBASIS, TOL=TOL)
!                                  Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'KBASIS = ', KBASIS
      CALL WRRRN ('X', X, 1, NCA, 1)
      CALL WRRRN ('RES', RES, 1, NRA, 1)
!
      END
Output

KBASIS =    3

       X
    1       2       3
0.999   2.000   0.000

                    RES
        1           2           3           4
-0.000400    0.001200   -0.001200    0.000400
ScaLAPACK Example
The previous example is repeated here as a distributed computing example. Consider the problem of finding the coefficients ci in

    f(x) = c0 + c1x + c2x²

given data at x = 1, 2, 3 and 4, using the method of least squares. Each row of the matrix A contains the values of 1, x and x² at a data point. The vector b contains the data, chosen such that c0 ≈ 1, c1 ≈ 2 and c2 ≈ 0. The routine LSQRR solves this least-squares problem. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 19, “Utilities”) used to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSQRR_INT
      USE UMACH_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    LDA, NRA, NCA, DESCA(9), DESCX(9), DESCR(9)
      INTEGER    INFO, KBASIS, MXCOL, MXLDA, MXCOLX, MXLDX, NOUT
      REAL       TOL
      REAL, ALLOCATABLE ::   A(:,:), B(:), X(:), RES(:)
      REAL, ALLOCATABLE ::   A0(:,:), B0(:), X0(:), RES0(:)
      PARAMETER (NRA=4, NCA=3, LDA=NRA)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,NCA), B(NRA), X(NCA), RES(NRA))
!                                  Set values for A and B
         A(1,:) = (/ 1.0, 2.0,  4.0/)
         A(2,:) = (/ 1.0, 4.0, 16.0/)
         A(3,:) = (/ 1.0, 6.0, 36.0/)
         A(4,:) = (/ 1.0, 8.0, 64.0/)
!
         B = (/4.999, 9.001, 12.999, 17.001/)
      ENDIF
!                                  Set up a 2D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .TRUE., .FALSE.)
!                                  Get the array descriptor entities MXLDA,
!                                  MXCOL, MXLDX, and MXCOLX
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
      CALL SCALAPACK_GETDIM(NCA, 1, MP_NB, 1, MXLDX, MXCOLX)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, &
                    INFO)
      CALL DESCINIT(DESCX, NCA, 1, MP_NB, 1, 0, 0, MP_ICTXT, MXLDX, INFO)
      CALL DESCINIT(DESCR, NRA, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDX), RES0(MXLDA))
!                                  Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCR, B0)
!                                  Solve the least squares problem
      TOL = 1.0E-4
      CALL LSQRR (A0, B0, X0, RES0, KBASIS, TOL=TOL)
!                                  Unmap the results from the distributed
!                                  arrays back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
      CALL SCALAPACK_UNMAP(RES0, DESCR, RES)
!                                  Print results.
!                                  Only Rank=0 has the solution.
      IF(MP_RANK .EQ. 0)THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,*) 'KBASIS = ', KBASIS
         CALL WRRRN ('X', X, 1, NCA, 1)
         CALL WRRRN ('RES', RES, 1, NRA, 1)
      ENDIF
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, B, RES, X)
      DEALLOCATE(A0, B0, RES0, X0)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
KBASIS =     3

           X
     1       2       3
 0.999   2.000   0.000

                RES
         1          2          3          4
 -0.000400   0.001200  -0.001200   0.000400
LQRRV
Computes the least-squares solution using Householder transformations applied in blocked form.
Required Arguments
A — Real LDA by (NCA + NUMEXC) array containing the matrix and right-hand sides. (Input)
The right-hand sides are input in A(1 : NRA, NCA + j), j = 1, …, NUMEXC. The array A is preserved
upon output. The Householder factorization of the matrix is computed and used to solve the systems.
X — Real LDX by NUMEXC array containing the solution. (Output)
Optional Arguments
NRA — Number of rows in the matrix. (Input)
Default: NRA = size (A,1).
NCA — Number of columns in the matrix. (Input)
Default: NCA = size (A,2) - NUMEXC.
NUMEXC — Number of right-hand sides. (Input)
Default: NUMEXC = size (X,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
LDX — Leading dimension of the solution array X exactly as specified in the dimension statement of the
calling program. (Input)
Default: LDX = size (X,1).
FORTRAN 90 Interface
Generic:
CALL LQRRV (A, X [, …])
Specific:
The specific interface names are S_LQRRV and D_LQRRV.
FORTRAN 77 Interface
Single:
CALL LQRRV (NRA, NCA, NUMEXC, A, LDA, X, LDX)
Double:
The double precision name is DLQRRV.
ScaLAPACK Interface
Generic:
CALL LQRRV (A0, X0 [, …])
Specific:
The specific interface names are S_LQRRV and D_LQRRV.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
The routine LQRRV computes the QR decomposition of a matrix A using blocked Householder transformations. The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon
which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. The standard algorithm is based
on the storage-efficient WY representation for products of Householder transformations. See Schreiber and
Van Loan (1989).
The routine LQRRV determines an orthogonal matrix Q and an upper triangular matrix R such that A = QR.
The QR factorization of a matrix A having NRA rows and NCA columns is as follows:

Initialize A1 ← A

For k = 1, …, min(NRA − 1, NCA)

    Determine a Householder transformation for column k of Ak having the form

        Hk = I − τk uk ukT

    where uk has zeros in the first k − 1 positions and τk is a scalar.

    Update

        Ak ← Hk Ak−1 = Ak−1 − τk uk (ukT Ak−1)

End k

Thus,

    Ap = Hp Hp−1 … H1 A = QT A = R

where p = min(NRA − 1, NCA). The matrix Q is not produced directly by LQRRV. The information needed to
construct the Householder transformations is saved instead. If the matrix Q is needed explicitly, QT can be
determined while the matrix is factored. No pivoting among the columns is done. The primary purpose of
LQRRV is to give the user a high-performance QR least-squares solver. It is intended for least-squares problems
that are well-posed. For background, see Golub and Van Loan (1989, page 225). During the QR factorization,
the most time-consuming step is computing the matrix-vector update Ak ← Hk Ak−1. The routine LQRRV
constructs a “block” of NB Householder transformations in which the update is “rich” in matrix multiplication.
The product of NB Householder transformations is written in the form

    Hk Hk+1 … Hk+NB−1 = I − Y T YT

where Y (NRA × NB) is a lower trapezoidal matrix and T (NB × NB) is upper triangular. The optimal choice of the
block size parameter NB varies among computer systems. Users may want to change it from its default value
of 1.
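To make the “rich in matrix multiplication” remark concrete, the following identity for the WY form is standard
material (see Schreiber and Van Loan (1989)); it is shown here only for orientation and is not a transcript of the
exact update sequence coded inside LQRRV. Because each Householder transformation is symmetric, applying the
accumulated block to the trailing columns B of the matrix gives

    H_{k+NB-1} \cdots H_{k+1} H_{k}\, B
      \;=\; \left(I - Y\,T\,Y^{T}\right)^{T} B
      \;=\; B \;-\; Y\,\Bigl(T^{T}\bigl(Y^{T} B\bigr)\Bigr)

so the update is dominated by two matrix-matrix products with the tall matrix Y and one small triangular
multiply with T, rather than NB separate rank-one updates.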
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2RRV/DL2RRV. The reference is:
CALL L2RRV (NRA, NCA, NUMEXC, A, LDA, X, LDX, FACT, LDFACT, WK)
The additional arguments are as follows:
FACT — LDFACT × (NCA + NUMEXC) work array containing the Householder factorization of the
matrix on output. If the input data is not needed, A and FACT can share the same storage
locations.
LDFACT — Leading dimension of the array FACT exactly as specified in the dimension statement
of the calling program. (Input)
If A and FACT are sharing the same storage, then LDA = LDFACT is required.
WK — Work vector of length (NCA + NUMEXC + 1) * (NB + 1) . The default value is NB = 1. This
value can be reset. See item 3 below.
2.      Informational errors

        Type   Code   Description
        4      1      The input matrix is singular.

3.      Integer Options with Chapter 11 Options Manager

        5    This option allows the user to reset the blocking factor used in computing the factorization. On
             some computers, changing IVAL(*) to a value larger than 1 will result in greater efficiency. The
             value IVAL(*) is the maximum value to use. (The software is specialized so that IVAL(*) is reset
             to an “optimal” value actually used within routine L2RRV.) The user can control the blocking by
             resetting IVAL(*) to a smaller value than the default. Default values are IVAL(*) = 1, IMACH(5).

        6    This option is the vector dimension where a shift is made from in-line level-2 loops to the use of
             level-2 BLAS in forming the partial product of Householder transformations. Default value is
             IVAL(*) = IMACH(5).

        10   This option allows the user to control the factorization step. If the value is 1 the Householder
             factorization will be computed. If the value is 2, the factorization will not be computed. In this
             latter case the decomposition has already been computed. Default value is IVAL(*) = 1.

        11   This option allows the user to control the solving steps. The rules for IVAL(*) are:

             1.   Compute b ← QTb, and x ← R+b.
             2.   Compute b ← QTb.
             3.   Compute b ← Qb.
             4.   Compute x ← R+b.

             Default value is IVAL(*) = 1. Note that IVAL(*) = 2 or 3 may only be set when calling
             L2RRV/DL2RRV.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the matrix and right-hand sides. (Input)
The right-hand sides are input in A(1 : NRA, NCA + j), j = 1, …, NUMEXC. The array A is preserved
upon output. The Householder factorization of the matrix is computed and used to solve the systems.
X0 — MXLDX by MXCOLX local matrix containing the local portions of the distributed matrix X. X contains
the solution. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA, MXLDX, MXCOL, and MXCOLX can be obtained through a call to
SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the
ScaLAPACK Example below.
Examples
Example
Given a real m × k matrix B it is often necessary to compute the k least-squares solutions of the linear system
AX = B, where A is an m × n real matrix. When m > n the system is considered overdetermined. A solution with
a zero residual normally does not exist. Instead the minimization problem
    min ‖Axj − bj‖2,   xj ∈ Rn

is solved k times, where xj and bj are the j-th columns of the matrices X and B, respectively. When A is of full
column rank there exists a unique solution XLS that solves the above minimization problem. By using the routine
LQRRV, XLS is computed.
      USE LQRRV_INT
      USE WRRRN_INT
      USE SGEMM_INT
!                                  Declare variables
      INTEGER    LDA, LDX, NCA, NRA, NUMEXC
      PARAMETER  (NCA=3, NRA=5, NUMEXC=2, LDA=NRA, LDX=NCA)
!                                  SPECIFICATIONS FOR LOCAL VARIABLES
      REAL       X(LDX,NUMEXC)
!                                  SPECIFICATIONS FOR SAVE VARIABLES
      REAL       A(LDA,NCA+NUMEXC)
      SAVE       A
!                                  SPECIFICATIONS FOR SUBROUTINES
!                                  Set values for A and the
!                                  righthand sides.
!
!                                  A = (  1   2    4  |   7  10 )
!                                      (  1   4   16  |  21  10 )
!                                      (  1   6   36  |  43   9 )
!                                      (  1   8   64  |  73  10 )
!                                      (  1  10  100  | 111  10 )
!
      DATA A/5*1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 4.0, 16.0, 36.0, 64.0, &
          100.0, 7.0, 21.0, 43.0, 73.0, 111.0, 2*10., 9., 2*10./
!
!                                  QR factorization and solution
      CALL LQRRV (A, X)
      CALL WRRRN ('SOLUTIONS 1-2', X)
!                                  Compute residuals and print
      CALL SGEMM ('N', 'N', NRA, NUMEXC, NCA, 1.E0, A, LDA, X, LDX, &
                 -1.E0, A(1:,(NCA+1):), LDA)
      CALL WRRRN ('RESIDUALS 1-2', A(1:,(NCA+1):))
!
      END
Output
   SOLUTIONS 1-2
          1       2
 1     1.00   10.80
 2     1.00   -0.43
 3     1.00    0.04

   RESIDUALS 1-2
          1        2
 1   0.0000   0.0857
 2   0.0000  -0.3429
 3   0.0000   0.5143
 4   0.0000  -0.3429
 5   0.0000   0.0857
ScaLAPACK Example
The previous example is repeated here as a distributed computing example. Given a real m × k matrix B it is
often necessary to compute the k least-squares solutions of the linear system AX = B, where A is an m × n real
matrix. When m > n the system is considered overdetermined. A solution with a zero residual normally does
not exist. Instead the minimization problem
    min ‖Axj − bj‖2,   xj ∈ Rn

is solved k times, where xj and bj are the j-th columns of the matrices X and B, respectively. When A is of full
column rank there exists a unique solution XLS that solves the above minimization problem. By using the routine
LQRRV, XLS is computed. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LQRRV_INT
      USE SGEMM_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    LDA, LDX, NCA, NRA, NUMEXC, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA, MXLDX, MXCOLX
      INTEGER    K
      REAL, ALLOCATABLE ::  A(:,:), X(:,:)
      REAL, ALLOCATABLE ::  A0(:,:), X0(:,:)
      PARAMETER (NRA=5, NCA=3, NUMEXC=2, LDA=NRA, LDX=NCA)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,NCA+NUMEXC), X(LDX, NUMEXC))
!                                  Set values for A and the righthand sides
         A(1,:) = (/ 1.0,  2.0,   4.0,   7.0, 10.0/)
         A(2,:) = (/ 1.0,  4.0,  16.0,  21.0, 10.0/)
         A(3,:) = (/ 1.0,  6.0,  36.0,  43.0,  9.0/)
         A(4,:) = (/ 1.0,  8.0,  64.0,  73.0, 10.0/)
         A(5,:) = (/ 1.0, 10.0, 100.0, 111.0, 10.0/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA+NUMEXC, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(NRA, NCA+NUMEXC, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA+NUMEXC, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
                    MXLDA, INFO)
      K = MIN0(NRA, NCA)
!                                  Need to get dimensions of local x
!                                  separately since x's leading
!                                  dimension differs from A's.
!                                  Get the array descriptor entities
!                                  MXLDX and MXCOLX
      CALL SCALAPACK_GETDIM(K, NUMEXC, MP_MB, MP_NB, MXLDX, MXCOLX)
      CALL DESCINIT (DESCX, K, NUMEXC, MP_NB, MP_NB, 0, 0, MP_ICTXT, &
                     MXLDX, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), X0(MXLDX,MXCOLX))
!                                  Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                  Solve the least squares problem
      CALL LQRRV (A0, X0)
!                                  Unmap the results from the distributed
!                                  arrays back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                  Print results.
!                                  Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0)THEN
         CALL WRRRN ('SOLUTIONS 1-2', X)
!                                  Compute residuals and print
         CALL SGEMM ('N', 'N', NRA, NUMEXC, NCA, 1.E0, A, LDA, X, LDX, &
                    -1.E0, A(1:,(NCA+1):), LDA)
         CALL WRRRN ('RESIDUALS 1-2', A(1:,(NCA+1):))
      ENDIF
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
LSBRR
Solves a linear least-squares problem with iterative refinement.
Required Arguments
A — Real NRA by NCA matrix containing the coefficient matrix of the least-squares system to be solved.
(Input)
B — Real vector of length NRA containing the right-hand side of the least-squares system. (Input)
X — Real vector of length NCA containing the solution vector with components corresponding to the columns not used set to zero. (Output)
Optional Arguments
NRA — Number of rows of A. (Input)
Default: NRA = size (A,1).
NCA — Number of columns of A. (Input)
Default: NCA = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
TOL — Real scalar containing the nonnegative tolerance used to determine the subset of columns of A to
be included in the solution. (Input)
If TOL is zero, a full complement of min(NRA, NCA) columns is used. See Comments.
Default: TOL = 0.0
RES — Real vector of length NRA containing the residual vector B - AX. (Output)
KBASIS — Integer scalar containing the number of columns used in the solution. (Output)
FORTRAN 90 Interface
Generic:
CALL LSBRR (A, B, X [, …])
Specific:
The specific interface names are S_LSBRR and D_LSBRR.
FORTRAN 77 Interface
Single:
CALL LSBRR (NRA, NCA, A, LDA, B, TOL, X, RES, KBASIS)
Double:
The double precision name is DLSBRR.
Description
Routine LSBRR solves the linear least-squares problem using iterative refinement. The iterative refinement
algorithm is due to Björck (1967, 1968). It is also described by Golub and Van Loan (1983, pages 182−183).
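For orientation, the textbook form of Björck's refinement (stated here as background, not as a literal transcript of
LSBRR's internal code) works with the augmented system and its correction equation

    \begin{pmatrix} I & A \\ A^{T} & 0 \end{pmatrix}
    \begin{pmatrix} r \\ x \end{pmatrix}
    =
    \begin{pmatrix} b \\ 0 \end{pmatrix},
    \qquad
    \begin{pmatrix} I & A \\ A^{T} & 0 \end{pmatrix}
    \begin{pmatrix} \delta r \\ \delta x \end{pmatrix}
    =
    \begin{pmatrix} b - r^{(k)} - A x^{(k)} \\ -A^{T} r^{(k)} \end{pmatrix}

with the updates x^(k+1) = x^(k) + δx and r^(k+1) = r^(k) + δr, each correction being obtained from the saved QR
factorization of A.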
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2BRR/DL2BRR. The reference is:
CALL L2BRR (NRA, NCA, A, LDA, B, TOL, X, RES, KBASIS, QR, BRRUX,
IPVT, WK)
The additional arguments are as follows:
QR — Work vector of length NRA * NCA representing an NRA by NCA matrix that contains information from the QR factorization of A. See LQRRR for details.
BRRUX — Work vector of length NCA containing information about the orthogonal factor of the
QR factorization of A. See LQRRR for details.
IPVT — Integer work vector of length NCA containing the pivoting information for the QR factorization of A. See LQRRR for details.
WK — Work vector of length NRA + 2 * NCA − 1.
2.      Informational error

        Type   Code   Description
        4      1      The data matrix is too ill-conditioned for iterative refinement to be effective.

3.      Routine LSBRR calculates the QR decomposition with pivoting of a matrix A and tests the diagonal
elements against a user-supplied tolerance TOL. The first integer KBASIS = k is determined for which
∣rk+1,k+1 ∣ ≤ TOL * ∣r11∣
In effect, this condition implies that a set of columns with a condition number approximately bounded
by 1.0/TOL is used. Then, LQRSL performs a truncated fit of the first KBASIS columns of the permuted
A to an input vector B. The coefficient of this fit is unscrambled to correspond to the original columns
of A, and the coefficients corresponding to unused columns are set to zero. It may be helpful to scale
the rows and columns of A so that the error estimates in the elements of the scaled matrix are roughly
equal to TOL. The iterative refinement method of Björck is then applied to this factorization.
4.      Integer Options with Chapter 11 Options Manager

        16   This option uses four values to solve memory bank conflict (access inefficiency) problems. In
             routine L2BRR the leading dimension of QR is increased by IVAL(3) when N is a multiple of
             IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
             respectively, in LSBRR. Additional memory allocation for QR and option value restoration are
             done automatically in LSBRR. Users directly calling L2BRR can allocate additional space for QR
             and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies. There
             is no requirement that users change existing applications that use LSBRR or L2BRR. Default
             values for the option are IVAL(*) = 1, 16, 0, 1.

        17   This option has two values that determine if the L1 condition number is to be computed. Routine
             LSBRR temporarily replaces IVAL(2) by IVAL(1). The routine L2CRG computes the condition
             number if IVAL(2) = 2. Otherwise L2CRG skips this computation. LSBRR restores the option.
             Default values for the option are IVAL(*) = 1, 2.
Example
This example solves the linear least-squares problem with A, an 8 × 4 matrix. Note that the second and
fourth columns of A are identical. Routine LSBRR determines that there are three columns in the basis.
USE LSBRR_INT
USE UMACH_INT
USE WRRRN_INT
!                                  Declare variables
      PARAMETER  (NRA=8, NCA=4, LDA=NRA)
      REAL       A(LDA,NCA), B(NRA), X(NCA), RES(NRA), TOL
!
!                                  Set values for A
!
!                                  A = (  1   5  15   5 )
!                                      (  1   4  17   4 )
!                                      (  1   7  14   7 )
!                                      (  1   3  18   3 )
!                                      (  1   1  15   1 )
!                                      (  1   8  11   8 )
!                                      (  1   3   9   3 )
!                                      (  1   4  10   4 )
!
      DATA A/8*1, 5., 4., 7., 3., 1., 8., 3., 4., 15., 17., 14., &
          18., 15., 11., 9., 10., 5., 4., 7., 3., 1., 8., 3., 4. /
!
!                                  Set values for B
!
      DATA B/ 30., 31., 35., 29., 18., 35., 20., 22. /
!
!                                  Solve the least squares problem
      TOL = 1.0E-4
      CALL LSBRR (A, B, X, tol=tol, RES=RES, KBASIS=KBASIS)
!                                  Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'KBASIS = ', KBASIS
      CALL WRRRN ('X', X, 1, NCA, 1)
      CALL WRRRN ('RES', RES, 1, NRA, 1)
!
      END
Output
KBASIS =     3

                   X
     1       2       3       4
 0.636   2.845   1.058   0.000

                                  RES
      1        2        3        4        5        6        7        8
 -0.733    0.996   -0.365    0.783   -1.353   -0.036    1.306   -0.597
LCLSQ
Solves a linear least-squares problem with linear constraints.
Required Arguments
A — Matrix of dimension NRA by NCA containing the coefficients of the NRA least squares equations.
(Input)
B — Vector of length NRA containing the right-hand sides of the least squares equations. (Input)
C — Matrix of dimension NCON by NCA containing the coefficients of the NCON constraints. (Input)
If NCON = 0, C is not referenced.
BL — Vector of length NCON containing the lower limit of the general constraints. (Input)
If there is no lower limit on the I-th constraint, then BL(I) will not be referenced.
BU — Vector of length NCON containing the upper limit of the general constraints. (Input)
If there is no upper limit on the I-th constraint, then BU(I) will not be referenced. If there is no range
constraint, BL and BU can share the same storage locations.
IRTYPE — Vector of length NCON indicating the type of constraints exclusive of simple bounds, where
IRTYPE(I) = 0, 1, 2, 3 indicates .EQ., .LE., .GE., and range constraints respectively. (Input)
XLB — Vector of length NCA containing the lower bound on the variables. (Input)
If there is no lower bound on the I-th variable, then XLB(I) should be set to 1.0E30.
XUB — Vector of length NCA containing the upper bound on the variables. (Input)
If there is no upper bound on the I-th variable, then XUB(I) should be set to -1.0E30.
X — Vector of length NCA containing the approximate solution. (Output)
Optional Arguments
NRA — Number of least-squares equations. (Input)
Default: NRA = size (A,1).
NCA — Number of variables. (Input)
Default: NCA = size (A,2).
NCON — Number of constraints. (Input)
Default: NCON = size (C,1).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
LDA must be at least NRA.
Default: LDA = size (A,1).
LDC — Leading dimension of C exactly as specified in the dimension statement of the calling program.
(Input)
LDC must be at least NCON.
Default: LDC = size (C,1).
RES — Vector of length NRA containing the residuals B - AX of the least-squares equations at the approximate solution. (Output)
FORTRAN 90 Interface
Generic:
CALL LCLSQ (A, B, C, BL, BU, IRTYPE, XLB, XUB, X [, …])
Specific:
The specific interface names are S_LCLSQ and D_LCLSQ.
FORTRAN 77 Interface
Single:
CALL LCLSQ (NRA, NCA, NCON, A, LDA, B, C, LDC, BL, BU, IRTYPE, XLB, XUB, X, RES)
Double:
The double precision name is DLCLSQ.
Description
The routine LCLSQ solves linear least-squares problems with linear constraints. These are systems of least-squares equations of the form Ax ≅ b, subject to
bl ≤ C x ≤ bu
xl ≤ x ≤ xu
Here, A is the coefficient matrix of the least-squares equations, b is the right-hand side, and C is the coefficient
matrix of the constraints. The vectors bl, bu, xl and xu are the lower and upper bounds on the constraints and
the variables, respectively. The system is solved by defining dependent variables y ≡ Cx and then solving the
least squares system with the lower and upper bounds on x and y. The equation Cx − y = 0 is a set of equality
constraints. These constraints are realized by heavy weighting, i.e., a penalty method; see Hanson (1986, pages
826−834).
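As a hedged sketch of the weighting idea (the exact weights and scaling used inside LCLSQ are not spelled out
here), introducing the dependent variables y = Cx and a large penalty weight λ turns the constrained problem
into a bounded least-squares problem of the form

    \min_{x,\,y}\;
    \left\|
    \begin{pmatrix} A & 0 \\ \lambda C & -\lambda I \end{pmatrix}
    \begin{pmatrix} x \\ y \end{pmatrix}
    -
    \begin{pmatrix} b \\ 0 \end{pmatrix}
    \right\|_2
    \quad\text{subject to}\quad
    b_l \le y \le b_u,\ \ x_l \le x \le x_u

so that as λ grows the heavily weighted rows force Cx − y toward zero, which is the sense in which the equality
constraints are realized by weighting.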
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2LSQ/DL2LSQ. The reference is:
CALL L2LSQ (NRA, NCA, NCON, A, LDA, B, C, LDC, BL, BU, IRTYPE, XLB, XUB, X, RES, WK, IWK)
The additional arguments are as follows:
WK — Real work vector of length
(NCON + MAXDIM) * (NCA + NCON + 1) + 10 * NCA + 9 * NCON + 3.
IWK — Integer work vector of length 3 * (NCON + NCA).
2.      Informational errors

        Type   Code   Description
        3      1      The rank determination tolerance is less than machine precision.
        4      2      The bounds on the variables are inconsistent.
        4      3      The constraint bounds are inconsistent.
        4      4      Maximum number of iterations exceeded.

3.      Integer Options with Chapter 11 Options Manager

        13   Debug output flag. If more detailed output is desired, set this option to the value 1. Otherwise,
             set it to 0. Default value is 0.

        14   Maximum number of add/drop iterations. If the value of this option is zero, up to
             5 * max(NRA, NCA) iterations will be allowed. Otherwise set this option to the desired iteration
             limit. Default value is 0.

4.      Floating Point Options with Chapter 11 Options Manager

        2    The value of this option is the relative rank determination tolerance to be used. Default value is
             sqrt(AMACH(4)).

        5    The value of this option is the absolute rank determination tolerance to be used. Default value is
             sqrt(AMACH(4)).
Example
A linear least-squares problem with linear constraints is solved.
USE LCLSQ_INT
USE UMACH_INT
USE SNRM2_INT
!
!                                 Solve the following in the least squares sense:
!                                      3x1 + 2x2 + x3 = 3.3
!                                      4x1 + 2x2 + x3 = 2.3
!                                      2x1 + 2x2 + x3 = 1.3
!                                       x1 +  x2 + x3 = 1.0
!
!                                 Subject to:  x1 + x2 + x3 <= 1
!                                              0 <= x1 <= .5
!                                              0 <= x2 <= .5
!                                              0 <= x3 <= .5
!
! ----------------------------------------------------------------------
!                                 Declaration of variables
!
      INTEGER    NRA, NCA, MCON, LDA, LDC
      PARAMETER  (NRA=4, NCA=3, MCON=1, LDC=MCON, LDA=NRA)
!
      INTEGER    IRTYPE(MCON), NOUT
      REAL       A(LDA,NCA), B(NRA), BC(MCON), C(LDC,NCA), RES(NRA), &
                 RESNRM, XSOL(NCA), XLB(NCA), XUB(NCA)
!                                 Data initialization
      DATA A/3.0E0, 4.0E0, 2.0E0, 1.0E0, 2.0E0, &
             2.0E0, 2.0E0, 1.0E0, 1.0E0, 1.0E0, 1.0E0, 1.0E0/, &
           B/3.3E0, 2.3E0, 1.3E0, 1.0E0/, &
           C/3*1.0E0/, &
           BC/1.0E0/, IRTYPE/1/, XLB/3*0.0E0/, XUB/3*.5E0/
!
!                                 Solve the bounded, constrained
!                                 least squares problem.
!
      CALL LCLSQ (A, B, C, BC, BC, IRTYPE, XLB, XUB, XSOL, RES=res)
!                                 Compute the 2-norm of the residuals.
      RESNRM = SNRM2 (NRA, RES, 1)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT, 999) XSOL, RES, RESNRM
!
  999 FORMAT (' The solution is ', 3F9.4, //, ' The residuals ', &
             'evaluated at the solution are ', /, 18X, 4F9.4, //, &
             ' The norm of the residual vector is ', F8.4)
!
      END
Output
 The solution is    0.5000   0.3000   0.2000

 The residuals evaluated at the solution are
                   -1.0000   0.5000   0.5000   0.0000

 The norm of the residual vector is   1.2247
LQRRR
Computes the QR decomposition, AP = QR, using Householder transformations.
Required Arguments
A — Real NRA by NCA matrix containing the matrix whose QR factorization is to be computed. (Input)
QR — Real NRA by NCA matrix containing information required for the QR factorization. (Output)
The upper trapezoidal part of QR contains the upper trapezoidal part of R with its diagonal elements
ordered in decreasing magnitude. The strict lower trapezoidal part of QR contains information to
recover the orthogonal matrix Q of the factorization. Arguments A and QR can occupy the same storage locations. In this case, A will not be preserved on output.
QRAUX — Real vector of length NCA containing information about the orthogonal part of the decomposition in the first min(NRA, NCA) position. (Output)
Optional Arguments
NRA — Number of rows of A. (Input)
Default: NRA = size (A,1).
NCA — Number of columns of A. (Input)
Default: NCA = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
PIVOT — Logical variable. (Input)
PIVOT = .TRUE. means column pivoting is enforced.
PIVOT = .FALSE. means column pivoting is not done.
Default: PIVOT = .TRUE.
IPVT — Integer vector of length NCA containing information that controls the final order of the columns of
the factored matrix A. (Input/Output)
On input, if IPVT(K) > 0, then the K-th column of A is an initial column. If IPVT(K) = 0, then the K-th
column of A is a free column. If IPVT(K) < 0, then the K-th column of A is a final column. See the Comments section below. On output, IPVT(K) contains the index of the column of A that has been
interchanged into the K-th column. This defines the permutation matrix P. The array IPVT is referenced only if PIVOT is equal to .TRUE.
Default: IPVT = 0.
LDQR — Leading dimension of QR exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDQR = size (QR,1).
CONORM — Real vector of length NCA containing the norms of the columns of the input matrix. (Output)
If this information is not needed, CONORM and QRAUX can share the same storage locations.
FORTRAN 90 Interface
Generic:
CALL LQRRR (A, QR, QRAUX [, …])
Specific:
The specific interface names are S_LQRRR and D_LQRRR.
FORTRAN 77 Interface
Single:
CALL LQRRR (NRA, NCA, A, LDA, PIVOT, IPVT, QR, LDQR, QRAUX, CONORM)
Double:
The double precision name is DLQRRR.
ScaLAPACK Interface
Generic:
CALL LQRRR (A0, QR0, QRAUX0 [, …])
Specific:
The specific interface names are S_LQRRR and D_LQRRR.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
The routine LQRRR computes the QR decomposition of a matrix using Householder transformations. The
underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual.
LQRRR determines an orthogonal matrix Q, a permutation matrix P, and an upper trapezoidal matrix R with
diagonal elements of nonincreasing magnitude, such that AP = QR. The Householder transformation for column k is of the form
    I − (uk ukT)/pk

for k = 1, 2, …, min(NRA, NCA), where uk has zeros in the first k − 1 positions. The matrix Q is not produced
directly by LQRRR. Instead the information needed to reconstruct the Householder transformations is saved.
If the matrix Q is needed explicitly, the subroutine LQERR can be called after LQRRR. This routine accumulates Q from its factored form.
Before the decomposition is computed, initial columns are moved to the beginning of the array A and the
final columns to the end. Both initial and final columns are frozen in place during the computation. Only free
columns are pivoted. Pivoting, when requested, is done on the free columns of largest reduced norm.
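As an illustrative sketch only (not one of the manual's examples), the IPVT codes described above might be set as
follows to freeze the first column at the front, hold the last column at the end, and let the remaining column be
pivoted freely; the matrix values are the ones used repeatedly in the examples of this chapter.

      USE LQRRR_INT
!                                 Hypothetical sketch: marking initial, free,
!                                 and final columns before the factorization.
      INTEGER    NRA, NCA
      PARAMETER  (NRA=4, NCA=3)
      INTEGER    IPVT(NCA)
      REAL       A(NRA,NCA), QR(NRA,NCA), QRAUX(NCA)
      LOGICAL    PIVOT
      DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
      IPVT(1) =  1
!                                 Column 1 is an initial column (frozen first).
      IPVT(2) =  0
!                                 Column 2 is a free column (may be pivoted).
      IPVT(3) = -1
!                                 Column 3 is a final column (frozen last).
      PIVOT   = .TRUE.
      CALL LQRRR (A, QR, QRAUX, PIVOT=PIVOT, IPVT=IPVT)
      END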
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2RRR/DL2RRR. The reference is:
CALL L2RRR (NRA, NCA, A, LDA, PIVOT, IPVT, QR, LDQR, QRAUX, CONORM, WORK)
The additional argument is
WORK — Work vector of length 2NCA − 1. Only NCA − 1 locations of WORK are referenced if
PIVOT = .FALSE. .
2.      LQRRR determines an orthogonal matrix Q, permutation matrix P, and an upper trapezoidal matrix R
        with diagonal elements of nonincreasing magnitude, such that AP = QR. The Householder transformation
        for column k, k = 1, …, min(NRA, NCA), is of the form

            I − (u uT)/pk

        where u has zeros in the first k − 1 positions. If the explicit matrix Q is needed, the user can call routine
LQERR after calling LQRRR. This routine accumulates Q from its factored form.
3.      Before the decomposition is computed, initial columns are moved to the beginning and the final
        columns to the end of the array A. Neither the initial nor the final columns are moved during the
        computation. Only free columns are moved. Pivoting, if requested, is done on the free columns of
        largest reduced norm.
4.      When pivoting has been selected by having entries of IPVT initialized to zero, an estimate of the
        condition number of A can be obtained from the output by computing the magnitude of the number
        QR(1, 1)/QR(K, K), where K = MIN(NRA, NCA). This estimate can be used to select the number of
        columns, KBASIS, used in the solution step computed with routine LQRSL; a small sketch of this
        computation follows.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the matrix whose QR factorization is to be computed. (Input)
QR0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix QR. QR contains the information required for the QR factorization. (Output)
The upper trapezoidal part of QR contains the upper trapezoidal part of R with its diagonal elements
ordered in decreasing magnitude. The strict lower trapezoidal part of QR contains information to
recover the orthogonal matrix Q of the factorization. Arguments A and QR can occupy the same storage locations. In this case, A will not be preserved on output.
QRAUX0 — Real vector of length MXCOL containing the local portions of the distributed matrix QRAUX.
QRAUX contains information about the orthogonal part of the decomposition in the first MIN(NRA, NCA)
position. (Output)
IPVT0 — Integer vector of length MXLDB containing the local portions of the distributed vector IPVT.
IPVT contains the information that controls the final order of the columns of the factored matrix A.
(Input/Output)
On input, if IPVT(K) > 0, then the K-th column of A is an initial column. If IPVT(K) = 0, then the K-th
column of A is a free column. If IPVT(K) < 0, then the K-th column of A is a final column. See Comments.
On output, IPVT(K) contains the index of the column of A that has been interchanged into the K-th column. This defines the permutation matrix P. The array IPVT is referenced only if PIVOT is equal to
.TRUE.
Default: IPVT = 0.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA, MXLDB, and MXCOL can be obtained through a call to
SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the
ScaLAPACK Example below.
Examples
Example
In various statistical algorithms it is necessary to compute q = xT(AT A)-1x, where A is a rectangular matrix of
full column rank. By using the QR decomposition, q can be computed without forming ATA. Note that
AT A = (QRP-1)T(QRP-1) = P-T RT (QTQ)RP-1 = P RTRPT
since Q is orthogonal (QTQ = I) and P is a permutation matrix. Let
    QTAP = ( R1 )
           ( 0  )

where R1 is an upper triangular nonsingular matrix. Then

    xT(ATA)-1x = xTPR1-1R1-TP-1x = ‖R1-TP-1x‖2

In the following program, first the vector t = P-1x is computed. Then

    t := R1-Tt

Finally,

    q = ∥t∥2
USE IMSL_LIBRARIES
!                                  Declare variables
      INTEGER    LDA, LDQR, NCA, NRA
      PARAMETER  (NCA=3, NRA=4, LDA=NRA, LDQR=NRA)
!                                  SPECIFICATIONS FOR PARAMETERS
      INTEGER    LDQ
      PARAMETER  (LDQ=NRA)
!                                  SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    IPVT(NCA), NOUT
      REAL       CONORM(NCA), Q, QR(LDQR,NCA), QRAUX(NCA), T(NCA)
      LOGICAL    PIVOT
      REAL       A(LDA,NCA), X(NCA)
!
!                                  Set values for A
!
!                                  A = (  1   2    4 )
!                                      (  1   4   16 )
!                                      (  1   6   36 )
!                                      (  1   8   64 )
!
      DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!
!                                  Set values for X
!
!                                  X = (  1   2   3 )
!
      DATA X/1.0, 2.0, 3.0/
!
!                                  QR factorization
      PIVOT = .TRUE.
      IPVT = 0
      CALL LQRRR (A, QR, QRAUX, pivot=pivot, IPVT=IPVT)
!                                  Set t = inv(P)*x
      CALL PERMU (X, IPVT, T, IPATH=1)
!                                  Compute t = inv(trans(R))*t
      CALL LSLRT (QR, T, T, IPATH=4)
!                                  Compute 2-norm of t, squared.
      Q = SDOT(NCA,T,1,T,1)
!                                  Print result
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'Q = ', Q
!
      END
Output
Q =    0.840624
ScaLAPACK Example
The previous example is repeated here as a distributed computing example. In various statistical algorithms
it is necessary to compute q = xT(AT A)-1x, where A is a rectangular matrix of full column rank. By using the
QR decomposition, q can be computed without forming AT A. Note that
ATA = (QRP-1)T(QRP-1) = P-TRT(QTQ)RP-1 = P RTRPT
since Q is orthogonal (QTQ = I) and P is a permutation matrix. Let
    QTAP = ( R1 )
           ( 0  )

where R1 is an upper triangular nonsingular matrix. Then

    xT(ATA)-1x = xTPR1-1R1-TP-1x = ‖R1-TP-1x‖2

In the following program, first the vector t = P-1x is computed. Then

    t := R1-Tt

Finally,

    q = ∥t∥2
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LQRRR_INT
      USE PERMU_INT
      USE LSLRT_INT
      USE UMACH_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    LDA, LDQR, NCA, NRA, DESCA(9), DESCB(9), DESCL(9)
      INTEGER    INFO, MXCOL, MXLDA, MXLDB, MXCOLB, NOUT
      INTEGER, ALLOCATABLE ::  IPVT(:), IPVT0(:)
      LOGICAL    PIVOT
      REAL       Q
      REAL, ALLOCATABLE ::  A(:,:), X(:), T(:)
      REAL, ALLOCATABLE ::  A0(:,:), T0(:), QR0(:,:), QRAUX0(:)
      REAL (KIND(1E0)) SDOT
      PARAMETER (NRA=4, NCA=3, LDA=NRA, LDQR=NRA)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,NCA), X(NCA), T(NCA), IPVT(NCA))
!                                  Set values for A and the righthand side
         A(1,:) = (/ 1.0, 2.0,  4.0/)
         A(2,:) = (/ 1.0, 4.0, 16.0/)
         A(3,:) = (/ 1.0, 6.0, 36.0/)
         A(4,:) = (/ 1.0, 8.0, 64.0/)
!
         X      = (/ 1.0, 2.0,  3.0/)
!
         IPVT = 0
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  MXCOL, MXLDB, MXCOLB
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
      CALL SCALAPACK_GETDIM(NCA, 1, MP_NB, 1, MXLDB, MXCOLB)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, &
                    INFO)
      CALL DESCINIT(DESCL, 1, NCA, 1, MP_NB, 0, 0, MP_ICTXT, 1, INFO)
      CALL DESCINIT(DESCB, NCA, 1, MP_NB, 1, 0, 0, MP_ICTXT, MXLDB, &
                    INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), QR0(MXLDA,MXCOL), QRAUX0(MXCOL), &
                IPVT0(MXCOL), T0(MXLDB))
!                                  Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      PIVOT = .TRUE.
      CALL SCALAPACK_MAP(IPVT, DESCL, IPVT0)
!                                  QR factorization
      CALL LQRRR (A0, QR0, QRAUX0, PIVOT=PIVOT, IPVT=IPVT0)
!                                  Unmap the results from the distributed
!                                  array back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(IPVT0, DESCL, IPVT, NCA, .FALSE.)
      IF(MP_RANK .EQ. 0) CALL PERMU (X, IPVT, T, IPATH=1)
      CALL SCALAPACK_MAP(T, DESCB, T0)
      CALL LSLRT (QR0, T0, T0, IPATH=4)
      CALL SCALAPACK_UNMAP(T0, DESCB, T)
!                                  Print results.
!                                  Only Rank=0 has the solution.
      IF(MP_RANK .EQ. 0)THEN
         Q = SDOT(NCA, T, 1, T, 1)
         CALL UMACH (2, NOUT)
         WRITE (NOUT, *) 'Q = ', Q
      ENDIF
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
Q =    0.840624
LQERR
Accumulates the orthogonal matrix Q from its factored form given the QR factorization of a rectangular
matrix A.
Required Arguments
QR — Real NRQR by NCQR matrix containing the factored form of the matrix Q in the first min(NRQR, NCQR)
columns of the strict lower trapezoidal part of QR as output from subroutine LQRRR/DLQRRR. (Input)
QRAUX — Real vector of length NCQR containing information about the orthogonal part of the decomposition in the first min(NRQR, NCQR) position as output from routine LQRRR/DLQRRR. (Input)
Q — Real NRQR by NRQR matrix containing the accumulated orthogonal matrix Q; Q and QR can share the
same storage locations if QR is not needed. (Output)
Optional Arguments
NRQR — Number of rows in QR. (Input)
Default: NRQR = size (QR,1).
NCQR — Number of columns in QR. (Input)
Default: NCQR = size (QR,2).
LDQR — Leading dimension of QR exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDQR = size (QR,1).
LDQ — Leading dimension of Q exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDQ = size (Q,1).
FORTRAN 90 Interface
Generic:
CALL LQERR (QR, QRAUX, Q [, …])
Specific:
The specific interface names are S_LQERR and D_LQERR.
FORTRAN 77 Interface
Single:
CALL LQERR (NRQR, NCQR, QR, LDQR, QRAUX, Q, LDQ)
Double:
The double precision name is DLQERR.
ScaLAPACK Interface
Generic:
CALL LQERR (QR0, QRAUX0, Q0 [, …])
Specific:
The specific interface names are S_LQERR and D_LQERR.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
The routine LQERR accumulates the Householder transformations computed by IMSL routine LQRRR to produce the orthogonal matrix Q.
The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.      Workspace may be explicitly provided, if desired, by use of L2ERR/DL2ERR. The reference is:
CALL L2ERR (NRQR, NCQR, QR, LDQR, QRAUX, Q, LDQ, WK)
The additional argument is
WK — Work vector of length 2 * NRQR.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
QR0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix QR. QR contains the factored form of the matrix Q in the first min(NRQR, NCQR) columns of the strict lower
trapezoidal part of QR as output from subroutine LQRRR/DLQRRR. (Input)
QRAUX0 — Real vector of length MXCOL containing the local portions of the distributed matrix QRAUX.
QRAUX contains the information about the orthogonal part of the decomposition in the first
min(NRA, NCA) positions as output from subroutine LQRRR/DLQRRR. (Input)
Q0 — MXLDA by MXLDA local matrix containing the local portions of the distributed matrix Q. Q contains
the accumulated orthogonal matrix Q; Q and QR can share the same storage locations if QR is not needed.
(Output)
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA and MXCOL can be obtained through a call to SCALAPACK_GETDIM
(see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example
below.
Examples
Example 1
In this example, the orthogonal matrix Q in the QR decomposition of a matrix A is computed. The product
X = QR is also computed. Note that X can be obtained from A by reordering the columns of A according to
IPVT.
USE IMSL_LIBRARIES
!                                  Declare variables
      INTEGER    LDA, LDQ, LDQR, NCA, NRA
      PARAMETER  (NCA=3, NRA=4, LDA=NRA, LDQ=NRA, LDQR=NRA)
!
      INTEGER    IPVT(NCA), J
      REAL       A(LDA,NCA), CONORM(NCA), Q(LDQ,NRA), QR(LDQR,NCA), &
                 QRAUX(NCA), R(NRA,NCA), X(NRA,NCA)
      LOGICAL    PIVOT
!
!                                  Set values for A
!
!                                  A = (  1   2    4 )
!                                      (  1   4   16 )
!                                      (  1   6   36 )
!                                      (  1   8   64 )
!
      DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!
!                                  QR factorization
!                                  Set IPVT = 0 (all columns free)
      IPVT = 0
      PIVOT = .TRUE.
      CALL LQRRR (A, QR, QRAUX, IPVT=IPVT, PIVOT=PIVOT)
!                                  Accumulate Q
      CALL LQERR (QR, QRAUX, Q)
!                                  R is the upper trapezoidal part of QR
      R = 0.0E0
      DO 10  J=1, NCA
         CALL SCOPY (J, QR(:,J), 1, R(:,J), 1)
   10 CONTINUE
!                                  Compute X = Q*R
      CALL MRRRR (Q, R, X)
!                                  Print results
      CALL WRIRN ('IPVT', IPVT, 1, NCA, 1)
      CALL WRRRN ('Q', Q)
      CALL WRRRN ('R', R)
      CALL WRRRN ('X = Q*R', X)
!
      END
Output
   IPVT
 1   2   3
 3   2   1

                     Q
          1         2         3         4
 1  -0.0531   -0.5422    0.8082   -0.2236
 2  -0.2126   -0.6574   -0.2694    0.6708
 3  -0.4783   -0.3458   -0.4490   -0.6708
 4  -0.8504    0.3928    0.2694    0.2236

               R
         1        2        3
 1  -75.26   -10.63    -1.59
 2    0.00    -2.65    -1.15
 3    0.00     0.00     0.36
 4    0.00     0.00     0.00

          X = Q*R
        1       2       3
 1   4.00    2.00    1.00
 2  16.00    4.00    1.00
 3  36.00    6.00    1.00
 4  64.00    8.00    1.00
ScaLAPACK Example
In this example, the orthogonal matrix Q in the QR decomposition of a matrix A is computed.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LQRRR_INT
      USE LQERR_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    LDA, LDQR, NCA, NRA, DESCA(9), DESCL(9), DESCQ(9)
      INTEGER    INFO, MXCOL, MXLDA, LDQ
      INTEGER, ALLOCATABLE ::  IPVT(:), IPVT0(:)
      LOGICAL    PIVOT
      REAL, ALLOCATABLE ::  A(:,:), QR(:,:), Q(:,:), QRAUX(:)
      REAL, ALLOCATABLE ::  A0(:,:), QR0(:,:), Q0(:,:), QRAUX0(:)
      PARAMETER (NRA=4, NCA=3, LDA=NRA, LDQR=NRA, LDQ=NRA)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(NRA,NCA), Q(NRA,NRA), QR(NRA,NCA), &
                   QRAUX(NCA), IPVT(NCA))
!                                  Set values for A and the righthand sides
         A(1,:) = (/ 1.0, 2.0,  4.0/)
         A(2,:) = (/ 1.0, 4.0, 16.0/)
         A(3,:) = (/ 1.0, 6.0, 36.0/)
         A(4,:) = (/ 1.0, 8.0, 64.0/)
!
         IPVT = 0
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .FALSE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, &
                    INFO)
      CALL DESCINIT(DESCL, 1, NCA, 1, MP_NB, 0, 0, MP_ICTXT, 1, INFO)
      CALL DESCINIT(DESCQ, NRA, NRA, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, &
                    INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), QR0(MXLDA,MXCOL), QRAUX0(MXCOL), &
                IPVT0(MXCOL), Q0(MXLDA,MXLDA))
!                                  Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      PIVOT = .TRUE.
      CALL SCALAPACK_MAP(IPVT, DESCL, IPVT0)
!                                  QR factorization
      CALL LQRRR (A0, QR0, QRAUX0, PIVOT=PIVOT, IPVT=IPVT0)
      CALL LQERR (QR0, QRAUX0, Q0)
!                                  Unmap the results from the distributed
!                                  array back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(Q0, DESCQ, Q)
!                                  Print results.
!                                  Only Rank=0 has the solution, Q.
      IF(MP_RANK .EQ. 0) CALL WRRRN ('Q', Q)
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
LQRSL
Computes the coordinate transformation, the projection, and the complete solution of the least-squares problem
Ax = b.
Required Arguments
KBASIS — Number of columns of the submatrix Ak of A. (Input)
The value KBASIS must not exceed min(NRA, NCA), where NCA is the number of columns in matrix A.
The value NCA is an argument to routine LQRRR. The value of KBASIS is normally NCA unless the
matrix is rank-deficient. The user must analyze the problem data and determine the value of KBASIS.
See Comments.
QR — NRA by NCA array containing information about the QR factorization of A as output from routine
LQRRR/DLQRRR. (Input)
QRAUX — Vector of length NCA containing information about the QR factorization of A as output from
routine LQRRR/DLQRRR. (Input)
B — Vector b of length NRA to be manipulated. (Input)
IPATH — Option parameter specifying what is to be computed. (Input)
The value IPATH has the decimal expansion IJKLM, such that:
I ≠ 0 means compute Qb;
J ≠ 0 means compute QTb;
K ≠ 0 means compute QTb and x;
L ≠ 0 means compute QTb and b − Ax;
M ≠ 0 means compute QTb and Ax.
For example, if the decimal number IPATH = 01101, then I = 0, J = 1, K = 1, L = 0, and M = 1.
Optional Arguments
NRA — Number of rows of matrix A. (Input)
Default: NRA = size (QR,1).
LDQR — Leading dimension of QR exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDQR = size (QR,1).
QB — Vector of length NRA containing Qb if requested in the option IPATH. (Output)
QTB — Vector of length NRA containing QTb if requested in the option IPATH. (Output)
X — Vector of length KBASIS containing the solution of the least-squares problem Akx = b, if this is
requested in the option IPATH. (Output)
If pivoting was requested in routine LQRRR/DLQRRR, then the J-th entry of X will be associated with
column IPVT(J) of the original matrix A. See Comments.
RES — Vector of length NRA containing the residuals (b - Ax) of the least-squares problem if requested in
the option IPATH. (Output)
This vector is the orthogonal projection of b onto the orthogonal complement of the column space of A.
AX — Vector of length NRA containing the least-squares approximation Ax if requested in the option
IPATH. (Output)
This vector is the orthogonal projection of b onto the column space of A.
FORTRAN 90 Interface
Generic:
CALL LQRSL (KBASIS, QR, QRAUX, B, IPATH [, …])
Specific:
The specific interface names are S_LQRSL and D_LQRSL.
FORTRAN 77 Interface
Single:
CALL LQRSL (NRA, KBASIS, QR, LDQR, QRAUX, B, IPATH, QB, QTB, X, RES, AX)
Double:
The double precision name is DLQRSL.
ScaLAPACK Interface
Generic:
CALL LQRSL (KBASIS, QR0, QRAUX0, B0, IPATH [, …])
Specific:
The specific interface names are S_LQRSL and D_LQRSL.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
The underlying code of routine LQRSL is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
The most important use of LQRSL is for solving the least-squares problem Ax = b, with coefficient matrix A
and data vector b. This problem can be formulated, using the normal equations method, as AT Ax = ATb. Using
LQRRR the QR decomposition of A, AP = QR, is computed. Here P is a permutation matrix (P-1 = PT), Q is an
orthogonal matrix (Q-1 = QT) and R is an upper trapezoidal matrix. The normal equations can then be written
as
(PRT)(QTQ)R(PTx) = (PRT)QT b
If ATA is nonsingular, then R is also nonsingular and the normal equations can be written as R(PTx) = QTb.
LQRSL can be used to compute QT b and then solve for PT x. Note that the permuted solution is returned.
The routine LQRSL can also be used to compute the least-squares residual, b - Ax. This is the projection of b
onto the orthogonal complement of the column space of A. It can also compute Qb, QTb and Ax, the orthogonal projection of b onto the column space of A.
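The short fragment below is a hedged sketch (not part of the manual's example set) of how the permuted solution
mentioned above can be unscrambled with the IPVT array returned by LQRRR; the data values mirror the output
of Example 1 below.

!                                 Hypothetical sketch: undo the column
!                                 permutation applied by LQRRR/LQRSL.
      INTEGER    NCA, KBASIS, J
      PARAMETER  (NCA=3, KBASIS=3)
      INTEGER    IPVT(NCA)
      REAL       X(KBASIS), COEF(NCA)
      DATA IPVT/3, 2, 1/, X/3.000, 2.002, 0.990/
!                                 X(J) is the coefficient of original column
!                                 IPVT(J) of A, so scatter it back in place.
      COEF = 0.0
      DO 10 J=1, KBASIS
         COEF(IPVT(J)) = X(J)
   10 CONTINUE
      END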
Comments
1.      Informational error

        Type   Code   Description
        4      1      Computation of the least-squares solution of AK * X = B is requested, but the
                      upper triangular matrix R from the QR factorization is singular.

2.      This routine is designed to be used together with LQRRR. It assumes that LQRRR/DLQRRR has been
called to get QR, QRAUX and IPVT. The submatrix Ak mentioned above is actually equal to
Ak = (A(IPVT(1)), A(IPVT(2)), …, A(IPVT (KBASIS))), where A(IPVT(I)) is the IPVT(I)-th column of
the original matrix.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
QR0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix QR. QR contains the factored form of the matrix Q in the first min(NRQR, NCQR) columns of the strict lower
trapezoidal part of QR as output from subroutine LQRRR/DLQRRR. (Input)
QRAUX0 — Real vector of length MXCOL containing the local portions of the distributed matrix QRAUX.
QRAUX contains the information about the orthogonal part of the decomposition in the first min(NRA,
NCA) positions as output from subroutine LQRRR/DLQRRR. (Input)
B0 — Real vector of length MXLDA containing the local portions of the distributed vector B. B contains the
vector to be manipulated. (Input)
QB0 — Real vector of length MXLDA containing the local portions of the distributed vector Qb if requested
in the option IPATH. (Output)
QTB0 — Real vector of length MXLDA containing the local portions of the distributed vector QTb if
requested in the option IPATH. (Output)
X0 — Real vector of length MXLDX containing the local portions of the distributed vector X. X contains the
solution of the least-squares problem Akx = b, if this is requested in the option IPATH. (Output)
If pivoting was requested in routine LQRRR/DLQRRR, then the J-th entry of X will be associated with
column IPVT(J) of the original matrix A. See Comments.
RES0 — Real vector of length MXLDA containing the local portions of the distributed vector RES. RES contains the residuals (b - Ax) of the least-squares problem if requested in the option IPATH. (Output)
This vector is the orthogonal projection of b onto the orthogonal complement of the column space of A.
AX0 — Real vector of length MXLDA containing the local portions of the distributed vector AX. AX contains
the least-squares approximation Ax if requested in the option IPATH. (Output)
This vector is the orthogonal projection of b onto the column space of A.
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA, MXLDX and MXCOL can be obtained through a call to
SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the
ScaLAPACK Example below.
Examples
Example 1
Consider the problem of finding the coefficients ci in
f(x) = c0 + c1x + c2x2
given data at xi = 2i, i = 1, 2, 3, 4, using the method of least squares. Each row of the matrix A contains the
values of 1, xi and xi2 at one of the data points. The vector b contains the data. The routine LQRRR is used to
compute the QR decomposition of A. LQRSL is then used to solve the least-squares problem and compute the
residual vector.
USE IMSL_LIBRARIES
!                                  Declare variables
      PARAMETER  (NRA=4, NCA=3, KBASIS=3, LDA=NRA, LDQR=NRA)
      INTEGER    IPVT(NCA)
      REAL       A(LDA,NCA), QR(LDQR,NCA), QRAUX(NCA), CONORM(NCA), &
                 X(KBASIS), QB(1), QTB(NRA), RES(NRA), &
                 AX(1), B(NRA)
      LOGICAL    PIVOT
!
!                                  Set values for A
!
!                                  A = (  1   2    4 )
!                                      (  1   4   16 )
!                                      (  1   6   36 )
!                                      (  1   8   64 )
!
      DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!
!                                  Set values for B
!
!                                  B = ( 16.99  57.01  120.99  209.01 )
!
      DATA B/ 16.99, 57.01, 120.99, 209.01 /
!
!                                  QR factorization
      PIVOT = .TRUE.
      IPVT = 0
      CALL LQRRR (A, QR, QRAUX, PIVOT=PIVOT, IPVT=IPVT)
!                                  Solve the least squares problem
      IPATH = 00110
      CALL LQRSL (KBASIS, QR, QRAUX, B, IPATH, X=X, RES=RES)
!                                  Print results
      CALL WRIRN ('IPVT', IPVT, 1, NCA, 1)
      CALL WRRRN ('X', X, 1, KBASIS, 1)
      CALL WRRRN ('RES', RES, 1, NRA, 1)
!
      END
Output
   IPVT
 1   2   3
 3   2   1

           X
     1       2       3
 3.000   2.002   0.990

               RES
        1          2          3          4
 -0.00400    0.01200   -0.01200    0.00400
Note that since IPVT is (3, 2, 1) the array X contains the solution coefficients ci in reverse order.
ScaLAPACK Example
The previous example is repeated here as a distributed example. Consider the problem of finding the coefficients ci in
f(x) = c0 + c1x + c2x2
given data at xi = 2i, i = 1, 2, 3, 4, using the method of least squares. Each row of the matrix A contains the
values of 1, xi and xi2 at one of the data points. The vector b contains the data. The routine LQRRR is used to
compute the QR decomposition of A. LQRSL is then used to solve the least-squares problem and compute the
residual vector. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to
map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LQRRR_INT
      USE LQRSL_INT
      USE WRIRN_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    KBASIS, LDA, LDQR, NCA, NRA, DESCA(9), DESCL(9), &
                 DESCX(9), DESCB(9)
      INTEGER    INFO, MXCOL, MXCOLX, MXLDA, MXLDX, LDQ, IPATH
      INTEGER, ALLOCATABLE ::  IPVT(:), IPVT0(:)
      REAL, ALLOCATABLE ::  A(:,:), B(:), QR(:,:), QRAUX(:), X(:), &
                            RES(:)
      REAL, ALLOCATABLE ::  A0(:,:), QR0(:,:), QRAUX0(:), X0(:), &
                            RES0(:), B0(:), QTB0(:)
      LOGICAL    PIVOT
      PARAMETER (NRA=4, NCA=3, LDA=NRA, LDQR=NRA, KBASIS=3)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,NCA), B(NRA), QR(LDQR,NCA), &
                   QRAUX(NCA), IPVT(NCA), X(NCA), RES(NRA))
!                                  Set values for A and the righthand sides
         A(1,:) = (/ 1.0, 2.0,  4.0/)
         A(2,:) = (/ 1.0, 4.0, 16.0/)
         A(3,:) = (/ 1.0, 6.0, 36.0/)
         A(4,:) = (/ 1.0, 8.0, 64.0/)
!
         B      = (/ 16.99, 57.01, 120.99, 209.01 /)
!
         IPVT = 0
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  and MXCOL
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
      CALL SCALAPACK_GETDIM(KBASIS, 1, MP_NB, 1, MXLDX, MXCOLX)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
                    MXLDA, INFO)
      CALL DESCINIT(DESCL, 1, NCA, 1, MP_NB, 0, 0, MP_ICTXT, 1, INFO)
      CALL DESCINIT(DESCX, KBASIS, 1, MP_NB, 1, 0, 0, MP_ICTXT, MXLDX, INFO)
      CALL DESCINIT(DESCB, NRA, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), QR0(MXLDA,MXCOL), QRAUX0(MXCOL), &
                IPVT0(MXCOL), B0(MXLDA), X0(MXLDX), RES0(MXLDA), QTB0(MXLDA))
!                                  Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCB, B0)
      PIVOT = .TRUE.
      CALL SCALAPACK_MAP(IPVT, DESCL, IPVT0)
!                                  QR factorization
      CALL LQRRR (A0, QR0, QRAUX0, PIVOT=PIVOT, IPVT=IPVT0)
      IPATH = 00110
      CALL LQRSL (KBASIS, QR0, QRAUX0, B0, IPATH, QTB=QTB0, X=X0, RES=RES0)
!                                  Unmap the results from the distributed
!                                  array back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(IPVT0, DESCL, IPVT, NCA, .FALSE.)
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
      CALL SCALAPACK_UNMAP(RES0, DESCB, RES)
!                                  Print results.
!                                  Only Rank=0 has the solution, X.
      IF(MP_RANK .EQ. 0) THEN
         CALL WRIRN ('IPVT', IPVT, 1, NCA, 1)
         CALL WRRRN ('X', X, 1, KBASIS, 1)
         CALL WRRRN ('RES', RES, 1, NRA, 1)
      ENDIF
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
   IPVT
 1   2   3
 3   2   1

           X
     1       2       3
 3.000   2.002   0.990

               RES
        1          2          3          4
 -0.00400    0.01200   -0.01200    0.00400
Note that since IPVT is (3, 2, 1) the array X contains the solution coefficients ci in reverse order.
LUPQR
Computes an updated QR factorization after the rank-one matrix αxyT is added.
Required Arguments
ALPHA — Scalar determining the rank-one update to be added. (Input)
W — Vector of length NROW determining the rank-one matrix to be added. (Input)
The updated matrix is A + αxyT. If I = 0 then W contains the vector x. If I = 1 then W contains the vector QTx.
Y — Vector of length NCOL determining the rank-one matrix to be added. (Input)
R — Matrix of order NROW by NCOL containing the R matrix from the QR factorization. (Input)
Only the upper trapezoidal part of R is referenced.
IPATH — Flag used to control the computation of the QR update. (Input)
IPATH has the decimal expansion IJ such that:
I = 0 means W contains the vector x.
I = 1 means W contains the vector QTx.
J = 0 means do not update the matrix Q.
J = 1 means update the matrix Q.
For example, if IPATH = 10 then I = 1 and J = 0.
RNEW — Matrix of order NROW by NCOL containing the updated R matrix in the QR factorization. (Output)
Only the upper trapezoidal part of RNEW is updated. R and RNEW may be the same.
Optional Arguments
NROW — Number of rows in the matrix A = Q * R. (Input)
Default: NROW = size (W,1).
NCOL — Number of columns in the matrix A = Q * R. (Input)
Default: NCOL = size (Y,1).
Q — Matrix of order NROW containing the Q matrix from the QR factorization. (Input)
Ignored if IPATH = 0.
Default: Q is 1x1 and un-initialized.
LDQ — Leading dimension of Q exactly as specified in the dimension statement of the calling program.
(Input)
Ignored if IPATH = 0.
Default: LDQ = size (Q,1).
LDR — Leading dimension of R exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDR = size (R,1).
QNEW — Matrix of order NROW containing the updated Q matrix in the QR factorization. (Output)
Ignored if J = 0. See IPATH for a definition of J.
LDQNEW — Leading dimension of QNEW exactly as specified in the dimension statement of the calling
program. (Input)
Ignored if J = 0. See IPATH for a definition of J.
Default: LDQNEW = size (QNEW,1).
LDRNEW — Leading dimension of RNEW exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDRNEW = size (RNEW,1).
FORTRAN 90 Interface
Generic:
CALL LUPQR (ALPHA, W, Y, R, IPATH, RNEW [, …])
Specific:
The specific interface names are S_LUPQR and D_LUPQR.
FORTRAN 77 Interface
Single:
CALL LUPQR (NROW, NCOL, ALPHA, W, Y, Q, LDQ, R, LDR, IPATH, QNEW, LDQNEW, RNEW,
LDRNEW)
Double:
The double precision name is DLUPQR.
Description
Let A be an m × n matrix and let A = QR be its QR decomposition. (In the program, m is called NROW and n is
called NCOL.) Then
A + αxyT = QR + αxyT = Q(R + αQTxyT) = Q(R + αwyT)
where w = QT x. An orthogonal transformation J can be constructed, using a sequence of m − 1 Givens rota-
tions, such that Jw = ωe1, where ω = ±∥w∥2 and e1 = (1, 0, …, 0)T. Then
A + αxyT = (QJT)(JR + αωe1yT)
Since JR is an upper Hessenberg matrix, H = JR + αωe1yT is also an upper Hessenberg matrix. Again using
m − 1 Givens rotations, an orthogonal transformation G can be constructed such that GH is an upper triangular matrix. Then
A + αxyT = Q̃R̃

where Q̃ = QJTGT is orthogonal and R̃ = GH is upper triangular.
If the last k components of w are zero, then the number of Givens rotations needed to construct J or G is
m − k − 1 instead of m − 1.
For further information, see Dennis and Schnabel (1983, pages 55-58 and 311-313), or Golub and Van Loan
(1983, pages 437− 439).
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2PQR/DL2PQR. The reference is:
CALL L2PQR (NROW, NCOL, ALPHA, W, Y, Q, LDQ, R, LDR, IPATH, QNEW, LDQNEW, RNEW, LDRNEW, Z,
WORK)
The additional arguments are as follows:
Z — Work vector of length NROW.
WORK — Work vector of length MIN(NROW − 1, NCOL).
Example
The QR factorization of A is found. It is then used to find the QR factorization of A + xyT. Since pivoting is
used, the QR factorization routine finds AP = QR, where P is a permutation matrix determined by IPVT. We
compute
(A + αxyT)P = AP + αx(Py)T = Q̃R̃

where Py denotes the vector y permuted according to IPVT. The IMSL routine PERMU (see Utilities) is used to compute Py. As a check,
Q̃R̃ is computed and printed. It can also be obtained from A + xyT by permuting its columns using the order
given by IPVT.
      USE IMSL_LIBRARIES
!                                 Declare variables
      INTEGER    LDA, LDAQR, LDQ, LDQNEW, LDQR, LDR, LDRNEW, NCOL, NROW
      PARAMETER  (NCOL=3, NROW=4, LDA=NROW, LDAQR=NROW, LDQ=NROW, &
                 LDQNEW=NROW, LDQR=NROW, LDR=NROW, LDRNEW=NROW)
!
      INTEGER    IPATH, IPVT(NCOL), J, MIN0
      REAL       A(LDA,NCOL), ALPHA, AQR(LDAQR,NCOL), CONORM(NCOL), &
                 Q(LDQ,NROW), QNEW(LDQNEW,NROW), QR(LDQR,NCOL), &
                 QRAUX(NCOL), R(LDR,NCOL), RNEW(LDRNEW,NCOL), W(NROW), &
                 Y(NCOL)
      LOGICAL    PIVOT
      INTRINSIC  MIN0
!
!                                 Set values for A
!
!                                 A = (  1   2   4  )
!                                     (  1   4  16  )
!                                     (  1   6  36  )
!                                     (  1   8  64  )
!
      DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!                                 Set values for W and Y
      DATA W/1., 2., 3., 4./
      DATA Y/3., 2., 1./
!
!                                 QR factorization
!                                 Set IPVT = 0 (all columns free)
      IPVT = 0
      PIVOT = .TRUE.
      CALL LQRRR (A, QR, QRAUX, IPVT=IPVT, PIVOT=PIVOT)
!                                 Accumulate Q
      CALL LQERR (QR, QRAUX, Q)
!                                 Permute Y
      CALL PERMU (Y, IPVT, Y)
!                                 R is the upper trapezoidal part of QR
      R = 0.0E0
      DO 10 J=1, NCOL
         CALL SCOPY (MIN0(J,NROW), QR(:,J), 1, R(:,J), 1)
   10 CONTINUE
!                                 Update Q and R
      ALPHA = 1.0
      IPATH = 01
      CALL LUPQR (ALPHA, W, Y, R, IPATH, RNEW, Q=Q, QNEW=QNEW)
!                                 Compute AQR = Q*R
      CALL MRRRR (QNEW, RNEW, AQR)
!                                 Print results
      CALL WRIRN ('IPVT', IPVT, 1, NCOL, 1)
      CALL WRRRN ('QNEW', QNEW)
      CALL WRRRN ('RNEW', RNEW)
      CALL WRRRN ('QNEW*RNEW', AQR)
      END
Output
        IPVT
   1   2   3
   3   2   1

                 QNEW
          1         2         3         4
 1  -0.0620   -0.5412    0.8082   -0.2236
 2  -0.2234   -0.6539   -0.2694    0.6708
 3  -0.4840   -0.3379   -0.4490   -0.6708
 4  -0.8438    0.4067    0.2694    0.2236

              RNEW
          1        2        3
 1   -80.59   -21.34   -17.62
 2     0.00    -4.94    -4.83
 3     0.00     0.00     0.36
 4     0.00     0.00     0.00

          QNEW*RNEW
         1       2       3
 1    5.00    4.00    4.00
 2   18.00    8.00    7.00
 3   39.00   12.00   10.00
 4   68.00   16.00   13.00
LCHRG
Computes the Cholesky decomposition of a symmetric positive definite matrix with optional column
pivoting.
Required Arguments
A — N by N symmetric positive definite matrix to be decomposed. (Input)
Only the upper triangle of A is referenced.
FACT — N by N matrix containing the Cholesky factor of the permuted matrix in its upper triangle. (Output)
If A is not needed, A and FACT can share the same storage locations.
Optional Arguments
N — Order of the matrix A. (Input)
Default: N = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
PIVOT — Logical variable. (Input)
PIVOT = .TRUE. means column pivoting is done. PIVOT = .FALSE. means no pivoting is done.
Default: PIVOT = .TRUE.
IPVT — Integer vector of length N containing information that controls the selection of the pivot columns.
(Input/Output)
On input, if IPVT(K) > 0, then the K-th column of A is an initial column; if IPVT(K) = 0, then the K-th
column of A is a free column; if IPVT(K) < 0, then the K-th column of A is a final column. See Comments. On output, IPVT(K) contains the index of the diagonal element of A that was moved into the Kth position. IPVT is only referenced when PIVOT is equal to .TRUE..
LDFACT — Leading dimension of FACT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
CALL LCHRG (A, FACT [, …])
Specific:
The specific interface names are S_LCHRG and D_LCHRG.
FORTRAN 77 Interface
Single:
CALL LCHRG (N, A, LDA, PIVOT, IPVT, FACT, LDFACT)
Double:
The double precision name is DLCHRG.
Description
Routine LCHRG is based on the LINPACK routine SCHDC; see Dongarra et al. (1979).
Before the decomposition is computed, initial elements are moved to the leading part of A and final elements
to the trailing part of A. During the decomposition only rows and columns corresponding to the free elements are moved. The result of the decomposition is an upper triangular matrix R and a permutation matrix
P that satisfy PT AP = RT R, where P is represented by IPVT.
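The following small program is a rough consistency check of this relation (our own sketch, not part of the LCHRG documentation). It reuses the data of the example below; R and AP are illustrative local names, and the full symmetric matrix is assumed to be stored in A:

      USE IMSL_LIBRARIES
      INTEGER, PARAMETER :: N=3
      REAL    A(N,N), FACT(N,N), R(N,N), AP(N,N)
      INTEGER IPVT(N), I, J
      DATA A/1.,-3.,2.,-3.,10.,-5.,2.,-5.,6./
      IPVT = 0
      CALL LCHRG (A, FACT, PIVOT=.TRUE., IPVT=IPVT)
!                                 Copy the upper triangular factor only
      R = 0.0
      DO J=1, N
         R(1:J,J) = FACT(1:J,J)
      END DO
!                                 (PT A P)(I,J) = A(IPVT(I), IPVT(J))
      DO J=1, N
         DO I=1, N
            AP(I,J) = A(IPVT(I),IPVT(J))
         END DO
      END DO
      PRINT *, 'max |PT A P - RT R| =', &
               MAXVAL(ABS(AP - MATMUL(TRANSPOSE(R), R)))
      END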
Comments
1.
Informational error
Type
Code
Description
4
1
The input matrix is not positive definite.
2.
Before the decomposition is computed, initial elements are moved to the leading part of A and final
elements to the trailing part of A. During the decomposition only rows and columns corresponding to
the free elements are moved. The result of the decomposition is an upper triangular matrix R and a
permutation matrix P that satisfy PT AP = RT R, where P is represented by IPVT.
3.
LCHRG can be used together with subroutines PERMU and LSLDS to solve the positive definite linear
system AX = B with the solution X overwriting the right-hand side B as follows:
      CALL ISET  (N, 0, IPVT, 1)
      CALL LCHRG (A, FACT, N, LDA, .TRUE., IPVT, LDFACT)
      CALL PERMU (B, IPVT, B, N, 1)
      CALL LSLDS (FACT, B, B, N, LDFACT)
      CALL PERMU (B, IPVT, B, N, 2)
Example
Routine LCHRG can be used together with the IMSL routines PERMU (see Chapter 11) and LFSDS to solve a
positive definite linear system Ax = b. Since A = PRT RP, the system Ax = b is equivalent to RT R(Px) = Pb.
LFSDS is used to solve RT Ry = Pb for y. The routine PERMU is used to compute both Pb and x = Py.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (N=3, LDA=N, LDFACT=N)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), FACT(LDFACT,N), B(N), X(N)
      LOGICAL    PIVOT
!
!                                 Set values for A and B
!
!                                 A = (  1  -3   2 )
!                                     ( -3  10  -5 )
!                                     (  2  -5   6 )
!
!                                 B = ( 27  -78  64 )
!
      DATA A/1.,-3.,2.,-3.,10.,-5.,2.,-5.,6./
      DATA B/27.,-78.,64./
!                                 Pivot using all columns
      PIVOT = .TRUE.
      IPVT = 0
!                                 Compute Cholesky factorization
      CALL LCHRG (A, FACT, PIVOT=PIVOT, IPVT=IPVT)
!                                 Permute B and store in X
      CALL PERMU (B, IPVT, X, IPATH=1)
!                                 Solve for X
      CALL LFSDS (FACT, X, X)
!                                 Inverse permutation
      CALL PERMU (X, IPVT, X, IPATH=2)
!                                 Print X
      CALL WRRRN ('X', X, 1, N, 1)
!
      END
Output
        X
 1    1.000
 2   -4.000
 3    7.000
LUPCH
Updates the RT R Cholesky factorization of a real symmetric positive definite matrix after a rank-one matrix
is added.
Required Arguments
R — N by N upper triangular matrix containing the upper triangular factor to be updated. (Input)
Only the upper triangle of R is referenced.
X — Vector of length N determining the rank-one matrix to be added to the factorization RT R. (Input)
RNEW — N by N upper triangular matrix containing the updated triangular factor of RT R + XXT. (Output)
Only the upper triangle of RNEW is referenced. If R is not needed, R and RNEW can share the same storage locations.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (R,2).
LDR — Leading dimension of R exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDR = size (R,1).
LDRNEW — Leading dimension of RNEW exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDRNEW = size (RNEW,1).
CS — Vector of length N containing the cosines of the rotations. (Output)
SN — Vector of length N containing the sines of the rotations. (Output)
FORTRAN 90 Interface
Generic:
CALL LUPCH (R, X, RNEW [, …])
Specific:
The specific interface names are S_LUPCH and D_LUPCH.
FORTRAN 77 Interface
Single:
CALL LUPCH (N, R, LDR, X, RNEW, LDRNEW, CS, SN)
Double:
The double precision name is DLUPCH.
Description
The routine LUPCH is based on the LINPACK routine SCHUD; see Dongarra et al. (1979).
The Cholesky factorization of a matrix is A = RT R, where R is an upper triangular matrix. Given this factorization, LUPCH computes the factorization
A + xxT = R̃TR̃

In the program, R̃ is called RNEW.

LUPCH determines an orthogonal matrix U as the product GN…G1 of Givens rotations, such that

        ( R  )   ( R̃ )
      U (    ) = (   )
        ( xT )   ( 0 )

By multiplying this equation by its transpose, and noting that UTU = I, the desired result

RTR + xxT = R̃TR̃

is obtained.

Each Givens rotation, Gi, is chosen to zero out an element in xT. The matrix Gi is (N + 1) × (N + 1) and has the form

           ( I(i-1)                 )
      Gi = (         ci        si   )
           (             I(N-i)     )
           (        -si        ci   )

where Ik is the identity matrix of order k and ci = cosθi = CS(I), si = sinθi = SN(I) for some θi.
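As a quick numerical illustration of the identity above (our own sketch with sample data, not part of the LUPCH documentation), the updated factor returned by LUPCH can be compared against RTR + xxT directly:

      USE IMSL_LIBRARIES
      INTEGER, PARAMETER :: N=3
      REAL R(N,N), RNEW(N,N), UP(N,N), X(N)
      INTEGER J
!                                 Sample upper triangular R and update vector x
      R = RESHAPE((/1.,0.,0., -3.,1.,0., 2.,1.,1./), (/N,N/))
      X = (/3., 2., 1./)
      RNEW = 0.0
      CALL LUPCH (R, X, RNEW)
!                                 Keep only the upper triangle of RNEW
      UP = 0.0
      DO J=1, N
         UP(1:J,J) = RNEW(1:J,J)
      END DO
!                                 UP**T * UP should equal R**T * R + x x**T
      PRINT *, 'max residual =', MAXVAL(ABS(MATMUL(TRANSPOSE(UP), UP) - &
               MATMUL(TRANSPOSE(R), R) - SPREAD(X,2,N)*SPREAD(X,1,N)))
      END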
Example
A linear system Az = b is solved using the Cholesky factorization of A. This factorization is then updated and
the system (A + xxT) z = b is solved using this updated factorization.
      USE IMSL_LIBRARIES
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      REAL       A(LDA,LDA), FACT(LDFACT,LDFACT), FACNEW(LDFACT,LDFACT), &
                 X(N), B(N), CS(N), SN(N), Z(N)
!
!                                 Set values for A
!
!                                 A = (  1.0  -3.0   2.0 )
!                                     ( -3.0  10.0  -5.0 )
!                                     (  2.0  -5.0   6.0 )
!
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!
!                                 Set values for X and B
      DATA X/3.0, 2.0, 1.0/
      DATA B/53.0, 20.0, 31.0/
!
!                                 Factor the matrix A
      CALL LFTDS (A, FACT)
!
!                                 Solve the original system
      CALL LFSDS (FACT, B, Z)
!
!                                 Print the results
      CALL WRRRN ('FACT', FACT, ITRING=1)
      CALL WRRRN ('Z', Z, 1, N, 1)
!                                 Update the factorization
      CALL LUPCH (FACT, X, FACNEW)
!                                 Solve the updated system
      CALL LFSDS (FACNEW, B, Z)
!                                 Print the results
      CALL WRRRN ('FACNEW', FACNEW, ITRING=1)
      CALL WRRRN ('Z', Z, 1, N, 1)
!
      END
Output
             FACT
        1       2       3
 1   1.000  -3.000   2.000
 2           1.000   1.000
 3                   1.000

          Z
 1   1860.0
 2    433.0
 3   -254.0

            FACNEW
        1       2       3
 1   3.162   0.949   1.581
 2           3.619  -1.243
 3                  -1.719

        Z
 1   4.000
 2   1.000
 3   2.000
LDNCH
Downdates the RT R Cholesky factorization of a real symmetric positive definite matrix after a rank-one
matrix is removed.
Required Arguments
R — N by N upper triangular matrix containing the upper triangular factor to be downdated. (Input)
Only the upper triangle of R is referenced.
X — Vector of length N determining the rank-one matrix to be subtracted from the factorization RT R.
(Input)
RNEW — N by N upper triangular matrix containing the downdated triangular factor of RT R − X XT.
(Output)
Only the upper triangle of RNEW is referenced. If R is not needed, R and RNEW can share the same storage locations.
Optional Arguments
N — Order of the matrix. (Input)
Default: N = size (R,2).
LDR — Leading dimension of R exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDR = size (R,1).
LDRNEW — Leading dimension of RNEW exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDRNEW = size (RNEW,1).
CS — Vector of length N containing the cosines of the rotations. (Output)
SN — Vector of length N containing the sines of the rotations. (Output)
FORTRAN 90 Interface
Generic:
CALL LDNCH (R, X, RNEW [, …])
Specific:
The specific interface names are S_LDNCH and D_LDNCH.
FORTRAN 77 Interface
Single:
CALL LDNCH (N, R, LDR, X, RNEW, LDRNEW, CS, SN)
Double:
The double precision name is DLDNCH.
Description
The routine LDNCH is based on the LINPACK routine SCHDD; see Dongarra et al. (1979).
The Cholesky factorization of a matrix is A = RT R, where R is an upper triangular matrix. Given this factorization, LDNCH computes the factorization
A − xxT = R̃TR̃

In the program, R̃ is called RNEW. This is not always possible, since A − xxT may not be positive definite.

LDNCH determines an orthogonal matrix U as the product GN…G1 of Givens rotations, such that

        ( R )   ( R̃ )
      U (   ) = (    )
        ( 0 )   ( xT )

By multiplying this equation by its transpose and noting that UTU = I, the desired result

RTR − xxT = R̃TR̃

is obtained.

Let a be the solution of the linear system RTa = x and let

      α = √(1 − ∥a∥²)

The Givens rotations, Gi, are chosen such that

        ( a )   ( 0 )
      U (   ) = (   )
        ( α )   ( 1 )

The Gi are (N + 1) × (N + 1) matrices of the form

           ( I(i-1)                 )
      Gi = (         ci       -si   )
           (             I(N-i)     )
           (         si        ci   )

where Ik is the identity matrix of order k; and ci = cosθi = CS(I), si = sinθi = SN(I) for some θi.

The Givens rotations are then used to form

        ( R )   ( R̃  )
      U (   ) = (    )
        ( 0 )   ( x̃T )

The matrix R̃ is upper triangular and x̃ = x because

      x = RTa = (RT 0) UTU (a, α)T = (R̃T x̃) U (a, α)T = (R̃T x̃)(0, 1)T = x̃
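A rough feasibility check (our own sketch, reusing the data of the example below): solve RTa = x by forward substitution and test whether ∥a∥2 < 1 before attempting the downdate.

      INTEGER, PARAMETER :: N=3
      REAL R(N,N), X(N), ASOL(N)
      INTEGER I
!                                 Upper triangular factor of A and the vector x
      R = RESHAPE((/3.162,0.,0., 0.949,3.619,0., 1.581,-1.243,1.719/), (/N,N/))
      X = (/3., 2., 1./)
!                                 Forward substitution on RT a = x
      DO I = 1, N
         ASOL(I) = (X(I) - DOT_PRODUCT(R(1:I-1,I), ASOL(1:I-1)))/R(I,I)
      END DO
      IF (SQRT(DOT_PRODUCT(ASOL, ASOL)) < 1.0) THEN
         PRINT *, 'downdated matrix is positive definite'
      ELSE
         PRINT *, 'RT R - x xT is not positive definite; LDNCH would fail'
      END IF
      END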
Comments
Informational error
Type
Code
Description
4
1
RTR − X XT is not positive definite. R cannot be downdated.
Example
A linear system Az = b is solved using the Cholesky factorization of A. This factorization is then downdated,
and the system (A − xxT)z = b is solved using this downdated factorization.
      USE LDNCH_INT
      USE LFTDS_INT
      USE LFSDS_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      REAL       A(LDA,LDA), FACT(LDFACT,LDFACT), FACNEW(LDFACT,LDFACT), &
                 X(N), B(N), CS(N), SN(N), Z(N)
!
!                                 Set values for A
!
!                                 A = ( 10.0   3.0   5.0 )
!                                     (  3.0  14.0  -3.0 )
!                                     (  5.0  -3.0   7.0 )
!
      DATA A/10.0, 3.0, 5.0, 3.0, 14.0, -3.0, 5.0, -3.0, 7.0/
!
!                                 Set values for X and B
      DATA X/3.0, 2.0, 1.0/
      DATA B/53.0, 20.0, 31.0/
!
!                                 Factor the matrix A
      CALL LFTDS (A, FACT)
!
!                                 Solve the original system
      CALL LFSDS (FACT, B, Z)
!
!                                 Print the results
      CALL WRRRN ('FACT', FACT, ITRING=1)
      CALL WRRRN ('Z', Z, 1, N, 1)
!                                 Downdate the factorization
      CALL LDNCH (FACT, X, FACNEW)
!                                 Solve the downdated system
      CALL LFSDS (FACNEW, B, Z)
!                                 Print the results
      CALL WRRRN ('FACNEW', FACNEW, ITRING=1)
      CALL WRRRN ('Z', Z, 1, N, 1)
!
      END
Output
             FACT
        1       2       3
 1   3.162   0.949   1.581
 2           3.619  -1.243
 3                   1.719

        Z
 1   4.000
 2   1.000
 3   2.000

            FACNEW
        1       2       3
 1   1.000  -3.000   2.000
 2           1.000   1.000
 3                   1.000

          Z
 1   1859.9
 2    433.0
 3   -254.0
LSVRR
Computes the singular value decomposition of a real matrix.
Required Arguments
A — NRA by NCA matrix whose singular value decomposition is to be computed. (Input)
IPATH — Flag used to control the computation of the singular vectors. (Input)
IPATH has the decimal expansion IJ such that:
I = 0 means do not compute the left singular vectors.
I = 1 means return the NRA left singular vectors in U.
NOTE: This option is not available for the ScaLAPACK interface. If this option is chosen for
ScaLAPACK usage, the min(NRA, NCA) left singular vectors will be returned.
I = 2 means return only the min(NRA, NCA) left singular vectors in U.
J = 0 means do not compute the right singular vectors.
J = 1 means return the right singular vectors in V.
NOTE: If this option is chosen for ScaLAPACK usage, the min(NRA, NCA) right singular vectors
will be returned.
For example, IPATH = 20 means I = 2 and J = 0.
S — Vector of length min(NRA + 1, NCA) containing the singular values of A in descending order of magnitude in the first min(NRA, NCA) positions. (Output)
Optional Arguments
NRA — Number of rows in the matrix A. (Input)
Default: NRA = size (A,1).
NCA — Number of columns in the matrix A. (Input)
Default: NCA = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
TOL — Scalar containing the tolerance used to determine when a singular value is negligible. (Input)
If TOL is positive, then a singular value σi is considered negligible if σi ≤ TOL. If TOL is negative, then a
singular value σi is considered negligible if σi ≤ ∣TOL∣ * ∥A∥∞. In this case, |TOL| generally contains an
estimate of the level of the relative error in the data.
Default: TOL = 1.0e-5 for single precision and 1.0d-10 for double precision.
IRANK — Scalar containing an estimate of the rank of A. (Output)
U — NRA by NCU matrix containing the left singular vectors of A. (Output)
NCU must be equal to NRA if I is equal to 1. NCU must be equal to min(NRA, NCA) if I is equal to 2. U will
not be referenced if I is equal to zero. If NRA is less than or equal to NCU, then U can share the same
storage locations as A. See Comments.
LDU — Leading dimension of U exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDU = size (U,1).
V — NCA by NCA matrix containing the right singular vectors of A. (Output)
V will not be referenced if J is equal to zero. V can share the same storage location as A, however, U and
V cannot both coincide with A simultaneously.
LDV — Leading dimension of V exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDV = size (V,1).
FORTRAN 90 Interface
Generic:
CALL LSVRR (A, IPATH, S [ , …])
Specific:
The specific interface names are S_LSVRR and D_LSVRR.
FORTRAN 77 Interface
Single:
CALL LSVRR (NRA, NCA, A, LDA, IPATH, TOL, IRANK, S, U, LDU, V, LDV)
Double:
The double precision name is DLSVRR.
ScaLAPACK Interface
Generic:
CALL LSVRR (A0, IPATH, S [, …])
Specific:
The specific interface names are S_LSVRR and D_LSVRR.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
The underlying code of routine LSVRR is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Let n = NRA (the number of rows in A) and let p = NCA (the number of columns in A). For any n × p matrix A,
there exists an n × n orthogonal matrix U and a p × p orthogonal matrix V such that
UTAV = [∑, 0]     if n ≤ p
UTAV = [∑, 0]T    if n ≥ p
where ∑ = diag(σ1, …, σm), and m = min(n, p). The scalars σ1 ≥ σ2 ≥ … ≥ σm ≥ 0 are called the singular values of A. The columns of U are called the left singular vectors of A. The columns of V are called the right
singular vectors of A.
The estimated rank of A is the number of σk that are larger than a tolerance η. If τ is the parameter TOL in the
program, then

η = τ            if τ > 0
η = |τ| ∥A∥∞     if τ < 0

Comments
1.
Workspace may be explicitly provided, if desired, by use of L2VRR/DL2VRR. The reference is:
CALL L2VRR (NRA, NCA, A, LDA, IPATH, TOL, IRANK, S, U, LDU, V, LDV, ACOPY, WK)
The additional arguments are as follows:
ACOPY — NRA × NCA work array for the matrix A. If A is not needed, then A and ACOPY may
share the same storage locations.
WK — Work vector of length NRA + NCA + max(NRA, NCA) − 1.
2.
Informational error
Type
Code
Description
4
1
Convergence cannot be achieved for all the singular values and their corresponding singular vectors.
3.
When NRA is much greater than NCA, it might not be reasonable to store the whole matrix U. In this
case, IPATH with I = 2 allows a singular value factorization of A to be computed in which only the
first NCA columns of U are computed, and in many applications those are all that are needed (a small sketch follows these comments).
4.
Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2VRR the leading dimension of ACOPY is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSVRR. Additional memory allocation for ACOPY and option value restoration
are done automatically in LSVRR. Users directly calling L2VRR can allocate additional space for
ACOPY and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies. There is no requirement that users change existing applications that use LSVRR or L2VRR.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSVRR temporarily replaces IVAL(2) by IVAL(1). The routine L2CRG computes the condition
number if IVAL(2) = 2. Otherwise L2CRG skips this computation. LSVRR restores the option.
Default values for the option are IVAL(*) = 1, 2.
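As a rough illustration of comment 3 above (our own sketch, not an official example), the following fragment requests only the economy-size left singular vectors and skips V entirely; the all-ones matrix is placeholder data and the dimensions are illustrative:

      USE IMSL_LIBRARIES
      INTEGER, PARAMETER :: NRA=6, NCA=4
      REAL A(NRA,NCA), U(NRA,NCA), S(NCA)
!                                 Placeholder data; any tall matrix will do
      A = 1.0
!                                 I = 2: economy-size U, J = 0: skip V
      IPATH = 20
      CALL LSVRR (A, IPATH, S, U=U)
      CALL WRRRN ('S', S, 1, NCA, 1)
      END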
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the matrix whose singular value decomposition is to be computed. (Input)
U0 — MXLDU by MXCOLU local matrix containing the local portions of the left singular vectors of the distributed matrix A. (Output)
U0 will not be referenced if I is equal to zero. In contrast to the LINPACK and LAPACK based versions
of LSVRR, U0 and A0 cannot share the same storage locations.
V0 — MXLDV by MXCOLV local matrix containing the local portions of the right singular vectors of the distributed matrix A. (Output)
V0 will not be referenced if J is equal to zero. In contrast to the LINPACK and LAPACK based versions
of LSVRR, V0 and A0 cannot share the same storage locations.
Furthermore, the optional arguments NRA, NCA, LDA, LDU and LDV describe properties of the local arrays A0,
U0, and V0, respectively. For example, NRA is the number of rows in matrix A0 which defaults to
NRA = size (A0,1). The remaining arguments IPATH, S, TOL and IRANK are global and are the same as
described for the standard version of the routine.
In the argument descriptions above, MXLDA, MXCOL, MXLDU, MXCOLU, MXLDV and MXCOLV can be obtained
through a call to ScaLAPACK_GETDIM (Chapter 11, “Utilities”) after a call to ScaLAPACK_SETUP (Chapter 11,
“Utilities”) has been made. If MXLDA or MXCOL is equal to 0, then A0 should be defined as an array of nonzero
size, e.g., a 1 by 1 array. The same applies to the MXLDU/MXCOLU and MXLDV/MXCOLV pairs, respectively. See
the ScaLAPACK Example below.
Examples
Example 1
This example computes the singular value decomposition of a 6 × 4 matrix A. The matrices U and V containing the left and right singular vectors, respectively, and the diagonal of ∑, containing singular values, are
printed. On some systems, the signs of some of the columns of U and V may be reversed.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (NRA=6, NCA=4, LDA=NRA, LDU=NRA, LDV=NCA)
      REAL       A(LDA,NCA), U(LDU,NRA), V(LDV,NCA), S(NCA)
!
!                                 Set values for A
!
!                                 A = ( 1  2  1  4 )
!                                     ( 3  2  1  3 )
!                                     ( 4  3  1  4 )
!                                     ( 2  1  3  1 )
!                                     ( 1  5  2  2 )
!                                     ( 1  2  2  3 )
!
      DATA A/1., 3., 4., 2., 1., 1., 2., 2., 3., 1., 5., 2., 3*1., &
             3., 2., 2., 4., 3., 4., 1., 2., 3./
!
!                                 Compute all singular vectors
      IPATH = 11
      TOL   = AMACH(4)
      TOL   = 10.*TOL
      CALL LSVRR (A, IPATH, S, TOL=TOL, IRANK=IRANK, U=U, V=V)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT, *) 'IRANK = ', IRANK
      CALL WRRRN ('U', U, NRA, NCA)
      CALL WRRRN ('S', S, 1, NCA, 1)
      CALL WRRRN ('V', V)
!
      END
Output
 IRANK =   4

                  U
         1        2        3        4
 1  -0.3805   0.1197   0.4391  -0.5654
 2  -0.4038   0.3451  -0.0566   0.2148
 3  -0.5451   0.4293   0.0514   0.4321
 4  -0.2648  -0.0683  -0.8839  -0.2153
 5  -0.4463  -0.8168   0.1419   0.3213
 6  -0.3546  -0.1021  -0.0043  -0.5458

              S
      1       2       3       4
  11.49    3.27    2.65    2.09

                  V
         1        2        3        4
 1  -0.4443   0.5555  -0.4354   0.5518
 2  -0.5581  -0.6543   0.2775   0.4283
 3  -0.3244  -0.3514  -0.7321  -0.4851
 4  -0.6212   0.3739   0.4444  -0.5261
ScaLAPACK Example
The previous example is repeated here as a distributed example. This example computes the singular value
decomposition of a 6 × 4 matrix A. The matrices U and V containing the left and right singular vectors,
respectively, and the diagonal of S, containing singular values, are printed. On some systems, the signs of
some of the columns of U and V may be reversed.
USE LSVRR_INT
USE WRRRN_INT
USE AMACH_INT
USE UMACH_INT
USE MPI_SETUP_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE 'mpif.h'
!                                 Declare variables
INTEGER :: DESCA(9), DESCU(9), DESCV(9), MXLDV, &
MXCOLV, NSZ, NSZP1, MXLDU, MXCOLU
INTEGER :: INFO, MXCOL, MXLDA, IPATH, IRANK, NOUT
REAL :: TOL
REAL, ALLOCATABLE :: A(:,:),U(:,:), V(:,:), S(:)
REAL, ALLOCATABLE :: A0(:,:), U0(:,:), V0(:,:)
INTEGER, PARAMETER :: NRA=6, NCA=4
!
NSZ = MIN(NRA,NCA)
NSZP1 = MIN(NRA+1,NCA)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(NRA,NCA), U(NRA,NSZ), V(NCA,NSZ))
!                                 Set values for A
          A(1,:) = (/ 1.0, 2.0, 1.0, 4.0/)
          A(2,:) = (/ 3.0, 2.0, 1.0, 3.0/)
          A(3,:) = (/ 4.0, 3.0, 1.0, 4.0/)
          A(4,:) = (/ 2.0, 1.0, 3.0, 1.0/)
          A(5,:) = (/ 1.0, 5.0, 2.0, 2.0/)
          A(6,:) = (/ 1.0, 2.0, 2.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 MXCOL, MXLDU, MXCOLU, MXLDV, AND MXCOLV
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
      CALL SCALAPACK_GETDIM(NRA, NSZ, MP_MB, MP_NB, MXLDU, MXCOLU)
      CALL SCALAPACK_GETDIM(NCA, NSZ, MP_MB, MP_NB, MXLDV, MXCOLV)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
           MAX(1,MXLDA), INFO)
      CALL DESCINIT(DESCU, NRA, NSZ, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
           MAX(1,MXLDU), INFO)
      CALL DESCINIT(DESCV, NCA, NSZ, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
           MAX(1,MXLDV), INFO)
!                                 Allocate space for the local arrays and
!                                 array S
IF (MXLDA .EQ. 0 .OR. MXCOL .EQ. 0) THEN
ALLOCATE (A0(1,1))
ELSE
ALLOCATE (A0(MXLDA,MXCOL))
END IF
IF (MXLDU .EQ. 0 .OR. MXCOLU .EQ. 0) THEN
ALLOCATE (U0(1,1))
ELSE
ALLOCATE (U0(MXLDU,MXCOLU))
END IF
IF (MXLDV .EQ. 0 .OR. MXCOLV .EQ. 0) THEN
ALLOCATE (V0(1,1))
ELSE
ALLOCATE (V0(MXLDV,MXCOLV))
END IF
ALLOCATE(S(NSZP1))
!                                 Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Compute all singular vectors
      IPATH = 11
      TOL = AMACH(4)
      TOL = 10.0 * TOL
      CALL LSVRR (A0, IPATH, S, TOL=TOL, IRANK=IRANK, U=U0, V=V0)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(U0, DESCU, U)
      CALL SCALAPACK_UNMAP(V0, DESCV, V)
!                                 Print results.
!                                 Only Rank=0 has the singular vectors.
      IF(MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT, *) 'IRANK = ', IRANK
         CALL WRRRN ('U', U)
         CALL WRRRN ('S', S, 1, NSZ, 1)
         CALL WRRRN ('V', V)
      ENDIF
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
 IRANK =   4

                  U
         1        2        3        4
 1  -0.3805  -0.1197  -0.4391   0.5654
 2  -0.4038  -0.3451   0.0566  -0.2148
 3  -0.5451  -0.4293  -0.0514  -0.4321
 4  -0.2648   0.0683   0.8839   0.2153
 5  -0.4463   0.8168  -0.1419  -0.3213
 6  -0.3546   0.1021   0.0043   0.5458

              S
      1       2       3       4
  11.49    3.27    2.65    2.09

                  V
         1        2        3        4
 1  -0.4443  -0.5555   0.4354  -0.5518
 2  -0.5581   0.6543  -0.2775  -0.4283
 3  -0.3244   0.3514   0.7321   0.4851
 4  -0.6212  -0.3739  -0.4444   0.5261
LSVCR
Computes the singular value decomposition of a complex matrix.
Required Arguments
A — Complex NRA by NCA matrix whose singular value decomposition is to be computed. (Input)
IPATH — Integer flag used to control the computation of the singular vectors. (Input)
IPATH has the decimal expansion IJ such that:
I=0 means do not compute the left singular vectors.
I=1 means return the NRA left singular vectors in U.
I=2 means return only the min(NRA, NCA) left singular vectors in U.
J=0 means do not compute the right singular vectors.
J=1 means return the right singular vectors in V.
For example, IPATH = 20 means I = 2 and J = 0.
S — Complex vector of length min(NRA + 1, NCA) containing the singular values of A in descending order
of magnitude in the first min(NRA, NCA) positions. (Output)
Optional Arguments
NRA — Number of rows in the matrix A. (Input)
Default: NRA = size (A,1).
NCA — Number of columns in the matrix A. (Input)
Default: NCA = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
TOL — Real scalar containing the tolerance used to determine when a singular value is negligible. (Input)
If TOL is positive, then a singular value SI is considered negligible if SI ≤ TOL. If TOL is negative,
then a singular value SI is considered negligible if SI ≤ ∣TOL∣*(Infinity norm of A). In this case ǀTOLǀ
should generally contain an estimate of the level of relative error in the data.
Default: TOL = 1.0e-5 for single precision and 1.0d-10 for double precision.
IRANK — Integer scalar containing an estimate of the rank of A. (Output)
U — Complex matrix, NRA by NRA if I = 1, or NRA by min(NRA, NCA) if I = 2, containing the left singular
vectors of A. (Output)
U will not be referenced if I is equal to zero. If NRA ≤ NCA or IPATH = 2, then U can share the same
storage locations as A.
LDU — Leading dimension of U exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDU = size (U,1).
V — Complex NCA by NCA matrix containing the right singular vectors of A. (Output)
V will not be referenced if J is equal to zero. If NCA is less than or equal to NRA, then V can share the
same storage locations as A; however U and V cannot both coincide with A simultaneously.
LDV — Leading dimension of V exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDV = size (V,1).
FORTRAN 90 Interface
Generic:
CALL LSVCR (A, IPATH, S [, …])
Specific:
The specific interface names are S_LSVCR and D_LSVCR.
FORTRAN 77 Interface
Single:
CALL LSVCR (NRA, NCA, A, LDA, IPATH, TOL, IRANK, S, U, LDU, V, LDV)
Double:
The double precision name is DLSVCR.
Description
The underlying code of routine LSVCR is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Let n = NRA (the number of rows in A) and let p = NCA (the number of columns in A). For any n × p complex matrix A
there exists an n × n unitary matrix U and a p × p unitary matrix V such that

UHAV = [∑, 0]     if n ≤ p
UHAV = [∑, 0]T    if n ≥ p

(UH denotes the conjugate transpose of U)
where ∑ = diag(σ1, …, σm), and m = min(n, p). The scalars σ1 ≥ σ2 ≥ … ≥ 0 are called the singular values of
A. The columns of U are called the left singular vectors of A. The columns of V are called the right singular vectors of A.
The estimated rank of A is the number of σk which are larger than a tolerance η. If τ is the parameter TOL in
the program, then
η = τ            if τ > 0
η = |τ| ∥A∥∞     if τ < 0

Comments
1.
Workspace may be explicitly provided, if desired, by use of L2VCR/DL2VCR. The reference is
CALL L2VCR (NRA, NCA, A, LDA, IPATH, TOL, IRANK, S, U, LDU, V, LDV,
ACOPY, WK)
The additional arguments are as follows:
ACOPY — Complex work array of length NRA * NCA for the matrix A. If A is not needed, then A
and ACOPY can share the same storage locations.
WK — Complex work vector of length NRA + NCA + max(NRA, NCA) - 1.
2.
Informational error
Type
Code
Description
4
1
Convergence cannot be achieved for all the singular values and their corresponding singular vectors.
3.
When NRA is much greater than NCA, it might not be reasonable to store the whole matrix U. In this
case IPATH with I = 2 allows a singular value factorization of A to be computed in which only the first
NCA columns of U are computed, and in many applications those are all that are needed.
4.
Integer Options with Chapter 11 Options Manager
16
This option uses four values to solve memory bank conflict (access inefficiency) problems. In
routine L2VCR the leading dimension of ACOPY is increased by IVAL(3) when N is a multiple of
IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2),
respectively, in LSVCR. Additional memory allocation for ACOPY and option value restoration
are done automatically in LSVCR. Users directly calling L2VCR can allocate additional space for
ACOPY and set IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause inefficiencies. There is no requirement that users change existing applications that use LSVCR or L2VCR.
Default values for the option are IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed. Routine
LSVCR temporarily replaces IVAL(2) by IVAL(1). The routine L2CCG computes the condition
number if IVAL(2) = 2. Otherwise L2CCG skips this computation. LSVCR restores the option.
Default values for the option are IVAL(*) = 1, 2.
Example
This example computes the singular value decomposition of a 6 × 3 matrix A. The matrices U and V containing the left and right singular vectors, respectively, and the diagonal of ∑, containing singular values, are
printed. On some systems, the signs of some of the columns of U and V may be reversed.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (NRA=6, NCA=3, LDA=NRA, LDU=NRA, LDV=NCA)
      COMPLEX    A(LDA,NCA), U(LDU,NRA), V(LDV,NCA), S(NCA)
!
!                                 Set values for A
!
!                                 A = (  1+2i   3+2i   1-4i )
!                                     (  3-2i   2-4i   1+3i )
!                                     (  4+3i  -2+1i   1+4i )
!                                     (  2-1i   3+0i   3-1i )
!                                     (  1-5i   2-5i   2+2i )
!                                     (  1+2i   4-2i   2-3i )
!
      DATA A/(1.0,2.0), (3.0,-2.0), (4.0,3.0), (2.0,-1.0), (1.0,-5.0), &
             (1.0,2.0), (3.0,2.0), (2.0,-4.0), (-2.0,1.0), (3.0,0.0), &
             (2.0,-5.0), (4.0,-2.0), (1.0,-4.0), (1.0,3.0), (1.0,4.0), &
             (3.0,-1.0), (2.0,2.0), (2.0,-3.0)/
!
!                                 Compute all singular vectors
      IPATH = 11
      TOL   = AMACH(4)
      TOL   = 10. * TOL
      CALL LSVCR (A, IPATH, S, TOL=TOL, IRANK=IRANK, U=U, V=V)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT, *) 'IRANK = ', IRANK
      CALL WRCRN ('U', U, NRA, NCA)
      CALL WRCRN ('S', S, 1, NCA, 1)
      CALL WRCRN ('V', V)
!
      END
Output
 IRANK =   3

                                  U
                   1                    2                    3
 1  ( 0.1968, 0.2186)   ( 0.5011, 0.0217)   (-0.2007,-0.1003)
 2  ( 0.3443,-0.3542)   (-0.2933, 0.0248)   ( 0.1155,-0.2338)
 3  ( 0.1457, 0.2307)   (-0.5424, 0.1381)   (-0.4361,-0.4407)
 4  ( 0.3016,-0.0844)   ( 0.2157, 0.2659)   (-0.0523,-0.0894)
 5  ( 0.2283,-0.6008)   (-0.1325, 0.1433)   ( 0.3152,-0.0090)
 6  ( 0.2876,-0.0350)   ( 0.4377,-0.0400)   ( 0.0458,-0.6205)

                          S
                1                2                3
 (  11.77,   0.00)  (   9.30,   0.00)  (   4.99,   0.00)

                                  V
                   1                    2                    3
 1  ( 0.6616, 0.0000)   (-0.2651, 0.0000)   (-0.7014, 0.0000)
 2  ( 0.7355, 0.0379)   ( 0.3850,-0.0707)   ( 0.5482, 0.0624)
 3  ( 0.0507,-0.1317)   ( 0.1724, 0.8642)   (-0.0173,-0.4509)
LSGRR
Computes the generalized inverse of a real matrix.
Required Arguments
A — NRA by NCA matrix whose generalized inverse is to be computed. (Input)
GINVA — NCA by NRA matrix containing the generalized inverse of A. (Output)
Optional Arguments
NRA — Number of rows in the matrix A. (Input)
Default: NRA = size (A,1).
NCA — Number of columns in the matrix A. (Input)
Default: NCA = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program.
(Input)
Default: LDA = size (A,1).
TOL — Scalar containing the tolerance used to determine when a singular value (from the singular value
decomposition of A) is negligible. (Input)
If TOL is positive, then a singular value σi is considered negligible if σi ≤ TOL. If TOL is negative, then a
singular value σi is considered negligible if σi ≤ ∣TOL∣ * ∥A∥∞. In this case, |TOL| generally contains an
estimate of the level of the relative error in the data.
Default: TOL = 1.0e-5 for single precision and 1.0d-10 for double precision.
IRANK — Scalar containing an estimate of the rank of A. (Output)
LDGINV — Leading dimension of GINVA exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDGINV = size (GINV,1).
FORTRAN 90 Interface
Generic:
CALL LSGRR (A, GINVA [, …])
Specific:
The specific interface names are S_LSGRR and D_LSGRR.
FORTRAN 77 Interface
Single:
CALL LSGRR (NRA, NCA, A, LDA, TOL, IRANK, GINVA, LDGINV)
Double:
The double precision name is DLSGRR.
ScaLAPACK Interface
Generic:
CALL LSGRR (A0, GINVA0 [, …])
Specific:
The specific interface names are S_LSGRR and D_LSGRR.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
Let k = IRANK, the rank of A; let n = NRA, the number of rows in A; let p = NCA, the number of columns in A;
and let
A† = GINVA
be the generalized inverse of A.
To compute the Moore-Penrose generalized inverse, the routine LSVRR is first used to compute the singular
value decomposition of A. A singular value decomposition of A consists of an n × n orthogonal matrix U, a
p × p orthogonal matrix V and a diagonal matrix ∑ = diag(σ1,…, σm), m = min(n, p), such that UT AV = [∑, 0]
if n ≤ p and UT AV = [∑, 0]T if n ≥ p. Only the first p columns of U are computed. The rank k is estimated by
counting the number of nonnegligible σi.
The matrices U and V can be partitioned as U = (U1, U2) and V = (V1, V2) where both U1 and V1 are k × k
matrices. Let ∑1 = diag(σ1, …, σk). The Moore-Penrose generalized inverse of A is

A† = V1∑1−1U1T
The underlying code of routine LSGRR is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
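One common use of the generalized inverse is to form the minimum-norm least-squares solution x = A†b. The sketch below is our own illustration (the matrix A is taken from the example further down; the right-hand side b is arbitrary sample data):

      USE IMSL_LIBRARIES
      INTEGER, PARAMETER :: NRA=3, NCA=2
      REAL A(NRA,NCA), GINV(NCA,NRA), B(NRA), X(NCA)
      DATA A/1., 1., 100., 0., 1., -50./
      DATA B/1., 2., 3./
!                                 Moore-Penrose generalized inverse
      CALL LSGRR (A, GINV)
!                                 Minimum-norm least-squares solution of A x ~= b
      X = MATMUL(GINV, B)
      CALL WRRRN ('X', X, 1, NCA, 1)
      END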
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2GRR/DL2GRR. The reference is:
CALL L2GRR (NRA, NCA, A, LDA, TOL, IRANK, GINVA, LDGINV, WKA, WK)
The additional arguments are as follows:
WKA — Work vector of length NRA * NCA used as workspace for the matrix A. If A is not needed,
WKA and A can share the same storage locations.
WK — Work vector of length LWK where LWK is equal to
NRA2 + NCA2 + min(NRA + 1, NCA) + NRA + NCA + max(NRA, NCA) − 2.
2.
Informational error
Type
Code
Description
4
1
Convergence cannot be achieved for all the singular values and their corresponding singular vectors.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains
the matrix for which the generalized inverse is to be computed. (Input)
GINVA0 — MXLDG by MXCOLG local matrix containing the local portions of the distributed
GINVA. GINVA contains the generalized inverse of matrix A. (Output)
matrix
All other arguments are global and are the same as described for the standard version of the routine. In the
argument descriptions above, MXLDA, MXCOL, MXLDG, and MXCOLG can be obtained through a call to
SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Chapter 11, ”Utilities”) has been
made. See the ScaLAPACK Example below.
Examples
Example
This example computes the generalized inverse of a 3 × 2 matrix A. The rank k = IRANK and the inverse
A† = GINV
are printed.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (NRA=3, NCA=2, LDA=NRA, LDGINV=NCA)
      REAL       A(LDA,NCA), GINV(LDGINV,NRA)
!
!                                 Set values for A
!
!                                 A = (   1    0 )
!                                     (   1    1 )
!                                     ( 100  -50 )
!
      DATA A/1., 1., 100., 0., 1., -50./
!
!                                 Compute generalized inverse
      TOL = AMACH(4)
      TOL = 10.*TOL
      CALL LSGRR (A, GINV, TOL=TOL, IRANK=IRANK)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT, *) 'IRANK = ', IRANK
      CALL WRRRN ('GINV', GINV)
!
      END
Output
 IRANK =   2

                GINV
         1        2         3
 1  0.1000   0.3000    0.0060
 2  0.2000   0.6000   -0.0080
ScaLAPACK Example
This example computes the generalized inverse of a 6 × 4 matrix A as a distributed example. The rank
k = IRANK and the inverse
A† = GINVA
are printed.
      USE MPI_SETUP_INT
      USE IMSL_LIBRARIES
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER            IRANK, LDA, NCA, NRA, DESCA(9), DESCG(9), &
                         LDGINV, MXLDG, MXCOLG, NOUT
      INTEGER            INFO, MXCOL, MXLDA
      REAL               TOL, AMACH
      REAL, ALLOCATABLE ::  A(:,:), GINVA(:,:)
      REAL, ALLOCATABLE ::  A0(:,:), GINVA0(:,:)
      PARAMETER (NRA=6, NCA=4, LDA=NRA, LDGINV=NCA)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,NCA), GINVA(NCA,NRA))
!                                 Set values for A
          A(1,:) = (/ 1.0, 2.0, 1.0, 4.0/)
          A(2,:) = (/ 3.0, 2.0, 1.0, 3.0/)
          A(3,:) = (/ 4.0, 3.0, 1.0, 4.0/)
          A(4,:) = (/ 2.0, 1.0, 3.0, 1.0/)
          A(5,:) = (/ 1.0, 5.0, 2.0, 2.0/)
          A(6,:) = (/ 1.0, 2.0, 2.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 MXCOL, MXLDG, and MXCOLG
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
      CALL SCALAPACK_GETDIM(NCA, NRA, MP_NB, MP_MB, MXLDG, MXCOLG)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, &
           INFO)
      CALL DESCINIT(DESCG, NCA, NRA, MP_NB, MP_MB, 0, 0, MP_ICTXT, MXLDG, &
           INFO)
!                                 Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), GINVA0(MXLDG,MXCOLG))
!                                 Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Compute the generalized inverse
      TOL = AMACH(4)
      TOL = 10. * TOL
      CALL LSGRR (A0, GINVA0, TOL=TOL, IRANK=IRANK)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(GINVA0, DESCG, GINVA)
!                                 Print results.
!                                 Only Rank=0 has the solution, GINVA
      IF(MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT, *) 'IRANK = ', IRANK
         CALL WRRRN ('GINVA', GINVA)
      ENDIF
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Chapter 2: Eigensystem Analysis
Routines
2.1.    Eigenvalue Decomposition
2.1.1   Computes the eigenvalues of a self-adjoint matrix . . . . . . . . . . . LIN_EIG_SELF    581
2.1.2   Computes the eigenvalues of an n × n matrix . . . . . . . . . . . . . . LIN_EIG_GEN     588
2.1.3   Computes the generalized eigenvalues of an n × n matrix
        pencil, Av = λBv . . . . . . . . . . . . . . . . . . . . . . . . . . . LIN_GEIG_GEN    597

2.2.    Eigenvalues and (Optionally) Eigenvectors of Ax = λx
2.2.1   Real General Problem Ax = λx
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . EVLRG           605
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . EVCRG           608
        Performance index . . . . . . . . . . . . . . . . . . . . . . . . . . . EPIRG           611
2.2.2   Complex General Problem Ax = λx
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . EVLCG           613
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . EVCCG           616
        Performance index . . . . . . . . . . . . . . . . . . . . . . . . . . . EPICG           619
2.2.3   Real Symmetric Problem Ax = λx
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . EVLSF           621
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . EVCSF           623
        Extreme eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . EVASF           626
        Extreme eigenvalues and their eigenvectors . . . . . . . . . . . . . . . EVESF           628
        Eigenvalues in an interval . . . . . . . . . . . . . . . . . . . . . . . EVBSF           631
        Eigenvalues in an interval and their eigenvectors . . . . . . . . . . . EVFSF           634
        Performance index . . . . . . . . . . . . . . . . . . . . . . . . . . . EPISF           637
2.2.4   Real Band Symmetric Matrices in Band Storage Mode
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . EVLSB           639
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . EVCSB           641
        Extreme eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . EVASB           644
        Extreme eigenvalues and their eigenvectors . . . . . . . . . . . . . . . EVESB           647
        Eigenvalues in an interval . . . . . . . . . . . . . . . . . . . . . . . EVBSB           650
        Eigenvalues in an interval and their eigenvectors . . . . . . . . . . . EVFSB           653
        Performance index . . . . . . . . . . . . . . . . . . . . . . . . . . . EPISB           656
2.2.5   Complex Hermitian Matrices
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . EVLHF           658
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . EVCHF           661
        Extreme eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . EVAHF           664
        Extreme eigenvalues and their eigenvectors . . . . . . . . . . . . . . . EVEHF           667
        Eigenvalues in an interval . . . . . . . . . . . . . . . . . . . . . . . EVBHF           670
        Eigenvalues in an interval and their eigenvectors . . . . . . . . . . . EVFHF           673
        Performance index . . . . . . . . . . . . . . . . . . . . . . . . . . . EPIHF           676
2.2.6   Real Upper Hessenberg Matrices
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . EVLRH           678
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . EVCRH           680
2.2.7   Complex Upper Hessenberg Matrices
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . EVLCH           683
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . EVCCH           685

2.3.    Eigenvalues and (Optionally) Eigenvectors of Ax = λBx
2.3.1   Real General Problem Ax = λBx
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . GVLRG           688
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . GVCRG           691
        Performance index . . . . . . . . . . . . . . . . . . . . . . . . . . . GPIRG           695
2.3.2   Complex General Problem Ax = λBx
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . GVLCG           697
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . GVCCG           700
        Performance index . . . . . . . . . . . . . . . . . . . . . . . . . . . GPICG           703
2.3.3   Real Symmetric Problem Ax = λBx
        All eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . GVLSP           705
        All eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . GVCSP           708
        Performance index . . . . . . . . . . . . . . . . . . . . . . . . . . . GPISP           711

2.4.    Eigenvalues and Eigenvectors Computed with ARPACK
        Fortran 2003 Usage . . . . . . . . . . . . . . . . . . . . . . . . . . .                 713
        The Base Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ARPACKBASE      715
        Real Symmetric Problem Ax = λBx . . . . . . . . . . . . . . . . . . . . ARPACK_SYMMETRIC    716
        Real singular value decomposition AV = US . . . . . . . . . . . . . . . ARPACK_SVD      731
        Real General Problem Ax = λBx . . . . . . . . . . . . . . . . . . . . . ARPACK_NONSYMMETRIC 739
        Complex General Problem Ax = λBx . . . . . . . . . . . . . . . . . . . . ARPACK_COMPLEX  747
Usage Notes
This chapter includes routines for linear eigensystem analysis. Many of these are for matrices with special
properties. Some routines compute just a portion of the eigensystem. Use of the appropriate routine can substantially reduce computing time and storage requirements compared to computing a full eigensystem for a
general complex matrix.
An ordinary linear eigensystem problem is represented by the equation Ax = λx where A denotes an n × n
matrix. The value λ is an eigenvalue and x ≠ 0 is the corresponding eigenvector. The eigenvector is determined
up to a scalar factor. In all routines, we have chosen this factor so that x has Euclidean length one,
and the component of x of smallest index and largest magnitude is positive. In case x is a complex vector, this
largest component is real and positive.
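The normalization just described can be written out explicitly. The following subroutine is our own illustration (not an IMSL routine) of rescaling a computed complex eigenvector to this convention:

      SUBROUTINE NORMALIZE_EVEC (N, X)
      INTEGER, INTENT(IN)    :: N
      COMPLEX, INTENT(INOUT) :: X(N)
      INTEGER K
!                                 Smallest index of the largest-magnitude component
      K = MAXLOC(ABS(X), DIM=1)
!                                 Euclidean length one
      X = X / SQRT(SUM(ABS(X)**2))
!                                 Make component K real and positive
      X = X * (CONJG(X(K)) / ABS(X(K)))
      END SUBROUTINE NORMALIZE_EVEC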
Similar comments hold for the use of the remaining Level 1 routines in the following tables in those cases
where the second character of the Level 2 routine name is no longer the character "2".
A generalized linear eigensystem problem is represented by Ax = λBx where A and B are n × n matrices. The
value λ is an eigenvalue, and x is the corresponding eigenvector. The eigenvectors are normalized in the
same manner as for the ordinary eigensystem problem. The linear eigensystem routines have names that
begin with the letter “E”. The generalized linear eigensystem routines have names that begin with the letter
“G”. This prefix is followed by a two-letter code for the type of analysis that is performed. That is followed by
another two-letter suffix for the form of the coefficient matrix. The following tables summarize the names of
the eigensystem routines.
Symmetric and Hermitian Eigensystems

                                              Symmetric   Symmetric   Hermitian
                                              Full        Band        Full
All eigenvalues                               EVLSF       EVLSB       EVLHF
All eigenvalues and eigenvectors              EVCSF       EVCSB       EVCHF
Extreme eigenvalues                           EVASF       EVASB       EVAHF
Extreme eigenvalues and eigenvectors          EVESF       EVESB       EVEHF
Eigenvalues in an interval                    EVBSF       EVBSB       EVBHF
Eigenvalues and eigenvectors in an interval   EVFSF       EVFSB       EVFHF
Performance index                             EPISF       EPISB       EPIHF

General Eigensystems

                                      Real      Complex   Real         Complex
                                      General   General   Hessenberg   Hessenberg
All eigenvalues                       EVLRG     EVLCG     EVLRH        EVLCH
All eigenvalues and eigenvectors      EVCRG     EVCCG     EVCRH        EVCCH
Performance index                     EPIRG     EPICG     EPIRG        EPICG

Generalized Eigensystems Ax = λBx

                                      Real      Complex   A Symmetric
                                      General   General   B Positive Definite
All eigenvalues                       GVLRG     GVLCG     GVLSP
All eigenvalues and eigenvectors      GVCRG     GVCCG     GVCSP
Performance index                     GPIRG     GPICG     GPISP
Error Analysis and Accuracy
The remarks in this section are for the ordinary eigenvalue problem. Except in special cases, routines will not
return the exact eigenvalue-eigenvector pair for the ordinary eigenvalue problem Ax = λx. The computed
pair
(λ̃, x̃)

is an exact eigenvector-eigenvalue pair for a "nearby" matrix A + E. Information about E is known only in
terms of bounds of the form ∥E∥2 ≤ f(n)∥A∥2ɛ. The value of f(n) depends on the algorithm but is typically a
small fractional power of n. The parameter ɛ is the machine precision. By a theorem due to Bauer and Fike
(see Golub and Van Loan [1989, page 342]),

min { |λ̃ − λ| : λ in σ(A) } ≤ κ(X)∥E∥2

where σ(A) is the set of all eigenvalues of A (called the spectrum of A), X is the matrix of eigenvectors, ∥·∥2 is
the 2-norm, and κ(X) is the condition number of X defined as κ(X) = ∥X∥2∥X−1∥2. If A is a real symmetric or
complex Hermitian matrix, then its eigenvector matrix X is respectively orthogonal or unitary. For these
matrices, κ(X) = 1.

The eigenvalues λ̃j and eigenvectors x̃j computed by EVC** can be checked by computing their performance
index τ using EPI**. The performance index is defined by Smith et al. (1976, pages 124−126) to be

τ = max(1 ≤ j ≤ n)  ∥Ax̃j − λ̃jx̃j∥1 / (10 n ɛ ∥A∥1 ∥x̃j∥1)

No significance should be attached to the factor of 10 used in the denominator. For a real vector x, the symbol
∥x∥1 represents the usual 1-norm of x. For a complex vector x, the symbol ∥x∥1 is defined by

∥x∥1 = Σ(k = 1 to n) ( |ℜ xk| + |ℑ xk| )

The performance index τ is related to the error analysis because

∥Ex̃j∥1 = ∥Ax̃j − λ̃jx̃j∥1

where E is the "nearby" matrix discussed above.

While the exact value of τ is machine and precision dependent, the performance of an eigensystem analysis
routine is defined as excellent if τ < 1, good if 1 ≤ τ ≤ 100, and poor if τ > 100. This is an arbitrary definition, but large values of τ can serve as a warning that there is a blunder in the calculation. There are also
similar routines GPI** to compute the performance index for generalized eigenvalue problems.
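To make the definition concrete, here is a small self-contained sketch (ours, not an IMSL example) that evaluates τ directly for a 2 × 2 matrix whose eigenpairs are known exactly; in practice the EPI** routines perform this computation for you.

      USE IMSL_LIBRARIES
      INTEGER, PARAMETER :: N=2
      REAL A(N,N), EVEC(N,N), EVAL(N), TAU, ANORM1, RESID
      INTEGER J
      A    = RESHAPE((/2., 1., 1., 2./), (/N,N/))
      EVAL = (/3., 1./)
      EVEC = RESHAPE((/1., 1., 1., -1./), (/N,N/))/SQRT(2.)
!                                 1-norm of A (maximum column sum)
      ANORM1 = MAXVAL(SUM(ABS(A), DIM=1))
      TAU = 0.0
      DO J = 1, N
         RESID = SUM(ABS(MATMUL(A, EVEC(:,J)) - EVAL(J)*EVEC(:,J)))
         TAU = MAX(TAU, RESID/(10.0*N*AMACH(4)*ANORM1*SUM(ABS(EVEC(:,J)))))
      END DO
      PRINT *, 'performance index =', TAU
      END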
If the condition number κ(X) of the eigenvector matrix X is large, there can be large errors in the eigenvalues
even if τ is small. In particular, it is often difficult to recognize near multiple eigenvalues or unstable mathematical problems from numerical results. This facet of the eigenvalue problem is difficult to understand: A
user often asks for the accuracy of an individual eigenvalue. This can be answered approximately by computing the condition number of an individual eigenvalue. See Golub and Van Loan (1989, pages 344-345). For
matrices A such that the computed array of normalized eigenvectors X is invertible, the condition number of
λj is κj ≡ the Euclidean length of row j of the inverse matrix X-1. Users can choose to compute this matrix
with routine LINCG, see Chapter 1, “Linear Systems”. An approximate bound for the accuracy of a computed
eigenvalue is then given by κjɛ∥A∥. To compute an approximate bound for the relative accuracy of an eigenvalue, divide this bound by ∣λj∣.
Generalized Eigenvalue Problems
The generalized eigenvalue problem Ax = λBx is often difficult for users to analyze because it is frequently
ill-conditioned. There are occasionally changes of variables that can be performed on the given problem to
ease this ill-conditioning. Suppose that B is singular but A is nonsingular. Define the reciprocal µ = λ-1. Then,
the roles of A and B are interchanged so that the reformulated problem Bx = µAx is solved. Those generalized
eigenvalues µj = 0 correspond to eigenvalues λj = ∞. The remaining
eigenvalues are λj = μj−1.
The generalized eigenvectors for λj correspond to those for µj. Other reformulations can be made: If B is
nonsingular, the user can solve the ordinary eigenvalue problem Cx ≡ B⁻¹Ax = λx. This is not recommended
as a computational algorithm for two reasons. First, it is generally less efficient than solving the generalized
problem directly. Second, the matrix C will be subject to perturbations due to ill-conditioning and rounding
errors when computing B⁻¹A. Computing the condition numbers of the eigenvalues for C may, however, be
helpful for analyzing the accuracy of results for the generalized problem.
There is another method that users can consider to reduce the generalized problem to an alternate ordinary
problem. This technique is based on first computing a matrix decomposition B = PQ, where both P and Q are
matrices that are “simple” to invert. Then, the given generalized problem is equivalent to the ordinary eigenvalue problem Fy = λy. The matrix F ≡ P⁻¹AQ⁻¹. The unnormalized eigenvectors of the generalized problem
are given by x = Q⁻¹y. An example of this reformulation is used in the case where A and B are real and symmetric with B positive definite. The IMSL routines GVLSP and GVCSP use P = Rᵀ and Q = R, where R is an
upper triangular matrix obtained from a Cholesky decomposition, B = RᵀR. The matrix F = R⁻ᵀAR⁻¹ is symmetric and real. Computation of the eigenvalue-eigenvector expansion for F is based on routine EVCSF.
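The equivalence stated above can be verified in one line (this is only a restatement of the text, using the same symbols). With y ≡ Qx:
\[
Ax = \lambda Bx = \lambda P Q x = \lambda P y
\quad\Longrightarrow\quad
A Q^{-1} y = \lambda P y
\quad\Longrightarrow\quad
\left(P^{-1} A Q^{-1}\right) y = \lambda y ,
\]
so F ≡ P⁻¹AQ⁻¹ and x = Q⁻¹y; with the Cholesky factors P = Rᵀ and Q = R, this reduces to the symmetric matrix F = R⁻ᵀAR⁻¹ used by GVLSP and GVCSP.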
Using ARPACK for Ordinary and Generalized Eigenvalue
Problems
ARPACK consists of a set of Fortran 77 subroutines which use the Arnoldi method (Sorensen, 1992) to solve
eigenvalue problems. ARPACK is well suited for large structured eigenvalue problems, where structured
means that a matrix-vector product w ← Av requires O(n) rather than the usual O(n²) floating point
operations.
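As an illustration of this point (not part of the product), the following sketch forms the product w ← Av for a tridiagonal operator stored in three vectors; the cost is O(n), and the n × n matrix is never assembled. All names and the storage convention are chosen for the example only.
! Hedged sketch: an O(n) matrix-vector product for a random tridiagonal
! operator.  dl(i) = A(i,i-1), dd(i) = A(i,i), du(i) = A(i,i+1);
! dl(1) and du(n) are unused.
use rand_gen_int
implicit none
integer i
integer, parameter :: n=1000
real(kind(1d0)) :: dl(n), dd(n), du(n), v(n), w(n), t(4*n)
call rand_gen(t)
dl = t(1:n); dd = t(n+1:2*n); du = t(2*n+1:3*n); v = t(3*n+1:4*n)
! Structured product w = A*v, O(n) floating point operations.
w(1) = dd(1)*v(1) + du(1)*v(2)
do i=2, n-1
   w(i) = dl(i)*v(i-1) + dd(i)*v(i) + du(i)*v(i+1)
end do
w(n) = dl(n)*v(n-1) + dd(n)*v(n)
write (*,*) 'First component of the structured product:', w(1)
end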
The suite of features that we have implemented from ARPACK is described in the work of Lehoucq,
Sorensen and Yang, ARPACK Users’ Guide, SIAM Publications, (1998). Users will find access to this Guide
helpful. Due to the size of the package, we provide for the use of double precision real and complex arithmetic only.
The ARPACK computational algorithm computes a partial set of approximate eigenvalues or singular values
for various classes of problems. This includes the ordinary problem, Ax = λx, the generalized problem,
Ax = λBx, and the singular value decomposition, A = USVᵀ.
The original API for ARPACK is a Reverse Communication Interface. This interface can be used as illustrated
in the Guide. However, we provide a Fortran 2003 interface to ARPACK that will be preferred by some users.
This is a forward communication interface based on user-written functions for the matrix-vector products or
linear equation solving steps required by the algorithms in ARPACK. It is not necessary that the linear
operators be expressed as dense or sparse matrices. That is permitted, but for some problems the best
approach is to form the product of the operator with a vector directly.
The forward communication interface includes an argument of a user-extended derived type or class object.
The intent of this argument is that an extended type provides access to threaded user data or other
required information, including procedure pointers, for use in the user-written product functions. It also
hides information that can often be ignored on first use.
LIN_EIG_SELF
Computes the eigenvalues of a self-adjoint (i.e. real symmetric or complex Hermitian) matrix, A. Optionally,
the eigenvectors can be computed. This gives the decomposition A = VDVᵀ, where V is an n × n orthogonal
matrix and D is a real diagonal matrix.
Required Arguments
A — Array of size n × n containing the matrix. (Input [/Output])
D — Array of size n containing the eigenvalues. The values are in order of decreasing absolute value.
(Output)
Optional Arguments
NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = size(A, 1)
v = v(:,:) (Output)
Array of the same type and kind as A(1:n, 1:n). It contains the n × n orthogonal matrix V.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for passing optional data to the
routine. The options are as follows:
Packaged Options for LIN_EIG_SELF

Option Prefix = ?     Option Name                       Option Value
s_, d_, c_, z_        Lin_eig_self_set_small            1
s_, d_, c_, z_        Lin_eig_self_overwrite_input      2
s_, d_, c_, z_        Lin_eig_self_scan_for_NaN         3
s_, d_, c_, z_        Lin_eig_self_use_QR               4
s_, d_, c_, z_        Lin_eig_self_skip_Orth            5
s_, d_, c_, z_        Lin_eig_self_use_Gauss_elim       6
s_, d_, c_, z_        Lin_eig_self_set_perf_ratio       7
iopt(IO) = ?_options(?_lin_eig_self_set_small, Small)
If a denominator term is smaller in magnitude than the value Small, it is replaced by Small.
Default: the smallest number that can be reciprocated safely
iopt(IO) = ?_options(?_lin_eig_self_overwrite_input, ?_dummy)
Do not save the input array A(:, :).
iopt(IO) = ?_options(?_lin_eig_self_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) == .true.
See the isNaN() function, Chapter 10.
Default: The array is not scanned for NaNs.
iopt(IO) = ?_options(?_lin_eig_use_QR, ?_dummy)
Uses a rational QR algorithm to compute eigenvalues. Accumulate the eigenvectors using this algorithm.
Default: the eigenvectors computed using inverse iteration
iopt(IO) = ?_options(?_lin_eig_skip_Orth, ?_dummy)
If the eigenvalues are computed using inverse iteration, skips the final orthogonalization of the vectors. This will result in a more efficient computation but the eigenvectors, while a complete set, may be
far from orthogonal.
Default: the eigenvectors are normally orthogonalized if obtained using inverse iteration.
iopt(IO) = ?_options(?_lin_eig_use_Gauss_elim, ?_dummy)
If the eigenvalues are computed using inverse iteration, uses standard elimination with partial pivoting to solve the inverse iteration problems.
Default: the eigenvectors computed using cyclic reduction
iopt(IO) = ?_options(?_lin_eig_self_set_perf_ratio, perf_ratio)
Uses residuals for approximate normalized eigenvectors if they have a performance index no larger
than perf_ratio. Otherwise an alternate approach is taken and the eigenvectors are computed again:
Standard elimination is used instead of cyclic reduction, or the standard QR algorithm is used as a
backup procedure to inverse iteration. Larger values of perf_ratio are less likely to cause these exceptions.
Default: perf_ratio = 4
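The fragment below is an illustrative sketch (not one of the product examples) of how the packaged options above are passed in the double precision case. It assumes the derived type d_options and the option constants named in the table are accessible through module lin_eig_self_int, as in the product examples; the second component of each option is a dummy value here.
! Hedged sketch: request a NaN scan and allow the input array to be
! overwritten; a trivial self-adjoint matrix stands in for real data.
use lin_eig_self_int
implicit none
integer, parameter :: n=16
real(kind(1d0)) :: A(n,n), D(n)
type(d_options) :: iopt(2)
A = 0d0
iopt(1) = d_options(d_lin_eig_self_scan_for_NaN, 0d0)
iopt(2) = d_options(d_lin_eig_self_overwrite_input, 0d0)
call lin_eig_self(A, D, iopt=iopt)
write (*,*) 'Largest eigenvalue in magnitude:', D(1)
end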
FORTRAN 90 Interface
Generic:      CALL LIN_EIG_SELF (A, D [,…])
Specific:     The specific interface names are S_LIN_EIG_SELF, D_LIN_EIG_SELF, C_LIN_EIG_SELF, and Z_LIN_EIG_SELF.
Description
Routine LIN_EIG_SELF is an implementation of the QR algorithm for self-adjoint matrices. An orthogonal
similarity reduction of the input matrix to self-adjoint tridiagonal form is performed. Then, the eigenvalue-eigenvector decomposition of a real tridiagonal matrix is calculated. The expansion of the matrix as AV = VD
results from a product of these matrix factors. See Golub and Van Loan (1989, Chapter 8) for details.
Fatal, Terminal, and Warning Error Messages
See the messages.gls file for error messages for LIN_EIG_SELF. These error messages are numbered 81–90;
101–110; 121–129; 141–149.
Examples
Example 1: Computing Eigenvalues
The eigenvalues of a self-adjoint matrix are computed. The matrix A = C + Cᵀ is used, where C is random. The
magnitudes of eigenvalues of A agree with the singular values of A. Also, see operator_ex25, supplied
with the product examples.
use lin_eig_self_int
use lin_sol_svd_int
use rand_gen_int
implicit none
! This is Example 1 for LIN_EIG_SELF.
integer, parameter :: n=64
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)) :: A(n,n), b(n,0), D(n), S(n), x(n,0), y(n*n)
! Generate a random matrix and from it
! a self-adjoint matrix.
call rand_gen(y)
A = reshape(y,(/n,n/))
A = A + transpose(A)
! Compute the eigenvalues of the matrix.
call lin_eig_self(A, D)
! For comparison, compute the singular values.
call lin_sol_svd(A, b, x, nrhs=0, s=S)
! Check the results: Magnitude of eigenvalues should equal
! the singular values.
if (sum(abs(abs(D) - S)) <= &
sqrt(epsilon(one))*S(1)) then
write (*,*) 'Example 1 for LIN_EIG_SELF is correct.'
end if
end
Output
Example 1 for LIN_EIG_SELF is correct.
Example 2: Eigenvalue-Eigenvector Expansion of a Square Matrix
A self-adjoint matrix is generated and the eigenvalues and eigenvectors are computed. Thus,
A = VDVᵀ, where V is orthogonal and D is a real diagonal matrix. The matrix V is obtained using an optional
argument. Also, see operator_ex26, Chapter 10.
use lin_eig_self_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_EIG_SELF.
integer, parameter :: n=8
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)) :: a(n,n), d(n), v_s(n,n), y(n*n)
! Generate a random self-adjoint matrix.
call rand_gen(y)
a = reshape(y,(/n,n/))
a = a + transpose(a)
! Compute the eigenvalues and eigenvectors.
call lin_eig_self(a, d, v=v_s)
! Check the results for small residuals.
if (sum(abs(matmul(a,v_s)-v_s*spread(d,1,n)))/d(1) <= &
sqrt(epsilon(one))) then
write (*,*) 'Example 2 for LIN_EIG_SELF is correct.'
end if
end
Output
Example 2 for LIN_EIG_SELF is correct.
Example 3: Computing a few Eigenvectors with Inverse Iteration
A self-adjoint n × n matrix is generated and the eigenvalues, {di}, are computed. The eigenvectors associated
with the first k of these are computed using the self-adjoint solver, lin_sol_self, and inverse iteration.
With random right-hand sides, these systems are as follows:
(A − diI)vi = bi
The solutions are then orthogonalized as in Hanson et al. (1991) to comprise a partial decomposition
AV = VD where V is an n × k matrix resulting from the orthogonalized {vi} and D is the k × k diagonal matrix
of the distinguished eigenvalues. It is necessary to suppress the error message when the matrix is singular.
Since these singularities are desirable, it is appropriate to ignore the exceptions and not print the message
text. Also, see operator_ex27, supplied with the product examples.
use lin_eig_self_int
use lin_sol_self_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 3 for LIN_EIG_SELF.
integer i, j
integer, parameter :: n=64, k=8
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) big, err
real(kind(1d0)) :: a(n,n), b(n,1), d(n), res(n,k), temp(n,n), &
   v(n,k), y(n*n)
type(d_options) :: iopti(2)=d_options(0,zero)
! Generate a random self-adjoint matrix.
call rand_gen(y)
a = reshape(y,(/n,n/))
a = a + transpose(a)
! Compute just the eigenvalues.
call lin_eig_self(a, d)
do i=1, k
! Define a temporary array to hold the matrices A - eigenvalue*I.
temp = a
do j=1, n
temp(j,j) = temp(j,j) - d(i)
end do
! Use packaged option to reset the value of a small diagonal.
iopti(1) = d_options(d_lin_sol_self_set_small,&
epsilon(one)*abs(d(i)))
! Use packaged option to skip singularity messages.
iopti(2) = d_options(d_lin_sol_self_no_sing_mess,&
zero)
call rand_gen(b(1:n,1))
call lin_sol_self(temp, b, v(1:,i:i),&
iopt=iopti)
end do
! Orthogonalize the eigenvectors.
do i=1, k
big = maxval(abs(v(1:,i)))
v(1:,i) = v(1:,i)/big
v(1:,i) = v(1:,i)/sqrt(sum(v(1:,i)**2))
if (i == k) cycle
v(1:,i+1:k) = v(1:,i+1:k) + &
spread(-matmul(v(1:,i),v(1:,i+1:k)),1,n)* &
spread(v(1:,i),2,k-i)
end do
do i=k-1, 1, -1
v(1:,i+1:k) = v(1:,i+1:k) + &
spread(-matmul(v(1:,i),v(1:,i+1:k)),1,n)* &
spread(v(1:,i),2,k-i)
end do
! Check the results for both orthogonality of vectors and small
! residuals.
res(1:k,1:k) = matmul(transpose(v),v)
do i=1,k
res(i,i)=res(i,i)-one
end do
err = sum(abs(res))/k**2
res = matmul(a,v) - v*spread(d(1:k),1,n)
if (err <= sqrt(epsilon(one))) then
if (sum(abs(res))/abs(d(1)) <= sqrt(epsilon(one))) then
write (*,*) 'Example 3 for LIN_EIG_SELF is correct.'
end if
end if
end
Output
Example 3 for LIN_EIG_SELF is correct.
Example 4: Analysis and Reduction of a Generalized Eigensystem
A generalized eigenvalue problem is Ax = λBx, where A and B are n × n self-adjoint matrices. The matrix B is
positive definite. This problem is reduced to an ordinary self-adjoint eigenvalue problem Cy = λy by changing the variables of the generalized problem to an equivalent form. The eigenvalue-eigenvector
decomposition B = VSVᵀ is first computed, labeling an eigenvalue too small if it is less than epsilon(1.d0).
The ordinary self-adjoint eigenvalue problem is Cy = λy provided that the rank of B, based on this definition
of Small, has the value n. In that case,
C = DVᵀAVD
where
D = S⁻¹/²
The relationship between x and y is summarized as X = VDY, computed after the ordinary eigenvalue problem is solved for the eigenvectors Y of C. The matrix X is normalized so that each column has Euclidean
length of value one. This solution method is nonstandard for any but the most ill-conditioned matrices B. The
standard approach is to compute an ordinary self-adjoint problem following computation of the Cholesky
decomposition
B = RᵀR
where R is upper triangular. The computation of C can also be completed efficiently by exploiting its self-adjoint property. See Golub and Van Loan (1989, Chapter 8) for more information. Also, see
operator_ex28, Chapter 10.
use lin_eig_self_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_EIG_SELF.
integer i
integer, parameter :: n=64
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)) b_sum
real(kind(1e0)), dimension(n,n) :: A, B, C, D(n), lambda(n), &
   S(n), vb_d, X, ytemp(n*n), res
! Generate random self-adjoint matrices.
call rand_gen(ytemp)
A = reshape(ytemp,(/n,n/))
A = A + transpose(A)
call rand_gen(ytemp)
B = reshape(ytemp,(/n,n/))
B = B + transpose(B)
b_sum = sqrt(sum(abs(B**2))/n)
! Add a scalar matrix so B is positive definite.
do i=1, n
B(i,i) = B(i,i) + b_sum
end do
! Get the eigenvalues and eigenvectors for B.
call lin_eig_self(B, S, v=vb_d)
! For full rank problems, convert to an ordinary self-adjoint
! problem. (All of these examples are full rank.)
if (S(n) > epsilon(one)) then
D = one/sqrt(S)
C = spread(D,2,n)*matmul(transpose(vb_d), &
matmul(A,vb_d))*spread(D,1,n)
! Get the eigenvalues and eigenvectors for C.
call lin_eig_self(C, lambda, v=X)
! Compute the generalized eigenvectors.
X = matmul(vb_d,spread(D,2,n)*X)
! Normalize the eigenvectors for the generalized problem.
X = X * spread(one/sqrt(sum(X**2,dim=2)),1,n)
res = matmul(A,X) - &
   matmul(B,X)*spread(lambda,1,n)
! Check the results.
if (sum(abs(res))/(sum(abs(A))+sum(abs(B))) <= &
sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_EIG_SELF is correct.'
end if
end if
end
Output
Example 4 for LIN_EIG_SELF is correct.
LIN_EIG_GEN
Computes the eigenvalues of an n × n matrix, A. Optionally, the eigenvectors of A or Aᵀ are computed. Using
the eigenvectors of A gives the decomposition AV = VE, where V is an n × n complex matrix of eigenvectors,
and E is the complex diagonal matrix of eigenvalues. Other options include the reduction of A to upper triangular or Schur form, reduction to block upper triangular form with 2 × 2 or unit sized diagonal block
matrices, and reduction to upper Hessenberg form.
Required Arguments
A — Array of size n × n containing the matrix. (Input [/Output])
E — Array of size n containing the eigenvalues. These complex values are in order of decreasing absolute
value. The signs of imaginary parts of the eigenvalues are in no predictable order. (Output)
Optional Arguments
NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = SIZE(A, 1)
v = V(:,:) (Output)
Returns the complex array of eigenvectors for the matrix A.
v_adj = U(:,:) (Output)
Returns the complex array of eigenvectors for the matrix Aᵀ. Thus the residuals
S ≡ AᵀU − UE
are small.
tri = T(:,:) (Output)
Returns the complex upper-triangular matrix T associated with the reduction of the matrix A to Schur
form. Optionally a unitary matrix W is returned in array V(:,:) such that the residuals Z = AW – WT
are small.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing optional data to the
routine. The options are as follows:
Packaged Options for LIN_EIG_GEN

Option Prefix = ?     Option Name                       Option Value
s_, d_, c_, z_        lin_eig_gen_set_small             1
s_, d_, c_, z_        lin_eig_gen_overwrite_input       2
s_, d_, c_, z_        lin_eig_gen_scan_for_NaN          3
s_, d_, c_, z_        lin_eig_gen_no_balance            4
s_, d_, c_, z_        lin_eig_gen_set_iterations        5
s_, d_, c_, z_        lin_eig_gen_in_Hess_form          6
s_, d_, c_, z_        lin_eig_gen_out_Hess_form         7
s_, d_, c_, z_        lin_eig_gen_out_block_form        8
s_, d_, c_, z_        lin_eig_gen_out_tri_form          9
s_, d_, c_, z_        lin_eig_gen_continue_with_V       10
s_, d_, c_, z_        lin_eig_gen_no_sorting            11
iopt(IO) = ?_options(?_lin_eig_gen_set_small, Small)
This is the tolerance used to declare off-diagonal values effectively zero compared with the size of the
numbers involved in the computation of a shift.
Default: Small = epsilon(), the relative accuracy of arithmetic
iopt(IO) = ?_options(?_lin_eig_gen_overwrite_input, ?_dummy)
Does not save the input array A(:, :).
Default: The array is saved.
iopt(IO) = ?_options(?_lin_eig_gen_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) == .true.
See the isNaN() function, Chapter 10.
Default: The array is not scanned for NaNs.
iopt(IO) = ?_options(?_lin_eig_gen_no_balance, ?_dummy)
The input matrix is not preprocessed by searching for isolated eigenvalues followed by rescaling. See
Golub and Van Loan (1989, Chapter 7) for references. With some optional uses of the routine, this
option flag is required.
Default: The matrix is first balanced.
iopt(IO) = ?_options(?_lin_eig_gen_set_iterations, ?_dummy)
Resets the maximum number of iterations permitted to isolate each diagonal block matrix.
Default: The maximum number of iterations is 52.
iopt(IO) = ?_options(?_lin_eig_gen_in_Hess_form, ?_dummy)
The input matrix is in upper Hessenberg form. This flag is used to avoid the initial reduction phase
which may not be needed for some problem classes.
Default: The matrix is first reduced to Hessenberg form.
iopt(IO) = ?_options(?_lin_eig_gen_out_Hess_form, ?_dummy)
The output matrix is transformed to upper Hessenberg form, H₁. If the optional argument
“v=V(:,:)” is passed by the calling program unit, then the array V(:,:) contains an orthogonal
matrix Q₁ such that
AQ₁ − Q₁H₁ ≅ 0
Requires the simultaneous use of option ?_lin_eig_gen_no_balance.
Default: The matrix is reduced to diagonal form.
iopt(IO) = ?_options(?_lin_eig_gen_out_block_form, ?_dummy)
The output matrix is transformed to upper Hessenberg form, H₂, which is block upper triangular. The
dimensions of the blocks are either 2 × 2 or unit sized. Nonzero subdiagonal values of H₂ determine
the size of the blocks. If the optional argument “v=V(:,:)” is passed by the calling program unit,
then the array V(:,:) contains an orthogonal matrix Q₂, such that
AQ₂ − Q₂H₂ ≅ 0
Requires the simultaneous use of option ?_lin_eig_gen_no_balance.
Default: The matrix is reduced to diagonal form.
iopt(IO) = ?_options(?_lin_eig_gen_out_tri_form, ?_dummy)
The output matrix is transformed to upper-triangular form, T. If the optional argument “v=V(:,:)”
is passed by the calling program unit, then the array V(:,:) contains a unitary matrix W such that
AW – WT ≅ 0. The upper triangular matrix T is returned in the optional argument “tri=T(:,:)”.
The eigenvalues of A are the diagonal entries of the matrix T. They are in no particular order. The output array E(:) is blocked with NaNs using this option. This option requires the simultaneous use of
option ?_lin_eig_gen_no_balance.
Default: The matrix is reduced to diagonal form.
iopt(IO) = ?_options(?_lin_eig_gen_continue_with_V, ?_dummy)
As a convenience or for maintaining efficiency, the calling program unit sets the optional argument
“v=V(:,:)” to a matrix that has transformed a problem to the similar matrix VᵀAV. The contents of
V(:,:) are updated by the transformations used in the algorithm. Requires the simultaneous use of
option ?_lin_eig_gen_no_balance.
Default: The array V(:,:) is initialized to the identity matrix.
iopt(IO) = ?_options(?_lin_eig_gen_no_sorting, ?_dummy)
Does not sort the eigenvalues as they are isolated by solving the 2 × 2 or unit sized blocks. This will
have the effect of guaranteeing that complex conjugate pairs of eigenvalues are adjacent in the array
E(:).
Default: The entries of E(:) are sorted so they are non-increasing in absolute value.
FORTRAN 90 Interface
Generic:      CALL LIN_EIG_GEN (A, E [,…])
Specific:     The specific interface names are S_LIN_EIG_GEN, D_LIN_EIG_GEN, C_LIN_EIG_GEN, and Z_LIN_EIG_GEN.
Description
The input matrix A is first balanced. The resulting similar matrix is transformed to upper Hessenberg form
using orthogonal transformations. The double-shifted QR algorithm transforms the Hessenberg matrix so
that 2 × 2 or unit sized blocks remain along the main diagonal. Any off-diagonal entry that is classified as “small”
in order to achieve this block form is set to the value zero. Next, the block upper triangular matrix is transformed to upper triangular form with unitary rotations. The eigenvectors of the upper triangular matrix are
computed using back substitution. Care is taken to avoid overflows during this process. At the end, eigenvectors are normalized to have Euclidean length one, with the largest component real and positive. This
algorithm follows that given in Golub and Van Loan, (1989, Chapter 7), with some novel organizational
details for additional options, efficiency and robustness.
Fatal, Terminal, and Warning Error Messages
See the messages.gls file for error messages for LIN_EIG_GEN. These error messages are numbered 841–858;
861–878; 881–898; 901–918.
Examples
Example 1: Computing Eigenvalues
The eigenvalues of a random real matrix are computed. These values define a complex diagonal matrix E.
Their correctness is checked by obtaining the eigenvector matrix V and verifying that the residuals
R = AV - VE are small. Also, see operator_ex29, supplied with the product examples.
use lin_eig_gen_int
use rand_gen_int
implicit none
! This is Example 1 for LIN_EIG_GEN.
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)) A(n,n), y(n*n), err
complex(kind(1d0)) E(n), V(n,n), E_T(n)
type(d_error) :: d_epack(16) = d_error(0,0d0)
! Generate a random matrix.
call rand_gen(y)
A = reshape(y,(/n,n/))
! Compute only the eigenvalues.
call lin_eig_gen(A, E)
! Compute the decomposition, A*V = V*values,
! obtaining eigenvectors.
call lin_eig_gen(A, E_T, v=V)
! Use values from the first decomposition, vectors from the
! second decomposition, and check for small residuals.
err = sum(abs(matmul(A,V) - V*spread(E,DIM=1,NCOPIES=n))) &
/ sum(abs(E))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 1 for LIN_EIG_GEN is correct.'
end if
end
Output
Example 1 for LIN_EIG_GEN is correct.
Example 2: Complex Polynomial Equation Roots
The roots of a complex polynomial equation,
f(z) ≡ zⁿ + b1 zⁿ⁻¹ + … + bn = 0
are required. This algebraic equation is formulated as a matrix eigenvalue problem. The equivalent matrix
eigenvalue problem is solved using the upper Hessenberg matrix which has the value zero except in row
number 1 and along the first subdiagonal. The entries in the first row are given by
a1,j = −bj, j = 1, …, n, while those on the first subdiagonal have the value one. This is a companion matrix for
the polynomial. The results are checked by testing for small values of ∣f(ei)∣, i = 1, …, n, at the eigenvalues of
the matrix, which are the roots of f(z). Also, see operator_ex30, supplied with the product examples.
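For reference, the companion matrix assembled in the code below has the standard form, with the negated coefficients in row one and ones on the first subdiagonal:
\[
A \;=\;
\begin{pmatrix}
-b_1 & -b_2 & \cdots & -b_{n-1} & -b_n \\
 1   &  0   & \cdots &  0       &  0   \\
 0   &  1   & \cdots &  0       &  0   \\
\vdots &    & \ddots &          & \vdots \\
 0   &  0   & \cdots &  1       &  0
\end{pmatrix},
\qquad
\det\left(zI - A\right) \;=\; z^{\,n} + \sum_{k=1}^{n} b_k\, z^{\,n-k} .
\]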
use lin_eig_gen_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_EIG_GEN.
integer i
integer, parameter :: n=12
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) err, t(2*n)
type(d_options) :: iopti(1)=d_options(0,zero)
complex(kind(1d0)) a(n,n), b(n), e(n), f(n), fg(n)
call rand_gen(t)
b = cmplx(t(1:n),t(n+1:),kind(one))
! Define the companion matrix with polynomial coefficients
! in the first row.
a = zero
do i=2, n
a(i,i-1) = one
end do
a(1,1:n) = -b
! Note that the input companion matrix is upper Hessenberg.
iopti(1) = d_options(z_lin_eig_gen_in_Hess_form,zero)
! Compute complex eigenvalues of the companion matrix.
call lin_eig_gen(a, e, iopt=iopti)
f=one; fg=one
! Use Horner's method for evaluation of the complex polynomial
! and size gauge at all roots.
do i=1, n
f = f*e + b(i)
fg = fg*abs(e) + abs(b(i))
end do
! Check for small errors at all roots.
err = sum(abs(f/fg))/n
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 2 for LIN_EIG_GEN is correct.'
end if
end
Output
Example 2 for LIN_EIG_GEN is correct.
Example 3: Solving Parametric Linear Systems with a Scalar Change
The efficient solution of a family of linear algebraic equations is required. These systems are (A + hI)x = b.
Here A is an n × n real matrix, I is the identity matrix, and b is the right-hand side matrix. The scalar h is such
that the coefficient matrix is nonsingular. The method is based on the Schur form for matrix A: AW = WT,
where W is unitary and T is upper triangular. This provides an efficient solution method for several values of
h, once the Schur form is computed. The solution steps solve, for y, the upper triangular linear system
(T + hI)y = Wᴴb
Then, x = x(h) = Wy. This is an efficient and accurate method for such parametric systems provided the
expense of computing the Schur form has a pay-off in later efficiency. Using the Schur form in this way, it is
not required to compute an LU factorization of A + hI with each new value of h. Note that even if the data A,
h, and b are real, subexpressions for the solution may involve complex intermediate values, with x(h) finally a
real quantity. Also, see operator_ex31, supplied with the product examples.
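The algebra behind these solution steps is short. Writing the Schur form as A = WTWᴴ with W unitary:
\[
(A + hI)\,x = b
\;\Longleftrightarrow\;
W\,(T + hI)\,W^{H} x = b
\;\Longleftrightarrow\;
(T + hI)\,y = W^{H} b, \qquad x = W y ,
\]
so each new value of h requires only one triangular back substitution rather than a new factorization.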
use lin_eig_gen_int
use lin_sol_gen_int
use rand_gen_int
implicit none
! This is Example 3 for LIN_EIG_GEN.
integer i
integer, parameter :: n=32, k=2
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1e0)) a(n,n), b(n,k), x(n,k), temp(n*max(n,k)), h, err
type(s_options) :: iopti(2)
complex(kind(1e0)) w(n,n), t(n,n), e(n), z(n,k)
call rand_gen(temp)
a = reshape(temp,(/n,n/))
call rand_gen(temp)
b = reshape(temp,(/n,k/))
iopti(1) = s_options(s_lin_eig_gen_out_tri_form,zero)
iopti(2) = s_options(s_lin_eig_gen_no_balance,zero)
! Compute the Schur decomposition of the matrix.
call lin_eig_gen(a, e, v=w, tri=t, &
iopt=iopti)
! Choose a value so that A+h*I is non-singular.
h = one
! Solve for (A+h*I)x=b using the Schur decomposition.
z = matmul(conjg(transpose(w)),b)
! Solve intermediate upper-triangular system with implicit
! additive diagonal, h*I. This is the only dependence on
! h in the solution process.
do i=n,1,-1
z(i,1:k) = z(i,1:k)/(t(i,i)+h)
z(1:i-1,1:k) = z(1:i-1,1:k) + &
spread(-t(1:i-1,i),dim=2,ncopies=k)* &
spread(z(i,1:k),dim=1,ncopies=i-1)
end do
! Compute the solution. It should be the same as x, but will not be
! exact due to rounding errors. (The quantity real(z,kind(one)) is
! the real-valued answer when the Schur decomposition method is used.)
z = matmul(w,z)
! Compute the solution by solving for x directly.
do i=1, n
a(i,i) = a(i,i) + h
end do
call lin_sol_gen(a, b, x)
! Check that x and z agree approximately.
err = sum(abs(x-z))/sum(abs(x))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 3 for LIN_EIG_GEN is correct.'
end if
end
Output
Example 3 for LIN_EIG_GEN is correct.
Example 4: Accuracy Estimates of Eigenvalues Using Adjoint and Ordinary Eigenvectors
A matrix A has entries that are subject to uncertainty. This is expressed as the realization that A can be
replaced by the matrix A + ηB, where the value η is “small” but still significantly larger than machine precision. The matrix B satisfies ∥B∥ ≤ ∥A∥. A variation in eigenvalues is estimated using analysis found in Golub
and Van Loan, (1989, Chapter 7, p. 344). Each eigenvalue and eigenvector is expanded in a power series in η.
With
ei(η) ≅ ei + η ėi
and normalized eigenvectors, the bound
∣ėi∣ ≤ ∥A∥ / ∣uiᴴvi∣
is satisfied. The vectors ui and vi are the ordinary and adjoint eigenvectors associated respectively with ei and
its complex conjugate. This gives an upper bound on the size of the change to each ∣ei∣ due to changing the
matrix data. The reciprocal
∣uiᴴvi∣⁻¹
is defined as the condition number of ei. Also, see operator_ex32, Chapter 10.
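The bound quoted above is the standard first-order perturbation result cited from Golub and Van Loan (1989). A brief derivation, taking ui here to be the vector satisfying uiᴴA = eiuiᴴ and vi the ordinary eigenvector, both of Euclidean length one:
\[
(A + \eta B)\,v_i(\eta) = e_i(\eta)\,v_i(\eta)
\;\;\xrightarrow{\;\text{differentiate at}\ \eta = 0\;}\;\;
u_i^{H} B\, v_i = \dot e_i\, u_i^{H} v_i
\;\;\Longrightarrow\;\;
\left|\dot e_i\right| \;\le\; \frac{\lVert B\rVert}{\left|u_i^{H} v_i\right|}
\;\le\; \frac{\lVert A\rVert}{\left|u_i^{H} v_i\right|} ,
\]
where the terms containing the eigenvector derivative cancel after multiplying on the left by uiᴴ.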
use lin_eig_gen_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_EIG_GEN.
integer i
integer, parameter :: n=17
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)) a(n,n), c(n,n), variation(n), y(n*n), temp(n), &
norm_of_a, eta
complex(kind(1d0)), dimension(n,n) :: e(n), d(n), u, v
! Generate a random matrix.
call rand_gen(y)
a = reshape(y,(/n,n/))
! Compute the eigenvalues, left- and right- eigenvectors.
call lin_eig_gen(a, e, v=v, v_adj=u)
! Compute condition numbers and variations of eigenvalues.
norm_of_a = sqrt(sum(a**2)/n)
do i=1, n
variation(i) = norm_of_a/abs(dot_product(u(1:n,i), &
v(1:n,i)))
end do
! Now perturb the data in the matrix by the relative factors
! eta=sqrt(epsilon) and solve for values again. Check the
! differences compared to the estimates. They should not exceed
! the bounds.
eta = sqrt(epsilon(one))
do i=1, n
call rand_gen(temp)
c(1:n,i) = a(1:n,i) + (2*temp - 1)*eta*a(1:n,i)
end do
call lin_eig_gen(c,d)
! Looking at the differences of absolute values accounts for
! switching signs on the imaginary parts.
if (count(abs(d)-abs(e) > eta*variation) == 0) then
write (*,*) 'Example 4 for LIN_EIG_GEN is correct.'
end if
end
Output
Example 4 for LIN_EIG_GEN is correct.
LIN_GEIG_GEN
Computes the generalized eigenvalues of an n × n matrix pencil, Av = λBv. Optionally, the generalized eigenvectors are computed. If either of A or B is nonsingular, there are diagonal matrices α and β, and a complex
matrix V, all computed such that AVβ = BVα.
Required Arguments
A — Array of size n × n containing the matrix A. (Input [/Output])
B — Array of size n × n containing the matrix B. (Input [/Output])
ALPHA — Array of size n cont