benchmarks
comparison of different eigensolvers
Here we show results comparing different eigensolver strategies for six systems, whose properties are listed in the table below. The calculations were performed on an IBM SP 5 with 8 processors per node.
system | number of atoms | number of plane waves | E_ref (eV) | E_cut (Ryd) | N1xN2xN3 | number of eigenstates | number of processors |
20Cd19Se | 39 | 11,331 | -4.8 | 6.88 | 39x79x39 | 10 | 8 |
83Cd81Se | 164 | 34,143 | -4.8 | 6.88 | 164x276x164 | 10 | 16 |
232Cd235Se | 467 | 75,645 | -4.8 | 6.88 | 467x697x467 | 10 | 16 |
534Cd527Se | 1071 | 141,625 | -3.8 (CBM), -4.8 (VBM) | 6.88 | 1061x1467x1061 | 10 | 32 |
dot5 | 1327 | 2,717,000 | +0.6 (CBM), -0.4 (VBM) | 35.0 | - | 6 | 64 |
Qwire | 66,624 | 2,266,000 | -5.1 (CBM), -5.4 (VBM) | 5.0 | - | 5 | 64 |
General information and remarks:
- In the following tables:
- PCG is the Preconditioned Conjugate Gradient algorithm, LOBPCG is (our implementation of) the Locally Optimal Block Preconditioned Conjugate Gradient algorithm, PARPACK is the (parallel) Implicitly Restarted Arnoldi (or Lanczos, IRL) algorithm, and PRIMME is the PReconditioned Iterative MultiMethod Eigensolver (a).
- nline is the number of line minimizations per iteration in the CG algorithm, basis size is the maximum basis size allowed in each iteration (for PARPACK and PRIMME only), restart size is the minimum basis size for restarting (for PRIMME only), matvecs is the total number of matrix-vector multiplications performed, and time is the wall clock time required by the eigensolver phase of ESCAN only.
- Two of the eigensolvers, namely PARPACK and PRIMME, can also be applied to the unfolded spectrum, and we include some results for comparison purposes.
- We underline the minimum matvecs and time for the algorithm that successfully returns the eigenstates with the required tolerance when using the folded spectrum approach. We use italics for the minimum matvecs and time when the algorithms are applied to the unfolded spectrum.
- A pair (E,w) is declared an eigenpair of H when r(E,w)=|| Hw-E*w || <= tol. However, this criterion raises a potential issue when using the folded spectrum approach. While in practice one cares about the (final) residual r(E,w) being small with respect to H, in "black box" software the convergence criterion may actually be based on the operator (H-E_ref*I)^2. This is the case for PRIMME, which means that PRIMME may have to do some more work for some systems, i.e. it may require a tighter tol (see Qwire for example) (a,b). On the other hand, if there is a good guess for the norm of H, then it can be used to set tol more appropriately.
- Unless otherwise noted, tol=1.0E-6 in the tables below.
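The difference between the residual with respect to H and the residual with respect to the folded operator, i.e. the square of (H-E_ref*I), can be illustrated with a tiny dense example. The sketch below is not ESCAN or PRIMME code: the 2x2 diagonal matrix, the reference energy and the slightly perturbed trial vector are made-up values, chosen only to show that a pair accepted on the folded operator may still miss tol when measured against H.

```python
import numpy as np

def rayleigh_residual(op, w):
    """Return (theta, ||op w - theta w||), with theta the Rayleigh quotient of w."""
    w = w / np.linalg.norm(w)
    theta = w @ op @ w
    return theta, np.linalg.norm(op @ w - theta * w)

# Illustrative values only (assumptions, not taken from the benchmarks).
e_ref = -4.8
H = np.diag([-4.9, -4.6])            # two eigenvalues close to e_ref
shifted = H - e_ref * np.eye(2)
folded = shifted @ shifted           # (H - e_ref*I)^2

# Trial vector: the exact eigenvector for -4.9 with a small error component.
angle = 5.0e-6
w = np.array([np.cos(angle), np.sin(angle)])

tol = 1.0e-6
E, r_H = rayleigh_residual(H, w)
_, r_folded = rayleigh_residual(folded, w)

print(f"E                     = {E:.6f}")
print(f"residual w.r.t. H      = {r_H:.3e}")
print(f"residual w.r.t. folded = {r_folded:.3e}")
print("accepted on folded operator:", r_folded <= tol)
print("accepted on H:              ", r_H <= tol)
```

In this toy case the folded residual is about ten times smaller than the residual with respect to H, so a test on the folded operator would accept the pair at tol=1.0E-6 while r(E,w) is still above tol.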
(a) See State-of-the-art Eigensolvers for Electronic Structure Calculations of Large Scale Nano-systems, and references therein, for more information.
(b) A premature acceptance of not yet fully converged pairs (E,w) may also lead to numerical problems. This is particularly important for PRIMME, since it implements algorithms that are different in nature from PCG and LOBPCG.
Conclusions:
- PCG and LOBPCG are reliable for finding the states closest to the band gap. However, they may "stagnate", i.e. they may fail to make approximate eigenpairs converge to the desired accuracy (see dot5 for example).
- IRL converges slowly for most folded spectrum computations. In unfolded computations with the original operator, IRL sometimes shows mis-convergence (that is, convergence to the wrong eigenvalues, resulting in relevant eigenpairs being missed).
- The GD+k (Olsen algorithm) method from PRIMME (that is, the option PRIMME MIN_MATVECS) is reliable for both folded spectrum and unfolded computations. However, more research has to be done in order to improve its performance for unfolded computations.
Based on the systems we have studied and the number of eigenstates we have computed, we recommend the default PRIMME MIN_MATVECS, with the restart size close to the number of required eigenstates, and a basis size equal to 2 or 3 times the restart size. Also, for large systems, it is important to set tol to a smaller value than would normally be set for PCG or LOBPCG.
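As a small illustration of the sizing rule above, the following sketch turns it into a helper. The function name, its arguments and the returned dictionary keys are hypothetical and do not correspond to any PRIMME or ESCAN interface. The recommendation on tol is not encoded, since no numerical rule is given for it.

```python
def suggest_primme_sizes(num_eigenstates, basis_factor=2):
    """Hypothetical helper: restart size equal to the number of required
    eigenstates, basis size equal to 2 or 3 times the restart size."""
    if num_eigenstates < 1:
        raise ValueError("num_eigenstates must be at least 1")
    if basis_factor not in (2, 3):
        raise ValueError("basis_factor must be 2 or 3")
    restart_size = num_eigenstates
    basis_size = basis_factor * restart_size
    return {"restart_size": restart_size, "basis_size": basis_size}

# Ten eigenstates with a factor of 3 gives restart size 10 and basis size 30.
print(suggest_primme_sizes(10, basis_factor=3))
```

With ten eigenstates and a factor of 3 this gives the same restart size and basis size as the PRIMME rows of the 83Cd81Se tables.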
20Cd19Se (atomic coordinates, pseudopotential for Cd, pseudopotential for Se)
a) Folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(1) | 100 | - | - | 4956 | 9.4 |
LOBPCG(1) | - | - | - | 4756 | 19.3 |
PARPACK(2) | - | 20 | - | 14630 | 27.2 |
PARPACK(3) | - | 25 | - | 9712 | 18.1 |
PARPACK(3) | - | 30 | - | 7474 | 14.1 |
PARPACK(3) | - | 35 | - | 5838 | 11.1 |
PRIMME MIN_MATVECS(1) | - | 16 | 8 | 1750 | 3.9 |
PRIMME MIN_TIME(1) | - | 16 | 8 | 4720 | 8.0 |
b) Unfolded spectrum:
algorithm | basis size | restart size | matvecs | time (s) |
PARPACK(4) | 20 | - | 10020 | 20.6 |
PARPACK(5) | 25 | - | 1326 | 2.9 |
PARPACK(5) | 30 | - | 1310 | 2.9 |
PARPACK(5) | 35 | - | 1293 | 2.9 |
PRIMME MIN_MATVECS(1) | 16 | 8 | 4185 | 10.0 |
PRIMME MIN_TIME(1) | 16 | 8 | 3350 | 7.0 |
(1) Converges to the eigenstates -6.19176, -6.19176, -6.34729, -6.38668, -6.43238, -6.43238, -6.60944, -6.60945, -6.71546 and -6.71546.
(2) Converges to only 8 eigenstates if the maximum number of iterations (restarts) is set to 1000: -6.43238, -6.60945, -6.60944, -6.71546, -6.71546, -6.88809, -6.91577 and -6.98363.
(3) Converges to the eigenstates -6.43238, -6.43238, -6.60944, -6.60945, -6.71546, -6.71546, -6.88809, -6.91577, -6.98363 and -7.08253.
(4) Does not converge to any eigenstate with the maximum number of iterations (restarts) set to 1000.
(5) Converges to the eigenstates -6.19176, -6.34729, -6.38668, -6.43238, -6.43238, -6.60944, -6.60945, -6.71546, -6.71546 and -6.88809.
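The lists in footnotes (1) and (5) differ by one entry. A quick way to see which one is to compare them as multisets, so that degenerate eigenvalues are counted with their multiplicity. The sketch below copies the values from those two footnotes. Treating list (1) as the reference is an assumption made only for this illustration, and the function name is hypothetical.

```python
from collections import Counter

# Eigenvalues copied from footnotes (1) and (5) of the 20Cd19Se tables.
reference = [-6.19176, -6.19176, -6.34729, -6.38668, -6.43238,
             -6.43238, -6.60944, -6.60945, -6.71546, -6.71546]
parpack_unfolded = [-6.19176, -6.34729, -6.38668, -6.43238, -6.43238,
                    -6.60944, -6.60945, -6.71546, -6.71546, -6.88809]

def multiset_difference(expected, found, digits=5):
    """Return (missing, extra) eigenvalue counts, respecting multiplicity."""
    exp = Counter(round(e, digits) for e in expected)
    got = Counter(round(e, digits) for e in found)
    return dict(exp - got), dict(got - exp)

missing, extra = multiset_difference(reference, parpack_unfolded)
print("in (1) but not in (5):", missing)
print("in (5) but not in (1):", extra)
```

The comparison shows that list (5) contains only one copy of -6.19176 and instead includes -6.88809.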
83Cd81Se (atomic coordinates, pseudopotential for Cd, pseudopotential for Se)
a) Folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(1) | 100 | - | - | 17920 | 65.6 |
PCG(1) | 200 | - | - | 15096 | 52.7 |
LOBPCG(1) | - | - | - | 10688 | 69.9 |
PARPACK(2) | - | 50 | - | 24252 | 86.7 |
PARPACK(3) | - | 100 | - | 15126 | 60.3 |
PRIMME MIN_MATVECS(1) | - | 30 | 10 | 3670 | 12.7 |
PRIMME MIN_TIME(1) | - | 30 | 10 | 11808 | 36.7 |
b) Unfolded spectrum:
algorithm | basis size | restart size | matvecs | time (s) |
PARPACK(4) | 50 | - | 4073 | 16.7 |
PARPACK(3) | 100 | - | 3273 | 15.7 |
PRIMME MIN_MATVECS(1) | 30 | 10 | 5077 | 23.2 |
PRIMME MIN_TIME(1) | 30 | 10 | 5059 | 19.6 |
(1) Converges to the eigenstates -5.72654, -5.72654, -5.78686, -5.83003, -5.83003, -5.85207, -5.98438, -6.01278, -6.01278 and -6.02422.
(2) Converges to the eigenstates -5.83003, -5.83003, -5.85207, -5.98438, -6.01278, -6.01278, -6.02422, -6.02751, -6.02751 and -6.11332.
(3) Converges to the eigenstates -5.83003, -5.83003, -5.85207, -5.98438, -6.01278, -6.01278, -6.02422, -6.02422, -6.02751 and -6.02751.
(4) Converges to the eigenstates -5.83003, -5.98438, -6.01278, -6.01278, -6.02422, -6.02422, -6.02751, -6.02751, -6.07613 and -6.11332.
232Cd235Se (atomic coordinates, pseudopotential for Cd, pseudopotential for Se)
a) Folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(1) | 200 | - | - | 15754 | 106.4 |
LOBPCG(1) | - | - | - | 11864 | 121.4 |
PARPACK(2) | - | 30 | - | 20060 | 133.0 |
PRIMME MIN_MATVECS(1) | - | 16 | 8 | 3742 | 25.0 |
PRIMME MIN_TIME(1) | - | 16 | 8 | 11708 | 73.4 |
b) Unfolded spectrum:
algorithm | basis size | restart size | matvecs | time (s) |
PARPACK(1) | 30 | - | 6205 | 47.6 |
PRIMME MIN_MATVECS(1) | 16 | 8 | 11661 | 94.0 |
PRIMME MIN_TIME(1) | 16 | 8 | 8736 | 61.6 |
(1) Converges to the eigenstates -5.51570, -5.51570, -5.53926, -5.58286, -5.58286, -5.60869, -5.67889, -5.69688, -5.69688 and -5.71672.
(2) Does not converge to any eigenstate with the maximum number of iterations (restarts) set to 500.
534Cd527Se (atomic coordinates, pseudopotential for Cd, pseudopotential for Se)
a) CBM, folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(1) | 200 | - | - | 7066 | 62.5 |
LOBPCG(1) | - | - | - | 6880 | 92.4 |
PARPACK(2) | - | 30 | - | 9362 | 86.1 |
PRIMME MIN_MATVECS(1) | - | 16 | 8 | 2176 | 21.3 |
PRIMME MIN_TIME(1) | - | 16 | 8 | 6212 | 52.3 |
b) CBM, unfolded spectrum:
algorithm | basis size | restart size | matvecs | time (s) |
PARPACK(1) | 30 | - | 2434 | 24.4 |
PRIMME MIN_MATVECS(1) | 16 | 8 | 12813 | 143.7 |
PRIMME MIN_TIME(1) | 16 | 8 | 6383 | 59.2 |
c) VBM, folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(3) | 200 | - | - | 23810 | 228.0 |
LOBPCG(3) | - | - | - | 16862 | 254.7 |
PARPACK(4) | - | 30 | - | 20060 | 190.9 |
PRIMME MIN_MATVECS(3) | - | 16 | 8 | 4762 | 46.0 |
PRIMME MIN_TIME(3) | - | 16 | 8 | 11259 | 109.1 |
d) VBM, unfolded spectrum:
algorithm | basis size | restart size | matvecs | time (s) |
PARPACK(5) | 30 | - | 7450 | 73.2 |
PRIMME MIN_MATVECS(3) | 16 | 8 | 15449 | 172.7 |
PRIMME MIN_TIME(3) | 16 | 8 | 11259 | 98.8 |
(1) Converges to the eigenstates -3.10118, -2.83770, -2.81099, -2.81099, -2.56043, -2.52280, -2.52280, -2.51159, -2.51159 and -2.37371.
(2) Converges to the eigenstates -2.81099, -2.81099, -2.56043, -2.52280, -2.52280, -2.51159, -2.51159, -2.37371, -2.28992 and -2.23882.
(3) Converges to the eigenstates -5.39076, -5.39076, -5.40313, -5.44361, -5.44361, -5.48316, -5.49335, -5.51804, -5.51804 and -5.52054.
(4) Does not converge to any eigenstate with the maximum number of iterations set to 500.
(5) Converges to the eigenstates -5.39076, -5.39076, -5.44361, -5.44361, -5.48316, -5.49335, -5.51804, -5.51804, -5.52054 and -5.52054.
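For the four CdSe systems above, the folded spectrum timings of PCG and PRIMME MIN_MATVECS can be compared directly, since both converge to the same eigenstates in each table. The sketch below only does the arithmetic on the times copied from those tables. For 83Cd81Se the faster PCG run (nline=200) is used, which is a choice made for this illustration.

```python
# (label, PCG time in s, PRIMME MIN_MATVECS time in s), folded spectrum tables above.
timings = [
    ("20Cd19Se", 9.4, 3.9),
    ("83Cd81Se", 52.7, 12.7),        # PCG with nline=200
    ("232Cd235Se", 106.4, 25.0),
    ("534Cd527Se CBM", 62.5, 21.3),
    ("534Cd527Se VBM", 228.0, 46.0),
]

# Ratio of wall clock times; values above 1 mean PRIMME MIN_MATVECS was faster.
ratios = {label: pcg / primme for label, pcg, primme in timings}

for label, ratio in ratios.items():
    print(f"{label:16s} PCG / PRIMME MIN_MATVECS time = {ratio:.1f}")
```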
dot5 (atomic coordinates, pseudopotential for Cd, pseudopotential for Se)
a) CBM, folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(1) | 100 | - | - | 220212 | 36726.8 |
PCG(2) | 200 | - | - | 335966 | 53670.2 |
LOBPCG(3) | - | - | - | 106518 | 20996.1 |
LOBPCG(4) | - | - | - | 148486 | 28379.9 |
PRIMME MIN_MATVECS(5) | - | 16 | 8 | 4788 | 823.1 |
PRIMME MIN_MATVECS(6) | - | 16 | 8 | 62334 | 10211.3 |
PRIMME MIN_TIME(7) | - | 16 | 8 | 271492 | 43242.0 |
b) VBM, folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(8) | 100 | - | - | 197712 | 31399.3 |
PCG(8) | 200 | - | - | 101904 | 15670.7 |
LOBPCG(9) | - | - | - | 120030 | 20779.0 |
LOBPCG(9) | - | - | - | 240030 | 41399.9 |
PRIMME MIN_MATVECS(10) | - | 16 | 8 | 11644 | 1926.7 |
PRIMME MIN_MATVECS(11) | - | 16 | 8 | 42910 | 7013.3 |
PRIMME MIN_MATVECS(12) | - | 16 | 8 | 54362 | 8757.7 |
PRIMME MIN_TIME(12) | - | 16 | 8 | 254810 | 39112.0 |
(1) Converges to 1.35724, 1.64617, 1.64617, 1.64617, 1.91451 and 1.92275 with the maximum number of allowed iterations set to 500. For two states r(E,w)~1.0E-3.
(2) Converges to 1.35724, 1.64617, 1.64617, 1.64617, 1.92353 and 1.92354 with the maximum number of allowed iterations set to 500. For two states r(E,w)~1.0E-3.
(3) Similar to (2) with the maximum number of allowed iterations set to 10000. For two states r(E,w)~1.0E-2.
(4) Similar to (2) with the maximum number of allowed iterations set to 20000. For two states r(E,w)~1.0E-2.
(5) Converges to 1.13394, 1.23783, 1.35674, 1.64388, 1.79981 and 1.85452. For two states r(E,w)~1.0E-2.
(6) Converges to 1.35724, 1.64617, 1.64617, 1.64617, 1.92364 and 1.92364 with tol=1.0E-10.
(7) Converges to 1.35724, 1.64617, 1.64617, 1.64617, 1.91777 and 1.92363 with tol=1.0E-10 but for two states r(E,w)~1.0E-3.
(8) Converges to -0.72398, -0.72398, -0.72398, -0.72946, -0.72946 and -0.72946.
(9) Similar to (8) with the maximum number of allowed iterations set to 10000. For six states r(E,w)~1.0E-3.
(10) Converges to -0.72400, -0.72948, -0.77577, -0.78304, -0.81558 and -0.84278. For all states r(E,w)~1.0E-4.
(11) Converges to -0.72398, -0.72398, -0.72398, -0.72946, -0.72946 and -0.72946 with tol=1.0E-9.
(12) Similar to (11); tol=1.0E-10.
Qwire
a) CBM, folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(1) | 100 | - | - | 21931 | 1072.0 |
LOBPCG(1) | - | - | - | 20337 | 1376.8 |
PRIMME MIN_MATVECS(2) | - | 16 | 8 | 5438 | 292.0 |
PRIMME MIN_MATVECS(3) | - | 16 | 8 | 8504 | 418.0 |
PRIMME MIN_TIME(2) | - | 16 | 8 | 16490 | 757.1 |
PRIMME MIN_TIME(3) | - | 16 | 8 | 28076 | 1392.0 |
b) VBM, folded spectrum:
algorithm | nline | basis size | restart size | matvecs | time (s) |
PCG(4) | 100 | - | - | 307270 | 15546.8 |
PCG(4) | 200 | - | - | 149726 | 7278.1 |
LOBPCG(5) | - | - | - | 56207 | 3690.1 |
PRIMME MIN_MATVECS(6) | - | 16 | 8 | 12002 | 606.1 |
PRIMME MIN_MATVECS(6) | - | 20 | 10 | 11596 | 1502.6 |
PRIMME MIN_MATVECS(6) | - | 30 | 10 | 12670 | 2571.6 |
PRIMME MIN_MATVECS(7) | - | 20 | 10 | 24068 | 1240.6 |
PRIMME MIN_MATVECS(7) | - | 30 | 10 | 26326 | 1424.2 |
PRIMME MIN_TIME(6) | - | 16 | 8 | 36310 | 1682.6 |
(1) Converges to -4.89017, -4.71187, -4.68034, -4.68034 and -4.55008.
(2) Similar to (1) but for all states r(E,w)~1.0E-5.
(3) Similar to (1); tol=1.0E-8.
(4) Converges to -5.73241, -5.73241, -5.73423, -5.74245 and -5.74360 with the maximum number of allowed iterations set to 500. For one state r(E,w)~1.0E-5.
(5) Similar to (4) but for all states r(E,w) < tol.
(6) Converges to -5.73241, -5.73423, -5.74249, -5.74365 and -5.75513 but for all states r(E,w)~1.0E-5.
(7) Similar to (4); tol=1.0E-8.