Convergence problem on CRAY-XT4

Jun chen... at googlemail.com
Fri Aug 1 09:46:04 UTC 2008


Hi Teo,
Sorry for the late reply.
I am compiling cp2k on the UK national supercomputing service. Many
users run jobs on it, so it is unlikely that any of the libraries are
corrupted.
The PGI executable was compiled with PGI 7.1.4. A colleague of mine
seems to have found that PGI 7.2.x is buggy. Since the convergence
problem affects both the PGI and the PathScale executables, the
compiler may not be the cause.
As you suggested, I ran a test job of 32 water molecules on the XT4 and
on other machines. Surprisingly, convergence is not a problem on the
XT4 in this case, and the results are almost the same. It looks as if
the convergence problem is system dependent. (The system I am working
on is just a small organic molecule in a water box.)
One possibility is that on the XT4 the parts of the source code that
handle wavefunction optimization are miscompiled for some reason. Is
that possible?

Cheers,
Jun
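
The cross-machine check described above (same input, two machines,
results "almost the same") can be made quantitative. The following is
only a minimal sketch in Python: the function names and the tolerance
are assumptions chosen for illustration, not part of cp2k. The first
series holds energies from the converged SCF cycle quoted below; the
second is a synthetic, slightly perturbed copy standing in for another
machine, since no second data set was posted.

```python
def max_energy_difference(run_a, run_b):
    """Largest absolute difference between paired energies of two runs."""
    n = min(len(run_a), len(run_b))
    if n == 0:
        raise ValueError("both runs need at least one energy value")
    return max(abs(a - b) for a, b in zip(run_a[:n], run_b[:n]))


def runs_agree(run_a, run_b, tol=1.0e-6):
    """True if the two energy series agree within tol (hypothetical default)."""
    return max_energy_difference(run_a, run_b) <= tol


# Energies (Hartree) taken from the converged SCF cycle in the quoted output.
xt4 = [-550.5990644512, -550.6014079478, -550.6018086605]
# Synthetic stand-in for a second machine: the same values shifted by 1e-9.
other = [e + 1.0e-9 for e in xt4]

print(runs_agree(xt4, other))
```

Which tolerance is appropriate depends on the system and on how many MD
steps are compared, so the value above should be treated as a
placeholder.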

On Jul 31, 11:23 am, Teodoro Laino <teodor... at gmail.com> wrote:
> Dear Jun,
>
> With the little information you've posted, I would consider the
> possibility that you are linking against corrupted libraries.
> Having an executable does not necessarily mean that it's gonna work!
>
> You could double-check, for example, by building a PGI executable on
> the XT4 (care is required in picking the right PGI compiler version!).
> Or additionally run the input H2O.inp for 100-1000 MD steps and check
> that the runs (on XT4-pathscale and
> another machine of your choice) do not diverge! Within numerical
> accuracy they should be similar.
>
> Teo
>
> On 31 Jul 2008, at 12:12, Jun wrote:
>
>
>
> > Hello,
>
> > I recently compiled cp2k on CRAY-XT4 using both PGI and PathScale.
> > I noticed that the wavefunction optimization is very difficult to
> > converge for both executables. The strangest part is that, given the
> > SAME input file, my system converges easily on other machines.
> > I would like to post my input file, but the problem seems to occur
> > only on CRAY-XT4 in my case, so you may not be able to reproduce it.
> > I am therefore posting only the part of my input file that I think is
> > relevant to wavefunction optimization, and the part of my output file
> > where the optimization went wrong:
>
> > #########INPUT################################
> >     &QS
> >       EPS_DEFAULT 1.0E-12
> >       EXTRAPOLATION PS
> >       EXTRAPOLATION_ORDER (2~4)   ## values in parentheses are the ones I tried
> >       MAP_CONSISTENT (T or F)
> >     &END QS
> >     &SCF
> >       SCF_GUESS RESTART
> >       EPS_SCF 1.0E-6
> >       MAX_SCF (50 or 100)
> >       &OUTER_SCF
> >          EPS_SCF 1.0E-6
> >          MAX_SCF (3 or 10)
> >       &END
> >       &OT
> >          MINIMIZER DIIS
> >          PRECONDITIONER FULL_ALL
> >          ENERGY_GAP 0.001
> >       &END
>
> > ###########OUTPUT#########################################
>
> >  SCF WAVEFUNCTION OPTIMIZATION
>
> >   Step  Update method              Time         Convergence         Total energy
>
> > -----------------------------------------------------------------------------
>
> >   ----------------------------------- OT --------------------------------------
>
> >   Allowing for rotations:  F
> >   minimizer      : DIIS                : direct inversion
> >                                          in the iterative subspace
> >                             using      : -   7 diis vectors
> >                                          - safer DIIS on
> >   preconditioner : FULL_ALL            : diagonalization, state selective
> >   precond_solver : DEFAULT
> >   stepsize       :    0.15000000
> >   energy_gap     :    0.00100000
> >   eps_taylor     :   0.10000E-15
> >   max_taylor     :             4
>
> >   ----------------------------------- OT --------------------------------------
> >      1  OT DIIS        0.15E+00    3.40        0.0004048620      -550.5990644512
> >      2  OT DIIS        0.15E+00    1.68        0.0002257035      -550.6014079478
> >      3  OT DIIS        0.15E+00    1.68        0.0001567544      -550.6018086605
> >      4  OT DIIS        0.15E+00    1.69        0.0001478501      -550.6019282717
> >      5  OT DIIS        0.15E+00    1.68        0.0000140029      -550.6020148403
> >      6  OT DIIS        0.15E+00    1.68        0.0000075576      -550.6020158434
> >      7  OT DIIS        0.15E+00    1.68        0.0000015240      -550.6020162166
> >      8  OT DIIS        0.15E+00    1.68        0.0000009637      -550.6020162310
>
> >   *** SCF run converged in     8 steps ***
>
> >  SCF WAVEFUNCTION OPTIMIZATION
>
> >   Step  Update method              Time         Convergence         Total energy
>
> > -----------------------------------------------------------------------------
>
> >   ----------------------------------- OT --------------------------------------
>
> >   Allowing for rotations:  F
> >   minimizer      : DIIS                : direct inversion
> >                                          in the iterative subspace
> >                             using      : -   7 diis vectors
> >                                          - safer DIIS on
> >   preconditioner : FULL_ALL            : diagonalization, state selective
> >   precond_solver : DEFAULT
> >   stepsize       :    0.15000000
> >   energy_gap     :    0.00100000
> >   eps_taylor     :   0.10000E-15
> >   max_taylor     :             4
>
> >   ----------------------------------- OT --------------------------------------
> >      1  OT DIIS        0.15E+00    3.46        0.0003244345      -550.5979524631
> >      2  OT DIIS        0.15E+00    1.72        0.3582792984      -549.0562074774
> >      3  OT DIIS        0.15E+00    1.67        0.0666815909      -550.5510068528
> >      4  OT DIIS        0.15E+00    1.68        0.0187452394      -550.5922361883
> >      5  OT DIIS        0.15E+00    1.68        0.0150188254      -550.5956444599
> >      6  OT DIIS        0.15E+00    1.68        0.0128561075      -550.5977731312
> >      7  OT DIIS        0.15E+00    1.68        0.0040306865      -550.5994603925
> >      8  OT DIIS        0.15E+00    1.73        0.0048897159      -550.5994262759
> >      9  OT DIIS        0.15E+00    1.72        0.5803940138      -543.6958955076
> >     10  OT DIIS        0.15E+00    1.71        0.4790483717      -546.8306205842
> >     11  OT DIIS        0.15E+00    1.71        0.4484756049      -546.9703801129
> >     12  OT DIIS        0.15E+00    1.72        0.2892482006      -549.4963848215
> >     13  OT DIIS        0.15E+00    1.72        0.3169120672      -549.2594466122
> >     14  OT DIIS        0.15E+00    1.72        0.5450061041      -544.1285722851
> >     15  OT DIIS        0.15E+00    1.71        0.6151522120      -536.7805473515
> >     16  OT DIIS        0.15E+00    1.71        0.5325127470      -526.5036282220
> >     17  OT SD          0.15E+00    1.72        0.7054506576      -524.6187705129
> >     18  OT SD          0.15E+00    1.72        0.5975827285      -527.1046228683
> >     19  OT DIIS        0.15E+00    1.72        0.4534629356      -512.6076934035
> >     20  OT SD          0.15E+00    1.72        0.7051716877      -517.2153320735
> >     21  OT DIIS        0.15E+00    1.72        0.5978949130      -518.9935225798
> >     22  OT DIIS        0.15E+00    1.72        0.7366880553      -523.9580140056
> >     23  OT DIIS        0.15E+00    1.71        0.3665273981      -522.5952663063
> >     24  OT DIIS        0.15E+00    1.72        0.6103029589      -525.7386134805
> >     25  OT DIIS        0.15E+00    1.72        0.5928273741      -518.3433269985
> >     26  OT SD          0.15E+00    1.72        0.7915683684      -518.6178872028
> >     27  OT DIIS        0.15E+00    1.71        0.7345116014      -512.5592906776
> >     28  OT SD          0.15E+00    1.72        0.5757627653      -524.7933568324
> >     29  OT DIIS        0.15E+00    1.72        0.5340765779      -510.3937429997
> >     30  OT DIIS        0.15E+00    1.71        0.3588817726      -518.3632287287
> >     31  OT DIIS        0.15E+00    1.72        0.5929348962      -517.3686220959
> >     32  OT SD          0.15E+00    1.72        0.6086867431      -527.8008731379
> >     33  OT DIIS        0.15E+00    1.72        0.6442183785      -519.1398893073
> >     34  OT DIIS        0.15E+00    1.72        0.3082463661      -520.3827365013
> >     35  OT DIIS        0.15E+00    1.71        0.3314197627      -529.5746297456
> >     36  OT SD          0.15E+00    1.72        0.3885674426      -526.7342639127
> >     37  OT DIIS        0.15E+00    1.72        0.4995344780      -525.5986233130
> >     38  OT DIIS        0.15E+00    1.72        0.6351236115      -525.2137191379
> >     39  OT DIIS        0.15E+00    1.71        0.5600541470      -521.4553860108
> >     40  OT SD          0.15E+00    1.72        0.4898312593      -514.6756173144
> >     41  OT SD          0.15E+00    1.72        0.3839304985      -519.7044911599
> >     42  OT SD          0.15E+00    1.73        0.6481227248      -517.4644351710
> >     43  OT DIIS        0.15E+00    1.71        0.6886092282      -520.6408127027
> >     44  OT DIIS        0.15E+00    1.72        0.6312517577      -524.8666830876
> >     45  OT DIIS        0.15E+00    1.72        0.4446104788      -530.0778790639
> >     46  OT DIIS        0.15E+00    1.72        0.6867056507      -521.6599553796
> >     47  OT SD          0.15E+00    1.72        0.6491170077      -526.4536650096
> >     48  OT DIIS        0.15E+00    1.72        0.5138815903      -520.7176635427
> >     49  OT DIIS        0.15E+00    1.71        0.3587928611      -525.4360111551
> >     50  OT DIIS        0.15E+00    1.72        0.5940637552      -532.4076679538
>
> >   *** SCF run NOT converged ***
> > #######################END#########################
>
> > I am new to cp2k, and have no clue what is going on there.
> > Any comment will be appreciated.
>
> > Cheers,
> > Jun
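
In the second SCF cycle quoted above, the total energy jumps upward
between steps 1 and 2, well before the run is reported as not
converged. A short script can locate such a jump automatically. This is
a rough sketch only: the regular expression assumes the unwrapped
one-line-per-step layout of a cp2k output file (not the wrapped lines
of a quoted email), and the function names and the 1.0E-2 Hartree
threshold are hypothetical choices made for illustration.

```python
import re

# One OT iteration line of an unwrapped output file, for example:
#      2  OT DIIS        0.15E+00    1.72        0.3582792984      -549.0562074774
OT_LINE = re.compile(
    r"^\s*(\d+)\s+OT\s+(\w+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(-?\d+\.\d+)\s*$"
)


def parse_ot_steps(text):
    """Return a list of (step, method, convergence, energy) tuples."""
    steps = []
    for line in text.splitlines():
        m = OT_LINE.match(line)
        if m:
            steps.append(
                (int(m.group(1)), m.group(2), float(m.group(5)), float(m.group(6)))
            )
    return steps


def first_energy_rise(steps, threshold=1.0e-2):
    """Step number of the first energy increase above threshold, else None."""
    for prev, cur in zip(steps, steps[1:]):
        if cur[3] - prev[3] > threshold:
            return cur[0]
    return None


# First three steps of the non-converging cycle, copied from the output above.
SAMPLE = """\
     1  OT DIIS        0.15E+00    3.46        0.0003244345      -550.5979524631
     2  OT DIIS        0.15E+00    1.72        0.3582792984      -549.0562074774
     3  OT DIIS        0.15E+00    1.67        0.0666815909      -550.5510068528
"""

steps = parse_ot_steps(SAMPLE)
print(first_energy_rise(steps))  # prints 2
```

Running the same check on the outputs from both machines would show
whether they start to differ at the same step or whether only the XT4
run ever shows such a jump.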

