[CP2K:1687] NaN problem during wavefunction optimization

Teodoro Laino teodor... at gmail.com
Tue Jan 6 11:30:57 UTC 2009


Dear Jun,
The fact that skipping the restart file fixes the problem does not tell 
us much.
It may depend on several things:
(0) Bug in the code
(1) Numerical Instabilities in your library
(2) Corrupted restart file
(3) Change of basis set between the restart file and the new input file.
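
To help narrow down causes like these, it is useful to know exactly at which SCF step the first NaN appears. The following is a minimal sketch, not part of CP2K: the function name first_nan_step is hypothetical, and it assumes the log has one SCF step per line in the "step, OT, minimizer, stepsize, time, convergence, energy" layout shown in the quoted output (unwrapped, as CP2K writes it to file).

```python
import re

# One SCF step per line, e.g.
#   "    15  OT DIIS   0.15E+00    4.08     NaN     -5565.2839809866"
# Group 1 is the step number, group 2 the remaining columns.
STEP_RE = re.compile(r"^\s*(\d+)\s+OT\s+\S+\s+(.*)$")


def first_nan_step(log_text):
    """Return the first SCF step whose line contains a NaN token, or None.

    Hypothetical helper: assumes an unwrapped log with one step per line.
    """
    for line in log_text.splitlines():
        # Drop mail-quote markers ("> ") if the log was pasted from an email.
        line = line.lstrip("> ")
        match = STEP_RE.match(line)
        if match and "nan" in match.group(2).lower().split():
            return int(match.group(1))
    return None


sample = """
    14  OT DIIS        0.15E+00    5.36        0.0000062234     -5565.2836026369
    15  OT DIIS        0.15E+00    4.08                 NaN     -5565.2839809866
    16  OT DIIS        0.15E+00    3.70                 NaN                 NaN
"""
print(first_nan_step(sample))
```

In the output quoted below, the convergence column becomes NaN at step 15 while the total energy is still finite, and the energy follows at step 16.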

Moreover, your input file is not friendly for debugging (it takes too 
much time to run!).
Could you please reproduce the error with a small system, so that we 
don't spend the
whole of 2009 just looking for flies in the code?

Thanks,
Teo

Jun wrote:
> Dear all,
>
> I hope you had a happy Christmas and New Year holiday. Reluctant to go
> back to work? Let me warm you up.
> I am doing ENERGY_FORCE calculations on a series of snapshots taken
> from an MD trajectory (one snapshot every 20 MD steps). I use a
> script to loop over them. Each calculation uses the wavefunction
> restart file generated by the previous one to save some time. From time
> to time a calculation goes wrong; the relevant part of the output is as
> follows:
>
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>  SCF WAVEFUNCTION OPTIMIZATION
>
>   Step  Update method              Time         Convergence         Total energy
>  -----------------------------------------------------------------------------
>
>   ----------------------------------- OT --------------------------------------
>
>   Allowing for rotations:  F
>   Optimizing orbital energies:  F
>   Minimizer      : DIIS                : direct inversion
>                                          in the iterative subspace
>                             using      : -   7 DIIS vectors
>                                          - safer DIIS on
>   Preconditioner : FULL_SINGLE_INVERSE : cholesky inversion of H + eS
>   Precond_solver : DEFAULT
>   stepsize       :    0.15000000
>   energy_gap     :    0.20000000
>   eps_taylor     :   0.10000E-15
>   max_taylor     :             4
>
>   ----------------------------------- OT --------------------------------------
>      1  OT DIIS        0.15E+00    9.53        0.0005010983     -5564.3429148492
>      2  OT DIIS        0.15E+00    5.87        0.0004117778     -5564.5858329292
>      3  OT DIIS        0.15E+00    5.36        0.0001749535     -5565.0926583628
>      4  OT DIIS        0.15E+00    5.35        0.0001079888     -5565.1916525889
>      5  OT DIIS        0.15E+00    5.38        0.0000649429     -5565.2419192651
>      6  OT DIIS        0.15E+00    5.39        0.0000505161     -5565.2547994107
>      7  OT DIIS        0.15E+00    5.39        0.0000358583     -5565.2680981005
>      8  OT DIIS        0.15E+00    5.37        0.0000242444     -5565.2759521645
>      9  OT DIIS        0.15E+00    5.77        0.0000190079     -5565.2788928462
>     10  OT DIIS        0.15E+00    5.36        0.0000140638     -5565.2811183985
>     11  OT DIIS        0.15E+00    5.35        0.0000103841     -5565.2824228584
>     12  OT DIIS        0.15E+00    5.40        0.0000080827     -5565.2830846010
>     13  OT DIIS        0.15E+00    5.39        0.0000069870     -5565.2833844929
>     14  OT DIIS        0.15E+00    5.36        0.0000062234     -5565.2836026369
>     15  OT DIIS        0.15E+00    4.08                 NaN     -5565.2839809866
>     16  OT DIIS        0.15E+00    3.70                 NaN                 NaN
>     17  OT DIIS        0.15E+00    4.12                 NaN                 NaN
>     18  OT DIIS        0.15E+00    3.73                 NaN                 NaN
>     19  OT DIIS        0.15E+00    3.70                 NaN                 NaN
>     20  OT DIIS        0.15E+00    4.13                 NaN                 NaN
>     21  OT DIIS        0.15E+00    4.08                 NaN                 NaN
>     22  OT DIIS        0.15E+00    4.15                 NaN                 NaN
>     23  OT DIIS        0.15E+00    3.69                 NaN                 NaN
>     24  OT DIIS        0.15E+00    4.12                 NaN                 NaN
>     25  OT DIIS        0.15E+00    3.70                 NaN                 NaN
>     26  OT DIIS        0.15E+00    4.09                 NaN                 NaN
>     27  OT DIIS        0.15E+00    4.15                 NaN                 NaN
>     28  OT DIIS        0.15E+00    3.69                 NaN                 NaN
>     29  OT DIIS        0.15E+00    3.72                 NaN                 NaN
>     30  OT DIIS        0.15E+00    3.71                 NaN                 NaN
>     31  OT DIIS        0.15E+00    3.72                 NaN                 NaN
>     32  OT DIIS        0.15E+00    3.73                 NaN                 NaN
>     33  OT DIIS        0.15E+00    3.74                 NaN                 NaN
>     34  OT DIIS        0.15E+00    3.71                 NaN                 NaN
>     35  OT DIIS        0.15E+00    4.15                 NaN                 NaN
>     36  OT DIIS        0.15E+00    3.72                 NaN                 NaN
>     37  OT DIIS        0.15E+00    3.71                 NaN                 NaN
>     38  OT DIIS        0.15E+00    4.12                 NaN                 NaN
>     39  OT DIIS        0.15E+00    3.71                 NaN                 NaN
>     40  OT DIIS        0.15E+00    3.71                 NaN                 NaN
>     41  OT DIIS        0.15E+00    3.70                 NaN                 NaN
>     42  OT DIIS        0.15E+00    3.71                 NaN                 NaN
>     43  OT DIIS        0.15E+00    3.72                 NaN                 NaN
>     44  OT DIIS        0.15E+00    3.76                 NaN                 NaN
>     45  OT DIIS        0.15E+00    3.74                 NaN                 NaN
>     46  OT DIIS        0.15E+00    4.16                 NaN                 NaN
>     47  OT DIIS        0.15E+00    4.62                 NaN                 NaN
>     48  OT DIIS        0.15E+00    3.73                 NaN                 NaN
>     49  OT DIIS        0.15E+00    4.19                 NaN                 NaN
>     50  OT DIIS        0.15E+00    4.17                 NaN                 NaN
>     51  OT DIIS        0.15E+00    4.14                 NaN                 NaN
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>
> When this happens, I delete the wavefunction restart file and the job
> runs well again.
> I guess the problem might be an unsuitable wavefunction restart
> file: it comes from the snapshot 20 MD steps earlier and may not be an
> appropriate initial guess for the current one. If I don't use the
> restart file, a normal calculation takes several times longer.
> Is there any way to get around the problem?
> You can find the relevant files packed in err.tgz.
> Many thanks in advance.
>
> Jun
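
The workaround described in the quoted message (rerun without the restart file when the SCF produces NaN) could be automated roughly as follows. This is only a sketch under stated assumptions: run_job is a hypothetical caller-supplied function that launches one calculation and writes the output file, and the file paths are placeholders. It sets the suspect restart file aside instead of deleting it, so the failing case can still be reproduced; it works around the symptom and does not identify which of the possible causes is responsible.

```python
import os


def scf_has_nan(output_path):
    """True if any whitespace-separated token in the output file is NaN."""
    with open(output_path) as handle:
        return any(
            token.lower() == "nan" for line in handle for token in line.split()
        )


def run_with_restart_fallback(run_job, restart_path, output_path):
    """Run once; if the output contains NaN, set the restart aside and rerun.

    run_job is a hypothetical callable that launches the calculation and
    writes output_path. Returns True if a rerun was needed.
    """
    run_job()
    if not scf_has_nan(output_path):
        return False
    if os.path.exists(restart_path):
        # Keep the suspect file for debugging instead of deleting it.
        os.replace(restart_path, restart_path + ".bad")
    run_job()
    return True
```

In a snapshot loop, run_job would wrap whatever command the script already uses to start each calculation; the files renamed with the ".bad" suffix are then the candidates for reproducing the error on a smaller system.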