[CP2K-user] massive slowdown in SCF timings during MD
Sean Fischer
safisc... at gmail.com
Thu Oct 17 11:09:49 UTC 2019
The issue I was having appears to be tied to the version of OpenMPI that I
was using. I'm not sure if it is just my particular build of OpenMPI
v1.10.2 or if there is a bug somewhere, but moving to more recent versions
of OpenMPI appears to have eliminated the increasing time per step.
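For anyone wanting to check for the same symptom, here is a minimal sketch of how the per-iteration SCF times could be pulled out of the CP2K output and averaged per SCF cycle. It assumes the OT iteration lines have the column layout shown in the quoted output further down (step, method, stepsize, time, convergence, total energy, change). The function name `scf_times_per_cycle` and the regular expression are my own illustration, not part of CP2K.

```python
import re
from statistics import mean

# One OT SCF iteration line, assumed layout:
#   step  "OT"  method  stepsize  time(s)  convergence  total_energy  change
SCF_LINE = re.compile(
    r"^\s*(\d+)\s+OT\s+\S+\s+(\S+)\s+(\d+\.\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$"
)


def scf_times_per_cycle(log_text):
    """Return one list of iteration times (seconds) per converged SCF cycle."""
    cycles, current = [], []
    for line in log_text.splitlines():
        match = SCF_LINE.match(line)
        if match:
            # Group 3 is the "Time" column.
            current.append(float(match.group(3)))
        elif "SCF run converged" in line:
            # End of one SCF cycle: store its times and start a new list.
            cycles.append(current)
            current = []
    return cycles


# Iteration lines copied from the two excerpts quoted in this thread.
SAMPLE = """
     1 OT DIIS     0.15E+00    2.3     0.00030148     -1107.9047914512 -1.11E+03
     2 OT DIIS     0.15E+00    1.6     0.00012826     -1107.9089080136 -4.12E-03
     3 OT DIIS     0.15E+00    1.6     0.00007994     -1107.9095257164 -6.18E-04
     4 OT DIIS     0.15E+00    1.6     0.00000744     -1107.9096948950 -1.69E-04
     5 OT DIIS     0.15E+00    2.0     0.00000377     -1107.9096964458 -1.55E-06
     6 OT DIIS     0.15E+00    1.6     0.00000067     -1107.9096967653 -3.19E-07
  *** SCF run converged in 6 steps ***
     1 OT DIIS     0.15E+00    6.3     0.00003184     -1108.0316770158 -1.11E+03
     2 OT DIIS     0.15E+00   17.0     0.00001599     -1108.0317180830 -4.11E-05
     3 OT DIIS     0.15E+00   18.4     0.00000956     -1108.0317247395 -6.66E-06
     4 OT DIIS     0.15E+00   21.9     0.00000119     -1108.0317273970 -2.66E-06
     5 OT DIIS     0.15E+00    6.8     0.00000063     -1108.0317274283 -3.14E-08
  *** SCF run converged in 5 steps ***
"""

cycles = scf_times_per_cycle(SAMPLE)
for index, times in enumerate(cycles):
    print(f"cycle {index}: {len(times)} iterations, "
          f"mean {mean(times):.2f} s per iteration")
```

For the two excerpts in this thread, the average works out to roughly 1.8 s per iteration early in the run and roughly 14.1 s after ~2500 steps, while the iteration count barely changes. Plotting the per-cycle mean against MD step should show whether the upward trend is still there after switching MPI versions.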
On Friday, October 11, 2019 at 1:28:21 PM UTC-4, Sean Fischer wrote:
>
> Hi Matt,
>
> I agree that some of the behavior does look like my resources are being
> oversubscribed, but I am sure that I have dedicated access to the nodes I
> am using, and I see nothing unusual in top on those nodes. I have
> another test that I ran where I did not get the large fluctuations but
> still have the linear trend, though this one was stable for longer before
> the increase. See below, and the input file for that run is attached (same
> initial geometry but some changes to the other settings).
>
> [image: md_timings2.png]
>
> Best,
> Sean
>
> On Friday, October 11, 2019 at 12:10:01 PM UTC-4, Matt W wrote:
>>
>> Hi Sean,
>>
>> are you sure you have dedicated access to the resource you run on? Those
>> large fluctuations look likely to be caused by other jobs running on the
>> node(s) you are using (though the general linearish trend is surprising).
>>
>> Matt
>>
>> On Friday, October 11, 2019 at 1:41:03 PM UTC+1, Sean Fischer wrote:
>>>
>>> I'm attempting to run MD using QS on a 64 molecule water box. So far I
>>> have been unable to achieve consistent performance across the MD
>>> trajectory. After a while the time per MD step dramatically increases, but
>>> the number of SCF iterations per step stays basically constant, as you can
>>> see in the included image.
>>>
>>> [image: md_timing.png]
>>>
>>> To further illustrate this, the following is from one of the first steps
>>> in the trajectory
>>>
>>>
>>>  ----------------------------------- OT ---------------------------------------
>>>
>>>  Minimizer      : DIIS                : direct inversion
>>>                                         in the iterative subspace
>>>                                         using 7 DIIS vectors
>>>                                         safer DIIS on
>>>  Preconditioner : FULL_ALL            : diagonalization, state selective
>>>  Precond_solver : DEFAULT
>>>  stepsize       :    0.15000000         energy_gap     :    0.08000000
>>>  eps_taylor     :   0.10000E-15         max_taylor     :             4
>>>
>>>  ----------------------------------- OT ---------------------------------------
>>>
>>>  Step     Update method      Time    Convergence         Total energy    Change
>>>  ------------------------------------------------------------------------------
>>>      1 OT DIIS     0.15E+00    2.3     0.00030148     -1107.9047914512 -1.11E+03
>>>      2 OT DIIS     0.15E+00    1.6     0.00012826     -1107.9089080136 -4.12E-03
>>>      3 OT DIIS     0.15E+00    1.6     0.00007994     -1107.9095257164 -6.18E-04
>>>      4 OT DIIS     0.15E+00    1.6     0.00000744     -1107.9096948950 -1.69E-04
>>>      5 OT DIIS     0.15E+00    2.0     0.00000377     -1107.9096964458 -1.55E-06
>>>      6 OT DIIS     0.15E+00    1.6     0.00000067     -1107.9096967653 -3.19E-07
>>>
>>>  *** SCF run converged in 6 steps ***
>>>
>>>  Electronic density on regular grids:       -511.9999993695        0.0000006305
>>>  Core density on regular grids:              511.9999999846       -0.0000000154
>>>  Total charge density on r-space grids:        0.0000006151
>>>  Total charge density g-space grids:           0.0000006151
>>>
>>>
>>> While after ~2500 steps, the same section now looks like
>>>
>>>
>>>  ----------------------------------- OT ---------------------------------------
>>>
>>>  Minimizer      : DIIS                : direct inversion
>>>                                         in the iterative subspace
>>>                                         using 7 DIIS vectors
>>>                                         safer DIIS on
>>>  Preconditioner : FULL_ALL            : diagonalization, state selective
>>>  Precond_solver : DEFAULT
>>>  stepsize       :    0.15000000         energy_gap     :    0.08000000
>>>  eps_taylor     :   0.10000E-15         max_taylor     :             4
>>>
>>>  ----------------------------------- OT ---------------------------------------
>>>
>>>  Step     Update method      Time    Convergence         Total energy    Change
>>>  ------------------------------------------------------------------------------
>>>      1 OT DIIS     0.15E+00    6.3     0.00003184     -1108.0316770158 -1.11E+03
>>>      2 OT DIIS     0.15E+00   17.0     0.00001599     -1108.0317180830 -4.11E-05
>>>      3 OT DIIS     0.15E+00   18.4     0.00000956     -1108.0317247395 -6.66E-06
>>>      4 OT DIIS     0.15E+00   21.9     0.00000119     -1108.0317273970 -2.66E-06
>>>      5 OT DIIS     0.15E+00    6.8     0.00000063     -1108.0317274283 -3.14E-08
>>>
>>>  *** SCF run converged in 5 steps ***
>>>
>>>  Electronic density on regular grids:       -511.9999994472        0.0000005528
>>>  Core density on regular grids:              511.9999999845       -0.0000000155
>>>  Total charge density on r-space grids:        0.0000005373
>>>  Total charge density g-space grids:           0.0000005373
>>>
>>>
>>>
>>> This has happened with different functionals, pseudopotentials, and SCF
>>> parameters (PS vs ASPC extrapolation, different orders of extrapolation, OT
>>> DIIS vs DIIS/Diag., different preconditioners). I recompiled the whole code
>>> with the toolchain install script generating all dependencies and got the
>>> same result. There is no indication from the dynamics of the system itself
>>> as to what the problem might be as the potential and kinetic energies are
>>> well behaved, and the constant of motion for the dynamics is reasonably
>>> conserved. I've watched the resulting trajectory itself, and there is
>>> nothing unusual happening (no bond breaking, etc.). The input file I
>>> used for the run in the plot above is attached. The starting geometry for the
>>> water molecules was taken from the 64 molecule benchmark in the tests/QS
>>> directory.
>>>
>>>
>>> If anyone has any suggestions on possible causes or where I can look for
>>> more information, I would be very appreciative.
>>>
>>