[CP2K-user] Job setup: how to run multiple single point calculations efficiently?

Sam Niblett snibl... at gmail.com
Mon Apr 6 16:54:36 UTC 2020

Dear all,

I need to perform a reasonably large number of single-point energy and force
calculations for distinct configurations of a single molecular system
(~1000-10000 in the first instance, but many more later on). I'm using DFT
with a rev-PBE functional, running a pre-compiled CP2K module on a large
supercomputer. I'm on a fairly tight budget of core hours, so minimising
the runtime is my main concern.

Is there a keyword that will instruct CP2K to perform the same operation on
a series of different starting configurations (preferably read from a
single xyz file, for example)? Something along the lines of LAMMPS' rerun
command.

I have looked through the CP2K_INPUT documentation and the best option I
could find is to use FARMING to perform a separate ENERGY_FORCE job on each
starting configuration. This works, but it is undesirable for three reasons:

1) It requires creating a separate input folder for each configuration (not
a big problem, but it's annoying and inelegant given that all the input
except the system coordinates is identical for every job)

2) This method appears to reallocate and reinitialise the functionals and
system details for every set of input data, which is unnecessary overhead
given that those details are the same each time.

3) My starting configurations are similar enough that the converged
electron density of one should be a good starting point for the next. By
analogy with AIMD calculations I have run on the same system, I estimate
that using this information could decrease the cost of the calculation by
up to 80%. But FARMING doesn't know that, so each calculation starts
completely from scratch.
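
The folder bookkeeping in point 1 can at least be scripted. Below is a minimal Python sketch, assuming a standard multi-frame xyz file (an atom-count line, a comment line, then one line per atom, repeated for each frame). The names conf_NNNNN, coords.xyz and job.inp are hypothetical choices for illustration, not CP2K conventions, and the shared input is assumed to point at the coordinates through a relative path (for example via COORD_FILE_NAME in the TOPOLOGY section).

```python
from pathlib import Path


def split_xyz(text):
    """Split the text of a multi-frame xyz file into one string per frame."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        natoms = int(lines[i])
        # A frame is the count line, the comment line, and natoms atom lines.
        frame = lines[i:i + natoms + 2]
        if len(frame) != natoms + 2:
            raise ValueError(f"truncated frame starting at line {i + 1}")
        frames.append("\n".join(frame) + "\n")
        i += natoms + 2
    return frames


def write_job_dirs(xyz_path, template_path, out_root):
    """Create one folder per frame holding coords.xyz and a copy of the shared input."""
    template = Path(template_path).read_text()
    frames = split_xyz(Path(xyz_path).read_text())
    job_dirs = []
    for n, frame in enumerate(frames, start=1):
        job_dir = Path(out_root) / f"conf_{n:05d}"  # hypothetical naming scheme
        job_dir.mkdir(parents=True, exist_ok=True)
        (job_dir / "coords.xyz").write_text(frame)
        (job_dir / "job.inp").write_text(template)  # identical input for every job
        job_dirs.append(job_dir)
    return job_dirs
```

Calling write_job_dirs("configs.xyz", "template.inp", "jobs") would then produce jobs/conf_00001, jobs/conf_00002, and so on. This only removes the manual setup; it does nothing about the repeated initialisation in point 2 or the lost wavefunction guess in point 3.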

The result is that FARMING only gives a small speedup compared to running
separate CP2K jobs for each input point. Does anyone know of a better (i.e.
more efficient) way to set up these calculations?

The only other thing I could think of would be using BAND with 0
optimisation steps and K_SPRING 0, treating each configuration of my input
as a separate replica. I don't know if that would give me the information I
want but if it did then it would fix at least points 1 and 2 of my list
above. If anyone has tried something like that before, please let me know
how you got on.

I'm hoping that there's a straightforward way to perform this task and I
just haven't found it in the documentation. Please point me to the relevant
page if so.

Thanks, and best wishes,

Sam