[CP2K-user] Job setup: how to run multiple single point calculations efficiently?

Sam Niblett snibl... at gmail.com
Mon Apr 6 16:54:36 UTC 2020


Dear all,

I need to perform a reasonably large number of energy+force single point
calculations for distinct configurations of a single molecular system
(roughly 1,000 to 10,000 in the first instance, with many more later on).
I'm using DFT with the revPBE functional, running a pre-compiled CP2K
module on a large supercomputer. I'm on a fairly tight budget of core
hours, so minimising the total runtime is my main concern.

Is there a keyword that will instruct CP2K to perform the same operation on 
a series of different starting configurations (preferably read from a 
single xyz file, for example)? Something along the lines of LAMMPS' rerun 
command. 



I have looked through the CP2K_INPUT documentation and the best option I 
could find is to use FARMING to perform a separate ENERGY_FORCE job on each 
starting configuration. This works, but it is undesirable for three reasons:

1) It requires creating a separate input folder for each configuration (not 
a big problem, but it's annoying and inelegant given that all the input 
except the system coordinates is identical for every job)

2) This method appears to reallocate and reinitialise the functionals and 
system details for every set of input data, which is unnecessary overhead 
given that those details are the same each time.

3) My starting configurations are similar enough that the converged 
electron density of one should be a good starting point for the next. By 
analogy with AIMD calculations I have run on the same system, I estimate 
that using this information could decrease the cost of the calculation by 
up to 80%. But FARMING doesn't know that, so each calculation starts 
completely from scratch.
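
For reference, the per-configuration setup in point 1 can at least be
scripted. Below is a minimal sketch in Python, not part of CP2K itself,
that splits a multi-frame xyz file into one folder per configuration and
copies the same input file into each. The names conf_NNNNN, coords.xyz and
job.inp are hypothetical choices for illustration, and the sketch assumes
the shared input template refers to the coordinates by the relative file
name coords.xyz. It only removes the manual work; it does nothing about
points 2 and 3.

```python
from pathlib import Path


def split_xyz(text):
    """Split the text of a multi-frame xyz file into single-frame strings.

    Each frame is assumed to be: one line with the atom count, one comment
    line (possibly empty), then one line per atom.
    """
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        natoms = int(lines[i].split()[0])
        block = lines[i:i + natoms + 2]
        if len(block) != natoms + 2:
            raise ValueError(f"truncated frame starting at line {i + 1}")
        frames.append("\n".join(block) + "\n")
        i += natoms + 2
    return frames


def write_job_dirs(xyz_text, template_text, root):
    """Create root/conf_00000, root/conf_00001, ... (hypothetical names).

    Each folder receives its own coords.xyz and an identical copy of the
    input template as job.inp. Returns the list of created folders.
    """
    root = Path(root)
    dirs = []
    for n, frame in enumerate(split_xyz(xyz_text)):
        d = root / f"conf_{n:05d}"
        d.mkdir(parents=True, exist_ok=True)
        (d / "coords.xyz").write_text(frame)
        (d / "job.inp").write_text(template_text)
        dirs.append(d)
    return dirs
```

Typical use would be something like
write_job_dirs(Path("all_configs.xyz").read_text(),
Path("template.inp").read_text(), "jobs"), where both file names are again
placeholders.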

The result is that FARMING only gives a small speedup compared to running 
separate CP2K jobs for each input point. Does anyone know of a better (i.e. 
more efficient) way to set up these calculations? 



The only other thing I could think of would be using BAND with 0
optimisation steps and K_SPRING 0, treating each configuration of my input
as a separate replica. I don't know if that would give me the energies and
forces I want, but if it did then it would fix at least points 1 and 2 of
my list above. If anyone has tried something like that before, please let
me know how you got on.

I'm hoping that there's a straightforward way to perform this task and I 
just haven't found it in the documentation. Please point me to the relevant 
page if so.

Thanks, and best wishes,

Sam