out of memory error when running normal modes
ada.a.... at gmail.com
Tue Nov 21 16:06:58 UTC 2017
I am trying to run a phonon calculation on a system with 480 atoms, using
96 nodes each with 32 logical cores, an Intel E5-2670. I successfully ran
it using 72 nodes, but the results were not quite satisfactory, so I tried
increasing the PW cutoff from 600 to 800. I used an NPROC_REP of 512 for 64
nodes, which seemed to be the fastest, and 576 for 72 nodes. Using 72 nodes, I
just barely finished in 24 hours, which is my limit for jobs. So with the
extra PW cutoff I knew I needed more nodes to finish the job.
The first time I tried with 96 nodes, I used over 768 for NPROC_REP, and
the job crashed with an out-of-memory error while writing to one of the wfn
restart files. I tried again using 512; it ran for 9 hours and then
crashed with the same error.
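
For reference, the rank arithmetic behind these runs can be sketched as below. This is only a rough sketch: it assumes one MPI rank per logical core (32 per node), which the post does not actually state, and it takes 768 as the value for the first 96-node attempt even though the setting was described as "over 768". The helper name replica_layout is made up for illustration and is not part of CP2K.

```python
# Assumption: one MPI rank per logical core, i.e. 32 ranks per node.
CORES_PER_NODE = 32


def replica_layout(nodes, nproc_rep, cores_per_node=CORES_PER_NODE):
    """Return (total_ranks, replicas, leftover_ranks) for a given NPROC_REP.

    replicas is how many groups of nproc_rep ranks fit into the job;
    leftover_ranks is what remains if the division is not exact.
    """
    total_ranks = nodes * cores_per_node
    replicas, leftover_ranks = divmod(total_ranks, nproc_rep)
    return total_ranks, replicas, leftover_ranks


# (nodes, NPROC_REP) pairs taken from the runs described above;
# 768 is an illustrative stand-in for "over 768".
runs = [(64, 512), (72, 576), (96, 768), (96, 512)]

for nodes, nproc_rep in runs:
    total, replicas, leftover = replica_layout(nodes, nproc_rep)
    print(f"{nodes} nodes, NPROC_REP={nproc_rep}: "
          f"{total} ranks, {replicas} replicas, {leftover} leftover")
```

Under that assumption every run divides evenly, and the 512 setting on 96 nodes gives six replicas rather than four. The numbers only show how the ranks are split; they say nothing about memory use per rank, which would have to be measured.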
Is there a way to know what NPROC_REP to use with this many nodes so
that the memory is not overwhelmed, or could there be a different
problem altogether? I am not sure why using more nodes leads to these
errors when the job runs fine with fewer nodes.
Thank you so much.