out of memory error when running normal modes

Ada Sedova ada.a.... at gmail.com
Tue Nov 21 16:06:58 UTC 2017


Hi,

I am trying to run a phonon calculation on a system with 480 atoms, using 
98 nodes, each with 32 logical cores (Intel E5-2670). I successfully ran 
it using 72 nodes, but the results were not quite satisfactory, so I tried 
increasing the PW cutoff from 600 to 800. I used NPROC_REP 512 on 64 
nodes, which seemed to be fastest, and 576 on 72 nodes. Using 72 nodes, I 
just barely finished in 24 hours, which is my limit for jobs. So with the 
higher PW cutoff I knew I needed more nodes to finish the job.

The first time I tried with 96 nodes, I used over 768 for NPROC_REP, and 
the job crashed with an out-of-memory error while writing to one of the wfn 
restart files. I tried again using 512; it ran for 9 hours and then 
crashed with the same error.

Is there a way to know what NPROC_REP to use with this many nodes so 
that the memory is not overwhelmed, or could there be a different 
problem? I am not sure why more nodes lead to these errors when 
the job runs fine with fewer nodes.
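
For reference, the replica counts implied by the settings above can be worked out with a short sketch. It assumes one MPI rank per logical core (32 per node), which is not stated above, and that NPROC_REP is the number of ranks assigned to each replica. The function name and returned fields are illustrative only, and 768 stands in for the "over 768" run. It says nothing about actual memory use; it only shows how the ranks divide up.

```python
def replica_layout(nodes, ranks_per_node, nproc_rep):
    """Rough rank bookkeeping for a replica-parallel run.

    Assumption: one MPI rank per logical core, and nproc_rep ranks
    per replica. This is arithmetic only, not a memory estimate.
    """
    total_ranks = nodes * ranks_per_node
    # Whole replicas that fit, plus ranks that do not fill a replica.
    # How leftover ranks are treated is not covered here.
    replicas, leftover_ranks = divmod(total_ranks, nproc_rep)
    return {
        "total_ranks": total_ranks,
        "replicas": replicas,
        "leftover_ranks": leftover_ranks,
        "nodes_per_replica": nproc_rep / ranks_per_node,
    }


# Node counts and NPROC_REP values taken from the runs described above.
for nodes, nproc_rep in [(64, 512), (72, 576), (96, 768), (96, 512)]:
    layout = replica_layout(nodes, 32, nproc_rep)
    print(nodes, nproc_rep, layout)
```

Under these assumptions, the 64- and 72-node runs and the 96-node run with 768 each give 4 replicas, while 96 nodes with 512 gives 6 replicas of 16 nodes each.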

Thank you so much.
