<div dir="ltr">Thanks for your advices!<div><br></div><div>Now I can run it at least, It is so slow but run. The difference between the little and big cluster was that, in the little one, the total RAM consumption is practically MPI_PROCESS*(<span style="color:rgba(0,0,0,0.87);font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;font-size:14px">Baseline + MAX_MEMORY + 2 full matrices), as Prof. Hutter explains, but in the big one, </span> there are some cluster process which consume 5 or 10% of each nodes... then I had to optimize the MAX_MEMORY doing some test...</div><div><br></div><div>About the ERIs, it is so difficult to have 7 TB for them... I can take 4 TB without problem... But to take the whole cluster section is difficult. I try using the SCREENING option to speed it up, taking some ERIs on the fly.</div><div><br></div><div>Regards - Lucas Lodeiro</div><div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">El lun, 23 nov 2020 a las 14:18, <a href="mailto:fa...@gmail.com">fa...@gmail.com</a> (<<a href="mailto:fabia...@gmail.com">fabia...@gmail.com</a>>) escribió:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Your graph nicely shows that cp2k runs out of memory. As Matt wrote, you have to decrease
<div><br></div><div>Regards - Lucas Lodeiro</div><div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 23 Nov 2020 at 14:18, <a href="mailto:fa...@gmail.com">fa...@gmail.com</a> (<<a href="mailto:fabia...@gmail.com">fabia...@gmail.com</a>>) wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Your graph nicely shows that cp2k runs out of memory. As Matt wrote, you have to decrease
MAX_MEMORY to allow enough memory for the rest of the program. Here are some details on memory consumption with HF: <a href="https://groups.google.com/g/cp2k/c/DZDVTIORyVY/m/OGjJDJuqBwAJ" target="_blank">https://groups.google.com/g/cp2k/c/DZDVTIORyVY/m/OGjJDJuqBwAJ</a><br></div><div><br></div><div>Of course you can recalculate some of the ERIs in each SCF cycle, but that slows down the minimization by a lot; I'd advise against it. Try to use screening, set a proper value for MAX_MEMORY, and use all the resources you have to store the ERIs.</div><div><br></div><div>Fabian<br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Sunday, 22 November 2020 at 23:08:17 UTC+1 Lucas Lodeiro wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi Fabian and Matt,<div><br></div><div>About access to the memory: for months I have run calculations using 90% of the node RAM without problems. But to check, I set ulimit -s unlimited. There is a change: before using ulimit, the calculation crashed while RAM usage was still very low (15%); after using ulimit, the calculation still crashes, but RAM usage rises steadily up to the limit and then the calculation crashes. I attach an image.</div><div><br></div><div>About SCREEN_ON_INITIAL_P, I will use it on the small cluster. I like the idea of running two calculations as stepping stones.<br></div><div><br></div><div>I know that the number of ERIs calculated on the fly should be 0, and that if it is different from zero I need more RAM to store them so that they are not recalculated at each SCF step. But in the case of the small cluster I am already using all the processor and RAM resources. By the way, the calculation runs without problems even when the ERIs are calculated on the fly at each SCF step; it is just very slow.</div><div><br></div><div>About what Matt comments: on the small cluster I have a single node with 250 GB RAM. There I use MAX_MEMORY = 2600, which is a total of 166.4 GB for the ERIs (the output reports 143 GB), and the rest for the whole program. </div><div>In the case of the big cluster, we have access to many nodes with 44 proc and 192 GB RAM, and 9 nodes with 44 proc and 768 GB RAM. In the first case, I use 5 nodes (220 proc) using all the memory (960 GB), setting MAX_MEMORY = 4000 (4.0 GB * 220 proc = 880 GB RAM for ERIs). In the second case, I use 5 nodes (220 proc) using all the memory (3840 GB), setting MAX_MEMORY = 15000 (15.0 GB * 220 proc = 3300 GB RAM for ERIs).</div><div>In both cases the calculation crashes... I may be naive, but 3.3 TB of RAM seems, at the very least, enough to store a large share of the ERIs...</div><div><br></div><div>Using the data reported in the output of the small cluster:</div></div><div dir="ltr"><div> HFX_MEM_INFO| Number of sph. ERI's calculated: 4879985997918<br> HFX_MEM_INFO| Number of sph. ERI's stored in-core: 116452577779<br> HFX_MEM_INFO| Number of sph. ERI's stored on disk: 0<br> HFX_MEM_INFO| Number of sph. ERI's calculated on the fly: 4763533420139<br></div><div><br></div></div><div dir="ltr"><div>The stored ERIs are about 1/42 of the total ERIs and use 166.4 GB (143 GB reported)... So if I want to store all of them, I need 166.4 GB * 42 = ~7.0 TB... Is that correct?</div><div>I can get 7.0 TB of RAM using 9 nodes with 768 GB RAM each; a rough estimate of what that would look like is below.
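</div><div><br></div><div>Just as a back-of-the-envelope check, the &MEMORY block inside &HF would then look roughly like this (assuming 44 MPI ranks per node and ignoring the baseline memory, so these are only rough estimates):</div><div><br></div><div> &MEMORY</div><div> ! 9 nodes * 44 ranks = 396 MPI ranks</div><div> ! 396 ranks * 16000 MiB is roughly 6.0 TiB reserved for in-core ERIs</div><div> ! per node: 44 * 16000 MiB is roughly 688 GiB, leaving only a few tens of GiB</div><div> ! for the baseline, the density/Fock matrices and the OS, so the real value</div><div> ! would probably have to be somewhat smaller</div><div> MAX_MEMORY 16000</div><div> EPS_STORAGE_SCALING 0.1</div><div> &END MEMORY</div><div><br></div><div>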
But I am not so clear about the idea that the amount of RAM is the problem, because on the small cluster it runs, calculating almost all ERIs at each SCF step...</div><div><br></div><div>I am a little surprised that the calculation runs on the small cluster but not on the big one.</div><div>Do you have a guess at some other related problem?</div><div><br></div><div>Regards - Lucas</div><div><br></div><div><br></div></div><br><div class="gmail_quote"></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, 22 Nov 2020 at 13:55, Matt W (<<a rel="nofollow">mat...@gmail.com</a>>) wrote:<br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Your input has<div><br></div><div><div> &MEMORY</div><div> MAX_MEMORY 4000</div><div> EPS_STORAGE_SCALING 0.1</div><div> &END MEMORY</div><div><br></div><div>This means that each MPI task (which can be multiple cores) should be able to allocate 4 GiB of memory _exclusively_ for the two-electron integrals. If less than that is available, it will crash because the memory allocation cannot occur. I guess your main cluster has less memory than the smaller one. You need to leave space for the operating system and the rest of the cp2k run besides the two-electron integrals.</div><div><br></div><div>There is another thread from earlier this year where Juerg answers HFX memory questions in more detail.</div><div><br></div><div>Matt</div><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Sunday, November 22, 2020 at 4:42:47 PM UTC <a rel="nofollow">fa...@gmail.com</a> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Can cp2k access all the memory on the cluster? On Linux you can use <br></div><div>ulimit -s unlimited</div><div>to remove any limit on the amount of memory a process can use.</div><div><br></div><div>I usually use
SCREEN_ON_INITIAL_P. I found that for large systems it is faster to run two energy minimizations with the keyword enabled (such that the second restarts from a converged PBE0 wfn) than to run a single minimization without SCREEN_ON_INITIAL_P; a rough sketch of the restart settings for the second run is shown below. But that probably depends on the system you simulate.</div><div><br></div><div>You should converge the cutoff with respect to the properties you are interested in. Run a test system with increasing cutoff and look at, e.g., the energy, PDOS, etc.<br>
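</div><div><br></div><div>Schematically, the second run would then contain something like this (only a sketch, not a tested input; the wfn file name is a placeholder):</div><div><br></div><div> &DFT</div><div> ! restart from the wavefunction of the first, converged PBE0 run</div><div> WFN_RESTART_FILE_NAME first_run-RESTART.wfn</div><div> &SCF</div><div> SCF_GUESS RESTART</div><div> &END SCF</div><div> &XC</div><div> &HF</div><div> &SCREENING</div><div> ! screen on the converged density from the first run</div><div> SCREEN_ON_INITIAL_P TRUE</div><div> &END SCREENING</div><div> &END HF</div><div> &END XC</div><div> &END DFT</div><div><br>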
</div><div><br></div><div>
Number of sph. ERI's calculated on the fly: 4763533420139 <br></div><div>This number should always be 0. If it is larger, increase the memory cp2k has available.<br></div><div><br></div><div>Fabian<br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Sunday, 22 November 2020 at 17:24:13 UTC+1 Lucas Lodeiro wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Dear Fabian,<div><br></div><div>Thanks for your advice. I forgot to clarify the execution time... my mistake. </div><div>The calculation runs for 5 to 7 minutes and then stops... the walltime for the calculation was set to 72 hrs, so I do not believe this is the problem. Now I am running the same input on a smaller cluster (different from the one with the problematic crash) with 64 proc and 250 GB RAM, and the calculation works fine (very slow, 9 hr per SCF step, but it runs... the total RAM assigned for the ERIs is not sufficient, but the problem does not appear)... It is not practical to use this small cluster, so I need to fix the problem on the big one in order to use more RAM and more processors (more than 220 is possible), but as the program does not show what is happening, I cannot tell the cluster admin anything to recompile or fix. :(</div><div><br></div><div>This is the output on the small cluster:</div><div><br></div><div> Step Update method Time Convergence Total energy Change<br> ------------------------------------------------------------------------------<br><br> HFX_MEM_INFO| Est. max. program size before HFX [MiB]: 1371<br><br> *** WARNING in hfx_energy_potential.F:605 :: The Kohn Sham matrix is not ***</div></div><div dir="ltr"><div><br> *** 100% occupied. This may result in incorrect Hartree-Fock results. Try ***<br> *** to decrease EPS_PGF_ORB and EPS_FILTER_MATRIX in the QS section. For ***<br> *** more information see FAQ: <a href="https://www.cp2k.org/faq:hfx_eps_warning" rel="nofollow" target="_blank">https://www.cp2k.org/faq:hfx_eps_warning</a> ***<br><br></div></div><div dir="ltr"><div> HFX_MEM_INFO| Number of cart. primitive ERI's calculated: 27043173676632<br> HFX_MEM_INFO| Number of sph. ERI's calculated: 4879985997918<br> HFX_MEM_INFO| Number of sph. ERI's stored in-core: 116452577779<br> HFX_MEM_INFO| Number of sph. ERI's stored on disk: 0<br> HFX_MEM_INFO| Number of sph. ERI's calculated on the fly: 4763533420139<br> HFX_MEM_INFO| Total memory consumption ERI's RAM [MiB]: 143042<br> HFX_MEM_INFO| Whereof max-vals [MiB]: 1380<br> HFX_MEM_INFO| Total compression factor ERI's RAM: 6.21<br> HFX_MEM_INFO| Total memory consumption ERI's disk [MiB]: 0<br> HFX_MEM_INFO| Total compression factor ERI's disk: 0.00<br> HFX_MEM_INFO| Size of density/Fock matrix [MiB]: 266<br> HFX_MEM_INFO| Size of buffers [MiB]: 98<br> HFX_MEM_INFO| Number of periodic image cells considered: 5<br> HFX_MEM_INFO| Est. max. program size after HFX [MiB]: 3834<br><br> 1 NoMix/Diag. 0.40E+00 ****** 5.46488333 -20625.2826573514 -2.06E+04<br></div><div><br></div><div>About SCREEN_ON_INITIAL_P, I read that to use it you need a very good guess (better than the converged GGA one), for example the last step or frame from a GEO_OPT or MD... Is it really useful when the guess is the GGA wavefunction?</div><div>About the CUTOFF_RADIUS, I read that 6 or 7 is a good compromise, and as my cell is approximately twice that, I used the minimal image convention to decide on the value 8.62, which is near the recommended one (6 or 7); roughly as in the snippet below. 
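</div><div><br></div><div>For reference, this is roughly how I understand the corresponding block (only a sketch; the data file name is the usual default and may differ in my actual input):</div><div><br></div><div> &INTERACTION_POTENTIAL</div><div> ! truncated Coulomb operator for the HFX part</div><div> POTENTIAL_TYPE TRUNCATED</div><div> ! just under half of the shortest cell vector (minimal image convention)</div><div> CUTOFF_RADIUS 8.62</div><div> T_C_G_DATA t_c_g.dat</div><div> &END INTERACTION_POTENTIAL</div><div><br></div><div>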
If I reduce it, does the computational cost diminish considerably?</div><div><br></div><div>Regards - Lucas</div></div><br><div class="gmail_quote"></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, 22 Nov 2020 at 12:53, <a rel="nofollow">fa...@gmail.com</a> (<<a rel="nofollow">fa...@gmail.com</a>>) wrote:<br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Dear Lucas,</div><div><br></div><div>cp2k computes the four-center integrals during (or prior to) the first SCF cycle.
I assume the job ran out of time during this task. For a system with more than 1000 atoms this takes a lot of time; with only 220 CPUs it could in fact take several hours.</div><div><br></div><div>To speed up the calculation you should use SCREEN_ON_INITIAL_P T and restart from a well-converged PBE wfn. Other than that, there is little you can do besides giving the job more time and/or CPUs. (Of course, reducing CUTOFF_RADIUS 8.62 would help too, but could negatively affect the result.)</div><div><br></div><div>Cheers,</div><div>Fabian<br></div><br><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Sunday, 22 November 2020 at 01:21:05 UTC+1 Lucas Lodeiro wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi all, <div>I need to perform a hybrid calculation with CP2K 7.1 on a big system (1000+ atoms). I studied the manual, the tutorials and some videos by the CP2K developers to improve my input. But the program aborts the calculation while the HF part is running... I watch the memory usage on the fly, and there is no peak that would explain the failure (I used 4000 MB with 220 processors).</div><div>The output does not show any explanation... Suspecting memory, I tried a large-memory node on our cluster, using 15000 MB with 220 processors, but the program exits at the same point without a message, just killing the process. </div><div>The output shows a warning:</div><div><br></div><div> *** WARNING in hfx_energy_potential.F:591 :: The Kohn Sham matrix is not ***<br> *** 100% occupied. This may result in incorrect Hartree-Fock results. Try ***<br> *** to decrease EPS_PGF_ORB and EPS_FILTER_MATRIX in the QS section. For ***<br> *** more information see FAQ: <a href="https://www.cp2k.org/faq:hfx_eps_warning" rel="nofollow" target="_blank">https://www.cp2k.org/faq:hfx_eps_warning</a> ***<br></div><div><br></div><div>but I read that this is not a very serious issue, and the calculation should continue and not crash.</div><div>I also decreased EPS_PGF_ORB, but the warning and the problem persist. </div><div><br></div><div>I do not know whether the problem could be located in other parts of my input... for example, I use PBE0-T_C-LR (with PBC in XY) and ADMM. In the ADMM options I use ADMM_PURIFICATION_METHOD = NONE, because I read that ADMM1 is the only variant useful for smearing calculations. </div><div><br></div><div>I ran this system with PBE (as the first guess for PBE0), and there is no problem in that case.</div><div>Moreover, I tried other CP2K versions (7.0, 6.1 and 5.1) compiled on the cluster with libint_max_am=6, and the calculation crashes but shows this message:</div><div><br></div><div> *******************************************************************************<br> * ___ *<br> * / \ *<br> * [ABORT] *<br> * \___/ CP2K and libint were compiled with different LIBINT_MAX_AM. *<br> * | *<br> * O/| *<br> * /| | *<br> * / \ hfx_libint_wrapper.F:134 *<br> *******************************************************************************<br><br><br> ===== Routine Calling Stack ===== <br><br> 2 hfx_create<br> 1 CP2K<br></div><div><br></div><div>It seems this problem is not present in version 7.1, as the program does not show it, and the compilation information does not show the LIBINT_MAX_AM value...</div><div><br></div><div>If somebody could give me some advice, I will appreciate it. 
:)</div><div>I attach the input file and the output file for the 7.1 version.</div><div><br></div><div>Regards - Lucas Lodeiro</div><div><br></div></div>
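<div><br></div><div dir="ltr"><div>P.S. For quick reference, the HFX-related blocks of the input look roughly like this (abridged and simplified from the attached file; both blocks sit inside &FORCE_EVAL / &DFT, and unrelated keywords are omitted):</div><div><br></div><div> &AUXILIARY_DENSITY_MATRIX_METHOD</div><div> ! no purification, as required for the smearing setup</div><div> ADMM_PURIFICATION_METHOD NONE</div><div> &END AUXILIARY_DENSITY_MATRIX_METHOD</div><div> &XC</div><div> &HF</div><div> ! 25% exact exchange (PBE0)</div><div> FRACTION 0.25</div><div> ! truncated Coulomb potential with CUTOFF_RADIUS 8.62 goes here</div><div> &MEMORY</div><div> MAX_MEMORY 4000</div><div> EPS_STORAGE_SCALING 0.1</div><div> &END MEMORY</div><div> &END HF</div><div> &END XC</div></div>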
</blockquote></div>
</blockquote></div>
</blockquote></div></blockquote></div>
</blockquote></div>
</blockquote></div>
</blockquote></div>