Hey, I have identified the problem area, but not the exact cause. I was relaxing argon for this example.<div><br></div><div>I found that the issue lies somewhere in the parallelization over kpoints. When I run without mpirun, it always runs okay. When I ran with 128 MPI processes (one per core), it failed. When I run with a Monkhorst-Pack grid, the calculation runs fine with 128 MPI processes, but the automatic grid has many more kpoints. So I reran with the explicit grid but with 4 MPI processes instead, and the calculation proceeds.</div><div><br></div><div>There is some kind of "over"-parallelization problem happening here. Has this been reported in the community at all?</div><div><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Saturday, October 22, 2022 at 10:53:40 AM UTC-7 Matthias Krack wrote:<br/></div><blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">



<div dir="auto">
On Argon?<br>
<br>
<div dir="ltr">Matthias</div>
<div dir="ltr"><br>
<blockquote type="cite">On 22.10.2022 at 18:30, Nicholas Winner <<a href data-email-masked rel="nofollow">nwi...@berkeley.edu</a>> wrote:<br>
<br>
</blockquote>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Matthias, 
</div></blockquote></div><div dir="auto"><blockquote type="cite"><div dir="ltr"><div><br>
</div>
<div>That's very strange. I tried the calculation again on argon, but this did not fix the problem. </div>
<div><br>
</div>
<div>-N<br>
<br>
</div>
<div class="gmail_quote">
<div dir="auto" class="gmail_attr">On Friday, October 21, 2022 at 9:08:29 PM UTC-7 Matthias Krack wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="en-CH" link="blue" vlink="purple" style="word-wrap:break-word">
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">Hello Nick<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">The failures are possibly caused by the keyword WAVEFUNCTION COMPLEX. I observe that the regression test QS/regtest-kp-2/cc2.inp often fails because of
<a href="https://github.com/cp2k/cp2k/issues/2117" rel="nofollow" target="_blank" data-saferedirecturl="https://www.google.com/url?hl=en&q=https://github.com/cp2k/cp2k/issues/2117&source=gmail&ust=1666548655771000&usg=AOvVaw0GYADwFDx86fXLmmvvlV26">
this issue</a> related to zgemv of OpenBLAS.<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">Best<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">Matthias<u></u><u></u></span></p>
</div>
</div>
<div lang="en-CH" link="blue" vlink="purple" style="word-wrap:break-word">
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"><u></u> <u></u></span></p>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-right:0cm;margin-bottom:12.0pt;margin-left:36.0pt">
<b><span style="font-size:12.0pt;color:black">From: </span></b><span style="font-size:12.0pt;color:black"><a rel="nofollow">cp...@googlegroups.com</a> <<a rel="nofollow">cp...@googlegroups.com</a>> on
 behalf of Nicholas Winner <<a rel="nofollow">nwi...@berkeley.edu</a>><br>
<b>Date: </b>Saturday, 22 October 2022 at 01:17<br>
<b>To: </b>cp2k <<a rel="nofollow">cp...@googlegroups.com</a>><br>
<b>Subject: </b>[CP2K:17924] CP2K Hangs for CellOpt+kpoints in certain systems.<u></u><u></u></span></p>
</div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">Hello all,<u></u><u></u></span></p>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">I'm doing some cell optimizations using kpoints with CP2K. I have never had a problem with the geo_opt module, but cell_opt is hanging very often for a number of systems. <u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">Usually, the calculation will proceed for 1-3 optimization steps, and then it will hang at the start of a new SCF loop. I have found that sometimes the behavior is fixed by using
 direct_p_mixing instead of broyden_mixing, but this is not a consistent fix.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">I've noticed the problem with both v9.1 and v2022.1. I've also tried builds on 3 different clusters and found the same behavior. <u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">Does anyone have experience with this and can offer advice? I've attached an example for argon, which shows this problem very consistently. The attached output runs up to the point where the calculation gets
 stuck.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">Thanks,<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">Nick<u></u><u></u></span></p>
</div>
</div>
</div>
<div lang="en-CH" link="blue" vlink="purple" style="word-wrap:break-word">
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">--
<br>
You received this message because you are subscribed to the Google Groups "cp2k" group.<br>
To unsubscribe from this group and stop receiving emails from it, send an email to
<a rel="nofollow">cp2k+uns...@googlegroups.com</a>.<br>
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/cp2k/d26b287a-624d-4ab4-88ee-052529111991n%40googlegroups.com?utm_medium=email&utm_source=footer" rel="nofollow" target="_blank" data-saferedirecturl="https://www.google.com/url?hl=en&q=https://groups.google.com/d/msgid/cp2k/d26b287a-624d-4ab4-88ee-052529111991n%2540googlegroups.com?utm_medium%3Demail%26utm_source%3Dfooter&source=gmail&ust=1666548655771000&usg=AOvVaw1cUUEJGwBw8MVpRWp1UIIh">
https://groups.google.com/d/msgid/cp2k/d26b287a-624d-4ab4-88ee-052529111991n%40googlegroups.com</a>.<u></u><u></u></span></p>
</div>
</div>
</blockquote>
</div>
<p></p>
</div></blockquote></div><div dir="auto"><blockquote type="cite"><div dir="ltr">
</div>
</blockquote>
</div>

</blockquote></div>

<p></p>
