Great, I will reach out to you privately, Alfio. Thank you.<br /><br /><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Friday, November 8, 2024 at 2:37:01 PM UTC+1 Alfio Lazzaro wrote:<br/></div><blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">Hi Daniele,<div>The output</div><div><br></div><div><span style="background-color:rgb(255,255,0)">DBCSR| ACC: GPU backend is enabled T (D)</span></div><div><br></div><div>is from DBCSR. Yes, it was added in 2024.2 (only the print). The value was clearly "T" (==TRUE) in 2024.1 as well; only the print is new (with a way to disable it). So there are no functional changes.</div><div>Still, the new DBCSR in 2024.2 provides AMD kernels, with quite a large boost in performance on LUMI (and this is what I suggested to Emanuele). </div><div><br></div><div>But the error you see is in the FFT, which I'm unfamiliar with...</div><div><br></div><div>Please reach out to me privately for details on LUMI. The best option is to open a ticket on the LUMI system and ask whether CP2K is supported (this is really application support). 
There are multiple channels:</div><div>1) LUMI coffee-breaks (once per month), see <a href="https://www.lumi-supercomputer.eu/events/usercoffeebreaks/" target="_blank" rel="nofollow" data-saferedirecturl="https://www.google.com/url?hl=en&q=https://www.lumi-supercomputer.eu/events/usercoffeebreaks/&source=gmail&ust=1731159637481000&usg=AOvVaw1VPW6duqGiGeSHVcXMmCvu">https://www.lumi-supercomputer.eu/events/usercoffeebreaks/</a> for the past event</div><div>2) LUMI porting application project, see <a href="https://www.lumi-supercomputer.eu/open-call-for-porting-optimizing-gpu-2024/" target="_blank" rel="nofollow" data-saferedirecturl="https://www.google.com/url?hl=en&q=https://www.lumi-supercomputer.eu/open-call-for-porting-optimizing-gpu-2024/&source=gmail&ust=1731159637481000&usg=AOvVaw3C9knZhw0XG8QiTt_sTvMn">https://www.lumi-supercomputer.eu/open-call-for-porting-optimizing-gpu-2024/</a> for the past call</div><div>3) LUMI hackathons (at least once per year)</div><div><br></div><div>Alfio</div><div><br><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Thursday, November 7, 2024 at 3:19:57 PM UTC+1 Daniele Passerone wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><font size="4" face="Arial">Dear forum, </font><div><font size="4" face="Arial"><br></font></div><div><font size="4" face="Arial">The LUMI supercomputer was recently upgraded to the LUMI/24.03 software environment. 
</font></div><div><font size="4" face="Arial">With version 23.09 we could run on the GPU partition (8 GPUs per node), following this prescription:</font></div><div><i><font size="4" face="Arial"><br></font></i></div><div><ul style="box-sizing:inherit;margin-bottom:1em;margin-top:1em;padding:0px;margin-left:0.625em;display:flow-root;letter-spacing:-0.25px"><li style="box-sizing:inherit;margin-bottom:0px;margin-left:1.25em"><i><font size="4" face="Arial">When running on LUMI-G, run using 8 MPI ranks per compute node, where each rank has access to 1 GPU in the same NUMA zone. This also means that you have to set <span style="box-sizing:inherit;font-feature-settings:"kern";direction:ltr;font-variant-ligatures:none;border-radius:0.1rem;padding:0px 0.294118em;word-break:break-word;outline:none">OMP_NUM_THREADS=6-7</span> to utilize all CPU cores. Please note that using all 64 cores will not work as the first core in each CCD is reserved for the operating system, so that only 56 cores are available.</font></i></li></ul><div><font size="4" face="Arial"><span style="letter-spacing:-0.25px">The version we used in the old 23.09 environment was </span></font></div></div><div>
<p><font size="4" face="Arial">CP2K/2024.1-cpeGNU-23.09-GPU</font></p><p><font size="4" face="Arial"><br></font></p><p><font size="4" face="Arial">(easybuild)</font></p><p><font size="4" face="Arial"><br></font></p><p><font size="4" face="Arial">Which is described on the LUMI website as </font></p><p><font size="4" face="Arial"><br></font></p><p><font size="4" face="Arial"><i>"<span style="letter-spacing:-0.25px">CP2K 2024.1 release compiled with AMD GPU support enabled for CP2K itself and several of the libraries (SpFFT, SpLA). Cray Programming Environment 23.09 used together with the unsupported </span><span style="box-sizing:inherit;font-feature-settings:"kern";direction:ltr;font-variant-ligatures:none;border-radius:0.1rem;padding:0px 0.294118em;word-break:break-word;outline:none;letter-spacing:-0.25px">rocm/5.6.1</span><span style="letter-spacing:-0.25px"> module installed by the LUMI Support Team."</span></i></font></p><p><font size="4" face="Arial"><i><span style="letter-spacing:-0.25px"><br></span></i></font></p><p><font size="4" face="Arial"><span style="letter-spacing:-0.25px">With the new environment, we are advised to compile accordingly, using easybuild. 
</span></font></p><p><font size="4" face="Arial"><span style="letter-spacing:-0.25px"><br></span></font></p><p><a href="https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/c/CP2K/" rel="nofollow" target="_blank" data-saferedirecturl="https://www.google.com/url?hl=en&q=https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/c/CP2K/&source=gmail&ust=1731159637481000&usg=AOvVaw3_Z6atTmlBNgI-_E3F6ErV">https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/c/CP2K/</a><font size="4" face="Arial"><span style="letter-spacing:-0.25px"></span></font></p><p><br></p><p>The code (2024.2) compiled, but the DFT SCF steps then fail with an error like this:</p><p><span style="background-color:yellow"><br></span></p>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow"><br>*******************************************************************************</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">* ___ *</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">* / \ *</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">* [ABORT] *</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">* \___/ G vector not found *</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">* | *</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">* O/| *</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">* /| | *</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">* / \ pw/pw_grids.F:1848 *</span></div>
<div style="color:rgb(0,0,0);font-family:Aptos"><span style="background-color:yellow">*******************************************************************************</span></div>
<p>or during the initial part of the run:</p><p><br></p><p><span><span> </span></span></p><p><span><span> </span></span></p><p><span style="background-color:yellow"><span> </span>*** WARNING in atoms_input.F:123 :: Overwriting coordinates. Active<span> </span>*** <span> </span></span></p><p><span style="background-color:yellow"><span> </span>*** coordinates read from &COORD section. Active coordinates READ from *** <span> </span></span></p><p><span style="background-color:yellow"><span> </span>*** &COORD section <span> </span>***<span> </span></span></p><p>
</p><p><span style="background-color:yellow"> </span></p><p>after which the job quits without any error message. </p><p><br></p><p>Questions:</p><p><br></p><p><font size="5">1) Can somebody help me understand why those jobs fail, and how to properly compile CP2K on LUMI?</font></p><p><br></p><p>LUMI support (Emanuele Vitali) found that the newest CP2K version (CP2K 2024.2 with the 24.03 environment) has this line in the output:</p><p><br></p><p>
</p><p><span style="background-color:yellow"> DBCSR| ACC: GPU backend is enabled T (D)</span></p><p>that is not present in the output of CP2K 2024.1 compiled with 23.09. </p><p>So his hypothesis was that the CP2K 2024.1 build that was working well was <b>NOT using GPU support, and that the problems with 2024.2 on 24.03 come from trying to use GPU support. </b></p><p>In my opinion (and also Marcella Iannuzzi's) this makes no sense, since we are sure that the scaling and performance (1 rank per GPU) were good with the old version.</p><p><font size="5">2) Is it true that the line "GPU backend is enabled" was added in 2024.2?</font></p><p><br></p><p>Thank you for any help, </p><p>Daniele</p><p><br></p><p> </p><p><br></p></div></blockquote></div></blockquote></div>
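<p>For reference, the core arithmetic in the LUMI-G prescription quoted above (8 ranks per node, first core of each CCD reserved, 56 of 64 cores usable) can be sketched in Python. This is only an illustration: the 8 CCDs of 8 contiguously numbered cores and the rank-to-CCD mapping are assumptions made for the example and do not describe the actual GPU/NUMA affinity on LUMI.</p>

```python
# Sketch of the core budget described in the quoted LUMI-G guidance.
# Assumptions (not from the thread): 8 CCDs of 8 contiguously numbered cores,
# and rank i placed on CCD i purely for illustration.
CORES_PER_NODE = 64   # stated in the quoted guidance
RANKS_PER_NODE = 8    # one MPI rank per GPU, as stated in the quoted guidance
CORES_PER_CCD = CORES_PER_NODE // RANKS_PER_NODE  # assumed CCD size: 8


def usable_cores_per_rank():
    """Return, per rank, the cores left after skipping the first core of its CCD."""
    layout = []
    for rank in range(RANKS_PER_NODE):
        first = rank * CORES_PER_CCD
        # The first core of each CCD is reserved for the operating system.
        layout.append(list(range(first + 1, first + CORES_PER_CCD)))
    return layout


layout = usable_cores_per_rank()
omp_num_threads = len(layout[0])                     # threads available per rank
total_usable = sum(len(cores) for cores in layout)   # usable cores per node
print(f"OMP_NUM_THREADS={omp_num_threads}, usable cores per node={total_usable}")
```

<p>Under these assumptions each rank gets at most 7 threads, which is consistent with the quoted OMP_NUM_THREADS=6-7 and the 56 available cores.</p>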