[CP2K-user] [CP2K:20885] Re: CP2K on LUMI
Daniele Passerone
dpasserone at gmail.com
Fri Nov 8 13:41:57 UTC 2024
Great, I will reach out to you privately, Alfio. Thank you.
On Friday, November 8, 2024 at 2:37:01 PM UTC+1 Alfio Lazzaro wrote:
> Ciao Daniele,
> The output
>
> DBCSR| ACC: GPU backend is enabled T (D)
>
> is from DBCSR. Yes, the print was added in 2024.2 (only the print). The
> value was already "T" (== TRUE) in 2024.1; only the print is new (with a
> way to disable it). So, nothing changed on the functional side.
> Still, the new DBCSR in 2024.2 provides all the AMD kernels, with quite a
> large boost in performance on LUMI (and this is what I suggested to
> Emanuele).
>
> But then the error you see is on FFT, which I'm unfamiliar with...
>
> Please reach out to me privately for details on LUMI. The best option is to
> open a ticket on the LUMI system and ask whether there is support for CP2K
> (this is really application-level support). There are multiple channels:
> 1) LUMI coffee breaks (once per month), see
> https://www.lumi-supercomputer.eu/events/usercoffeebreaks/ for the past
> event
> 2) LUMI porting application project, see
> https://www.lumi-supercomputer.eu/open-call-for-porting-optimizing-gpu-2024/
> for the past call
> 3) LUMI hackathons (at least once per year)
>
> Alfio
>
>
> On Thursday, November 7, 2024 at 3:19:57 PM UTC+1 Daniele Passerone wrote:
>
>> Dear forum,
>>
>> Recently the LUMI supercomputer was upgraded to the LUMI/24.03
>> software environment.
>> With version 23.09 we could run on the GPU partition (8 GPUs per
>> node), following the prescription:
>>
>>
>> - *When running on LUMI-G, run using 8 MPI ranks per compute node,
>> where each rank has access to 1 GPU in the same NUMA zone. This also means
>> that you have to set OMP_NUM_THREADS=6-7 to utilize all CPU cores. Please note
>> that using all 64 cores will not work as the first core in each CCD is
>> reserved for the operating system, so that only 56 cores are available.*
>>
>> The version we used in the old 23.09 environment was
>>
>> CP2K/2024.1-cpeGNU-23.09-GPU (EasyBuild)
>>
>> which is described on the LUMI website as
>>
>>
>> *"CP2K 2024.1 release compiled with AMD GPU support enabled for CP2K
>> itself and several of the libraries (SpFFT, SpLA). Cray Programming
>> Environment 23.09 used together with the unsupported rocm/5.6.1 module
>> installed by the LUMI Support Team."*
>>
>>
>> With the new environment, we are advised to recompile accordingly, using
>> EasyBuild.
>>
>>
>> https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/c/CP2K/
>>
>>
>> The code (2024.2) compiled, but the DFT SCF steps then fail with an
>> error like this:
>>
>>
>>
>>
>> *******************************************************************************
>> *   ___                                                                       *
>> *  /   \                                                                      *
>> * [ABORT]                                                                     *
>> *  \___/                       G vector not found                             *
>> *    |                                                                        *
>> *  O/|                                                                        *
>> * /| |                                                                        *
>> * / \                                                      pw/pw_grids.F:1848 *
>> *******************************************************************************
>>
>> or, during the initial part of the run,
>>
>> *** WARNING in atoms_input.F:123 :: Overwriting coordinates. Active ***
>> *** coordinates read from &COORD section. Active coordinates READ from ***
>> *** &COORD section ***
>>
>> after which the job quits without any error message.
>>
>>
>> Questions:
>>
>>
>> 1) Is there somebody who can help me understand why these jobs fail,
>> and how to properly compile CP2K on LUMI?
>>
>>
>> LUMI support (Emanuele Vitali) noticed that the newest CP2K version
>> (CP2K 2024.2 with the 24.03 environment) has a line in the output:
>>
>>
>> DBCSR| ACC: GPU backend is enabled T (D)
>>
>> that is not present in CP2K 2024.1 compiled with 23.09.
>>
>> So his hypothesis was that the CP2K 2024.1 that was working well was *NOT
>> using GPU support, and that the problems in 2024.2 with 24.03 come from
>> trying to use GPU support.*
>>
>> In my opinion (and also Marcella Iannuzzi's) this makes no sense, since
>> we are sure that the scaling and performance (1 rank, 1 GPU) were good
>> with the old version.
>>
>> 2) Is it true that the line "GPU backend is enabled" was added in 2024.2?
>>
>>
>> Thank you for any help,
>>
>> Daniele
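
The rank and thread layout in the quoted LUMI-G prescription comes down to simple arithmetic, sketched here in Python. The 64 cores, 56 usable cores and 8 MPI ranks per node are taken from the quoted text. The figure of 8 CCDs per node is an assumption inferred from those numbers (64 - 56 = 8 reserved cores, one per CCD), and the function name is hypothetical.

```python
def threads_per_rank(total_cores=64, ccds=8, ranks_per_node=8):
    """Return (usable_cores, max_threads_per_rank) for one compute node.

    Assumption: one core per CCD is reserved for the operating system,
    as stated in the quoted LUMI-G prescription.
    """
    usable_cores = total_cores - ccds           # 64 - 8 reserved cores
    max_threads = usable_cores // ranks_per_node  # cores shared evenly by ranks
    return usable_cores, max_threads


usable, threads = threads_per_rank()
print(f"usable cores per node: {usable}")
print(f"OMP_NUM_THREADS should not exceed: {threads}")
```

With the defaults this gives 56 usable cores and at most 7 threads per rank, which matches the upper end of the OMP_NUM_THREADS=6-7 range in the prescription.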