[CP2K-user] [CP2K:20886] CP2K on LUMI (repost)

Daniele Passerone dpasserone at gmail.com
Fri Nov 8 15:41:38 UTC 2024


Dear forum, 

The LUMI supercomputer was recently upgraded to the LUMI/24.03 software 
environment. 
With version 23.09 we could run on the GPU partition (8 GPUs per node), 
following this prescription:


   - *When running on LUMI-G, run using 8 MPI ranks per compute node, where 
   each rank has access to 1 GPU in the same NUMA zone. This also means that 
   you have to OMP_NUM_THREADS=6-7 to utilize all CPU cores. Please note that 
   using all 64 cores will not work as the first core in each CCD is reserved 
   for the operating system, so that only 56 cores are available.*
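
The core counts in the prescription above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the 64 cores, the reserved first core per CCD, and the 8 ranks per node come from the quoted text, while the assumption that each CCD holds 8 contiguously numbered cores (and the function names) are illustrative and not taken from the LUMI documentation.

```python
# Illustrative arithmetic only. Numbers taken from the quoted LUMI-G guidance:
# 64 cores per node, first core of each CCD reserved, 8 MPI ranks per node.
TOTAL_CORES = 64
RANKS_PER_NODE = 8   # one MPI rank per GPU
CORES_PER_CCD = 8    # ASSUMPTION: 8 CCDs, each with 8 contiguously numbered cores


def usable_cores(total=TOTAL_CORES, per_ccd=CORES_PER_CCD):
    """Cores left after reserving the first core of every CCD."""
    n_ccd = total // per_ccd
    return total - n_ccd


def threads_per_rank(total=TOTAL_CORES, ranks=RANKS_PER_NODE, per_ccd=CORES_PER_CCD):
    """Largest OMP_NUM_THREADS per rank that avoids oversubscribing cores."""
    return usable_cores(total, per_ccd) // ranks


def ccd_mask(ccd, per_ccd=CORES_PER_CCD):
    """Hex CPU mask covering one CCD except its first (reserved) core."""
    first = ccd * per_ccd
    bits = 0
    for core in range(first + 1, first + per_ccd):
        bits |= 1 << core
    return hex(bits)


if __name__ == "__main__":
    print("usable cores:", usable_cores())
    print("max OMP_NUM_THREADS per rank:", threads_per_rank())
    print("per-CCD masks:", [ccd_mask(i) for i in range(TOTAL_CORES // CORES_PER_CCD)])
```

Under these assumptions the sketch reproduces the 56 usable cores and an upper limit of 7 threads per rank. Which GPU sits next to which CCD is not covered here and has to be taken from the LUMI documentation.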

The version we used in the old 23.09 environment (installed with EasyBuild) 
was 

CP2K/2024.1-cpeGNU-23.09-GPU

which is described on the LUMI website as 


*"CP2K 2024.1 release compiled with AMD GPU support enabled for CP2K itself 
and several of the libraries (SpFFT, SpLA). Cray Programming Environment 
23.09 used together with the unsupported rocm/5.6.1 module installed by the 
LUMI Support Team."*


With the new environment, we are advised to recompile accordingly, using 
EasyBuild: 


https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/c/CP2K/


CP2K 2024.2 compiled successfully, but the DFT SCF steps then fail with an 
error like this:



*******************************************************************************
*   ___                                                                       *
*  /   \                                                                      *
* [ABORT]                                                                     *
*  \___/                         G vector not found                           *
*    |                                                                        *
*  O/|                                                                        *
* /| |                                                                        *
* / \                                                      pw/pw_grids.F:1848 *
*******************************************************************************

or the job stops during the initial part of the run, right after printing 

 *** WARNING in atoms_input.F:123 :: Overwriting coordinates. Active       ***
 *** coordinates read from &COORD section. Active coordinates READ from    ***
 *** &COORD section                                                        ***

and quits without any error message. 


Questions:


1) Can somebody help me understand why these jobs fail, and how to properly 
compile CP2K on LUMI?


LUMI support found that the output of the newest build (CP2K 2024.2 with the 
24.03 environment) contains the line:


 DBCSR| ACC: GPU backend is enabled                                      T (D)

that is not present in the output of CP2K 2024.1 compiled with 23.09. 

Their hypothesis was therefore that the CP2K 2024.1 build that was working 
well was *NOT using GPU support, and that the problems with 2024.2 on 24.03 
come from trying to use GPU support.*

In my opinion this makes no sense, since we are sure that the scaling and 
performance (1 rank per GPU) were good with the old version.

2) Is it true that the line "GPU backend is enabled" was added in 2024.2?
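
For anyone who wants to compare outputs from the two builds on the same input, a small script along these lines can report whether the DBCSR line appears and what value it carries. This is only a sketch: the regular expression is built from the single line quoted above, and the file names are hypothetical placeholders. Note that the absence of the line in an older release does not by itself show that the GPU was unused; the release may simply not print it, which is exactly what question 2 asks.

```python
import re
import sys

# Pattern built from the output line quoted in this message; whitespace between
# the label and the value is matched loosely because the mail client wrapped it.
GPU_LINE = re.compile(r"DBCSR\|\s*ACC:\s*GPU backend is enabled\s+(\S+)")


def gpu_backend_flag(text):
    """Return the value printed after the label (e.g. 'T'), or None if absent."""
    match = GPU_LINE.search(text)
    return match.group(1) if match else None


def report(path):
    """Print whether the given CP2K output file contains the DBCSR GPU line."""
    with open(path, "r", errors="replace") as handle:
        flag = gpu_backend_flag(handle.read())
    if flag is None:
        print(f"{path}: line not found")
    else:
        print(f"{path}: GPU backend is enabled = {flag}")


if __name__ == "__main__":
    # Hypothetical usage: python check_gpu_line.py old_build.out new_build.out
    for output_file in sys.argv[1:]:
        report(output_file)
```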


Thank you for any help, 

Daniele
