[CP2K-user] [CP2K:20873] CP2K on LUMI
Daniele Passerone
dpasserone at gmail.com
Thu Nov 7 14:11:46 UTC 2024
Dear forum,
Recently the supercomputer LUMI has been upgraded with the LUMI/24.03
software environment.
With version 23.09 we could run on the GPU partition (8 GPUs per node),
following the prescription:
- *When running on LUMI-G, run using 8 MPI ranks per compute node, where
each rank has access to 1 GPU in the same NUMA zone. This also means that
you have to set OMP_NUM_THREADS=6-7 to utilize all CPU cores. Please note that
using all 64 cores will not work as the first core in each CCD is reserved
for the operating system, so that only 56 cores are available.*
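The arithmetic behind this prescription can be checked with a minimal sketch. The only figures taken from the quote are 64 cores, 56 usable cores, one reserved core per CCD and 8 ranks per node; the number of CCDs is inferred from those figures, and all variable names are just illustrative.

```python
# Core-count arithmetic implied by the quoted LUMI-G prescription.
TOTAL_CORES = 64        # cores per node, from the quote
USABLE_CORES = 56       # cores left for the job, from the quote
RESERVED_PER_CCD = 1    # first core of each CCD is kept for the OS
RANKS_PER_NODE = 8      # one MPI rank per GPU

# Inferred, not stated in the quote: number of CCDs per node.
n_ccds = (TOTAL_CORES - USABLE_CORES) // RESERVED_PER_CCD
cores_per_ccd = TOTAL_CORES // n_ccds

# Upper bound on OMP_NUM_THREADS if every rank gets an equal share.
max_threads_per_rank = USABLE_CORES // RANKS_PER_NODE

print(f"CCDs per node:          {n_ccds}")
print(f"cores per CCD:          {cores_per_ccd}")
print(f"max threads per rank:   {max_threads_per_rank}")
```

With these numbers each rank can use at most 7 threads, which matches the upper end of the OMP_NUM_THREADS=6-7 range in the quote.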
The version we used in the old 23.09 environment was
CP2K/2024.1-cpeGNU-23.09-GPU
(EasyBuild),
which is described on the LUMI website as
*"CP2K 2024.1 release compiled with AMD GPU support enabled for CP2K itself
and several of the libraries (SpFFT, SpLA). Cray Programming Environment
23.09 used together with the unsupported rocm/5.6.1 module installed by the
LUMI Support Team."*
With the new environment, we are advised to recompile accordingly, using
EasyBuild:
https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/c/CP2K/
The code (2024.2) compiled, but then the DFT SCF steps fail with an error
like this:
*******************************************************************************
*   ___                                                                       *
*  /   \                                                                      *
* [ABORT]                                                                     *
*  \___/                          G vector not found                          *
*    |                                                                        *
*  O/|                                                                        *
* /| |                                                                        *
* / \                                                      pw/pw_grids.F:1848 *
*******************************************************************************
or, during the initial part of the run, the job quits without any error
message right after printing:
*** WARNING in atoms_input.F:123 :: Overwriting coordinates. Active ***
*** coordinates read from &COORD section. Active coordinates READ from ***
*** &COORD section ***
Questions:
1) Can somebody help me understand why those jobs fail, and
how to properly compile CP2K on LUMI?
LUMI support (Emanuele Vitali) found that the newest CP2K build (CP2K 2024.2
in the 24.03 environment) prints a line in the output,
DBCSR| ACC: GPU backend is enabled                                        T (D)
that is not present with CP2K 2024.1 compiled in 23.09.
So his hypothesis was that the CP2K 2024.1 build that was working well was *NOT
using GPU support, and that the problems with 2024.2 in 24.03 come from trying
to use GPU support.*
In my opinion (and also Marcella Iannuzzi's) this makes no sense, since we
are sure that scaling and performance (1 rank per GPU) were good
with the old version.
2) Is it true that the line "GPU backend is enabled" was added in 2024.2?
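To compare outputs from the two builds, the presence and value of that DBCSR line can be checked with a small script. This is only a sketch: the function name is hypothetical, and the pattern assumes the line looks as quoted above, with the flag (T or F) following the text, possibly wrapped onto the next line. A missing line is reported as None rather than False, since whether its absence means anything is exactly the open question.

```python
import re

# Pattern for the line quoted above; \s+ also matches a line break,
# so a wrapped "T" on the following line is still picked up.
_DBCSR_GPU = re.compile(
    r"DBCSR\|\s*ACC:\s*GPU backend is enabled\s+([TF])\b"
)

def dbcsr_gpu_backend(output_text):
    """Return True/False for the reported flag, or None if the line is absent."""
    match = _DBCSR_GPU.search(output_text)
    if match is None:
        return None
    return match.group(1) == "T"

if __name__ == "__main__":
    # Synthetic snippets for illustration, not real CP2K output.
    with_line = " DBCSR| ACC: GPU backend is enabled            T\n"
    wrapped = " DBCSR| ACC: GPU backend is enabled\nT (D)\n"
    without_line = " DBCSR| CPU Multiplication driver             XSMM\n"
    print(dbcsr_gpu_backend(with_line))     # flag present
    print(dbcsr_gpu_backend(wrapped))       # flag on the next line
    print(dbcsr_gpu_backend(without_line))  # line absent
```

Running it on the full output text of a 2024.1 and a 2024.2 job would show directly whether the line is missing or set to F in the older build.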
Thank you for any help,
Daniele