[CP2K-user] [CP2K:20046] Hybrid MPI+OpenMP is broken in v2024.1?

Krack Matthias matthias.krack at psi.ch
Wed Mar 20 13:49:50 UTC 2024


Dear Eugene

The CP2K code has been increasingly parallelized with OpenMP directives since version 8, including the loops in the PW part that operate on large arrays. In contrast to the GNU compiler, the Intel compiler now requires in most cases an explicit increase of OMP_STACKSIZE. Your problem with the PBE0 run could also have other causes. Without an input/output example showing the problem, it is difficult to provide further hints.
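
For instance, you can export it in your job script before launching cp2k.psmp (a sketch only; the thread count, stack size, and launcher line below are placeholders to be adapted to your machine):

  export OMP_NUM_THREADS=4        # OpenMP threads per MPI rank
  export OMP_STACKSIZE=64m        # stack size of each OpenMP worker thread
  ulimit -s unlimited             # the master thread uses the shell stack limit, not OMP_STACKSIZE
  mpirun -np 8 cp2k.psmp -i input.inp -o output.out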

Best

Matthias

From: cp2k at googlegroups.com <cp2k at googlegroups.com> on behalf of Eugene <roginovicci at gmail.com>
Date: Wednesday, 20 March 2024 at 13:16
To: cp2k at googlegroups.com <cp2k at googlegroups.com>
Subject: Re: [CP2K:20045] Hybrid MPI+OpenMP is broken in v2024.1?

Dear Matthias,

Thank you for the advice. It guided me towards localizing the problem.

I have two test systems with the same atomic structure but different functionals, namely plain PBE and hybrid PBE0. Both work in hybrid mode in CP2K v8 without changing OMP_STACKSIZE. In the new version, CP2K v2024.1, it becomes essential to set OMP_STACKSIZE=44m for the PBE model; any value below 44m raises a segfault. For the PBE0 model, however, no value lets the calculation finish; even 1024m raises a segfault.

According to htop statistics, I found that the new 2024.1 version takes more SHM (the amount of shared memory used by a task) per process -- approximately 1.3 times more, about 135 MB.

At the same time, the old CP2K v8 shows no influence of OMP_STACKSIZE at all: even very small values do not raise an error, and the calculations finish normally.

Is there something related to OMP_STACKSIZE in the configuration of the CP2K source code, some kind of limit or -D parameter?

I found a related unresolved issue in the mailing list:

https://groups.google.com/g/cp2k/c/40Ods3HYW5g

I also ran some tests with different cutoffs, and with the new 2024.1 version PBE0 always crashes.

Best regards,
Eugene
On 3/19/24 17:02, Krack Matthias wrote:
Hi Eugene

If you haven’t tried yet, you can set the environment variable OMP_STACKSIZE=64m.

HTH

Matthias

From: cp2k at googlegroups.com <cp2k at googlegroups.com> on behalf of Eugene <roginovicci at gmail.com>
Date: Tuesday, 19 March 2024 at 13:43
To: cp2k <cp2k at googlegroups.com><mailto:cp2k at googlegroups.com>
Subject: [CP2K:20043] Hybrid MPI+OpenMP is broken in v2024.1?
Hi, I've finally compiled CP2K v2024.1 using the CMake build system (which was a long story that involved fixing CMake modules). Anyway, I have two nodes for testing, based on Xeon 2011 v4 processors (-march=broadwell) and running AlmaLinux 9. I have the following libraries compiled and installed:
1. libint 2.6.0 (configured with --enable-fortran --with-pic --enable-shared, as suggested in the toolchain build script)
2. libxsmm-1.17 (with option INTRINSICS=1)
3. libxc-6.1.0
4. dbcsr-2.6.0
5. ELPA

Everything was built with the Intel oneAPI 2023.0.0 compiler, using MKL and the Intel MPI library.
The compiled binary cp2k.psmp works quite well in MPI-only mode (OMP_NUM_THREADS=1), but hybrid mode fails to run properly. I can see the MPI processes fire up OpenMP threads as expected at the beginning; the calculation runs through initialization until "SCF WAVEFUNCTION OPTIMIZATION" starts. There is no debug information except a segmentation fault message, which terminates the MPI process on a child node. I spent hours trying to localize the problem, but I'm pretty sure it is not due to the node configuration, since the old v8.2 version works in hybrid mode even when compiled with an older Intel compiler.
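
For reference, this is roughly how I launch the two modes (a sketch only; the rank/thread counts and the Intel MPI pinning variable are illustrative rather than copied from my actual job script):

  # MPI-only mode (runs fine)
  export OMP_NUM_THREADS=1
  mpirun -np 32 ./cp2k.psmp -i test.inp -o test.out

  # Hybrid MPI+OpenMP mode (segfaults once "SCF WAVEFUNCTION OPTIMIZATION" starts)
  export OMP_NUM_THREADS=4
  export I_MPI_PIN_DOMAIN=omp      # pin each rank's OpenMP threads to its own set of cores
  mpirun -np 8 ./cp2k.psmp -i test.inp -o test.out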

Any hints are very welcome,
Eugene
