[CP2K-user] [CP2K:20051] Hybrid MPI+OpenMP is broken in v2024.1?

Eugene roginovicci at gmail.com
Thu Mar 21 16:39:59 UTC 2024


Dear Matthias,

I really appreciate your valuable comments. Your help has been very 
fruitful, thank you!

Actually, there is nothing special in the output file; the calculations 
simply freeze after the message:

HFX_MEM_INFO|  Est. max. program size before HFX [MiB]:                    1283

There is admittedly a bit of a mess with the different CP2K versions and 
supporting libraries installed, but I'm trying to keep the situation 
under control...

So before sending a proper output, I ran a test using only MPI 
parallelization, and it always failed when a hybrid functional was 
used. But I swear it worked before. So I took the dependency libraries 
from the version 8 set and fed them to version 2024, and found that 
libint causes the problem. According to the toolchain scripts, version 
2.6.0 is recommended with LMAX=5. I took the source from the 
libint-v2.6.0-cp2k-lmax-4.tgz tarball and compiled it with the options 
--enable-fortran --with-pic --enable-shared, and this combination seems 
to work in hybrid MPI+OMP mode, although OMP_STACKSIZE needs to be 
about four times as large as with version 8.
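
For completeness, the rebuild was just the standard autotools flow, 
roughly as sketched below (the install prefix is only an example path, 
not the one on my nodes):

  tar xzf libint-v2.6.0-cp2k-lmax-4.tgz
  cd libint-v2.6.0-cp2k-lmax-4
  ./configure --enable-fortran --with-pic --enable-shared \
              --prefix=$HOME/libs/libint-2.6.0-lmax4
  make -j 8 && make install

The CP2K build then has to be pointed at this prefix, e.g. via 
CMAKE_PREFIX_PATH when configuring with cmake.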

Why is CP2K so sensitive to libint in hybrid functional calculations?


Thank you in advance,
Eugene


On 3/20/24 16:49, Krack Matthias wrote:
>
> Dear Eugene
>
> The cp2k code has been increasingly parallelized with OpenMP 
> directives since version 8, including the loops in the PW part 
> operating on large arrays. In contrast to the GNU compiler, the 
> Intel compiler now requires in most cases an explicit increase of 
> OMP_STACKSIZE. Your problem with the PBE0 run could also have other 
> causes. Without an input/output example showing the problem, it is 
> difficult to provide further hints.
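>
> For example, something along these lines in the job script before the 
> launch (the thread count, rank count, stack size and file names below 
> are only placeholders to adapt to your case):
>
>     export OMP_NUM_THREADS=4
>     export OMP_STACKSIZE=64m
>     mpirun -np 8 cp2k.psmp -i input.inp -o output.out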
>
> Best
>
> Matthias
>
> *From: *cp2k at googlegroups.com <cp2k at googlegroups.com> on behalf of 
> Eugene <roginovicci at gmail.com>
> *Date: *Wednesday, 20 March 2024 at 13:16
> *To: *cp2k at googlegroups.com <cp2k at googlegroups.com>
> *Subject: *Re: [CP2K:20045] Hybrid MPI+OpenMP is broken in v2024.1?
>
> Dear Matthias,
>
> Thank you for the advice. It guided me towards localizing the problem.
>
> I have two test systems with the same atomic structure but different 
> functionals, namely plain PBE and hybrid PBE0. Both work in hybrid 
> mode in CP2K v8 without changing OMP_STACKSIZE. In the new version, 
> CP2K v2024.1, it becomes essential to set OMP_STACKSIZE=44m for the 
> PBE model; values below 44m raise a segfault. But for the PBE0 model 
> there is no reasonable value that lets the calculation finish; even 
> 1024m raises a segfault.
>
> According to htop statistics, I found that the new 2024.1 version 
> uses more SHM (the amount of shared memory used by a task) per 
> process -- approximately 1.3 times more, ~135 MB.
>
> At the same time, for the old v8 CP2K there is no influence of 
> OMP_STACKSIZE at all. Even very small values do not raise an error, 
> and the calculations finish normally.
>
> Is there something related to OMP_STACKSIZE in the configuration of 
> the CP2K source code, some kind of limit or -D parameter?
>
> I found a related unresolved issue in the mailing list:
>
> https://groups.google.com/g/cp2k/c/40Ods3HYW5g
>
> I made some tests with different cutoffs as well, and in the new 
> 2024.1 it always crashes with PBE0.
>
> Best regards,
> Eugene
>
> On 3/19/24 17:02, Krack Matthias wrote:
>
>     Hi Eugene
>
>     If you haven’t tried yet, you can set the environment variable
>     OMP_STACKSIZE=64m.
>
>     HTH
>
>     Matthias
>
>     *From: *cp2k at googlegroups.com <cp2k at googlegroups.com>
>     <mailto:cp2k at googlegroups.com> on behalf of Eugene
>     <roginovicci at gmail.com> <mailto:roginovicci at gmail.com>
>     *Date: *Tuesday, 19 March 2024 at 13:43
>     *To: *cp2k <cp2k at googlegroups.com> <mailto:cp2k at googlegroups.com>
>     *Subject: *[CP2K:20043] Hybrid MPI+OpenMP is broken in v2024.1?
>
>     Hi, I've finally compiled CP2K v2024.1 using the cmake build
>     system (which was a long story involving fixes to the cmake
>     modules). Anyway, I have two nodes for testing, based on Xeon
>     2011 v4 processors (-march=broadwell) and running AlmaLinux 9. I
>     have the following libraries compiled and installed:
>
>     1. libint 2.6.0 (configured with --enable-fortran --with-pic
>     --enable-shared, as suggested in the toolchain build script)
>
>     2. libxsmm-1.17 (with option INTRINSICS=1)
>
>     3. libxc-6.1.0
>
>     4. dbcsr-2.6.0
>
>     5. Elpa
>
>     Everything was built with the Intel oneAPI 2023.0.0 compiler,
>     using MKL and Intel MPI libraries.
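>
>     Roughly, the configure step looked like the sketch below; the
>     CP2K_USE_* option names are written from memory and may not be
>     exact (the authoritative list is in the CMakeLists.txt of the
>     cp2k source tree):
>
>     cmake -S . -B build \
>       -DCMAKE_C_COMPILER=mpiicc \
>       -DCMAKE_Fortran_COMPILER=mpiifort \
>       -DCMAKE_Fortran_FLAGS="-O2 -march=broadwell" \
>       -DCP2K_USE_LIBINT2=ON -DCP2K_USE_LIBXC=ON \
>       -DCP2K_USE_LIBXSMM=ON -DCP2K_USE_ELPA=ON
>     cmake --build build -j 16
>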
>     The compiled binary cp2k.psmp works quite well in MPI mode
>     (OMP_NUM_THREADS=1), but hybrid mode fails to run properly. I can
>     see that the MPI processes do fire up OMP threads as necessary at
>     the beginning; the calculation runs and gets through the
>     initialization until "SCF WAVEFUNCTION OPTIMIZATION" starts. There
>     is no debug information except a message about a segmentation
>     fault, which triggers termination of the MPI process on a child
>     node. I spent hours trying to localize the problem, but I'm pretty
>     sure it is not due to the node configuration, since the old v8.2
>     version does work in hybrid mode, even when compiled with an older
>     Intel compiler.
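>
>     For reference, the two launch modes compared here are,
>     schematically (rank/thread counts and file names are only
>     placeholders, not the actual job script):
>
>     # pure MPI: one rank per core, no OpenMP threading
>     export OMP_NUM_THREADS=1
>     mpirun -np 32 cp2k.psmp -i test.inp -o test.out
>
>     # hybrid MPI+OpenMP: fewer ranks, several threads per rank
>     export OMP_NUM_THREADS=4
>     mpirun -np 8 cp2k.psmp -i test.inp -o test.out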
>
>     Any hints are very welcome,
>     Eugene
>
