[CP2K-user] psmp runtime error with hybrid run combinations (mpi + openmp)
Alfio Lazzaro
alfio.... at gmail.com
Tue Apr 14 06:55:54 UTC 2020
Hi,
So, from what you are saying, the problem is in the OpenMP parallelization
(the run works without it).
However, the Cholesky decomposition is done by ScaLAPACK, so you should check
how you are linking to it.
In particular, CP2K suggests using a sequential BLAS.
Could you share your arch file and how you are compiling CP2K (which
libraries)?
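
A quick way to check the linking can be sketched in Python: the snippet below scans `ldd` output for library names that usually indicate a multithreaded BLAS build. The name fragments in `THREADED_HINTS` are a heuristic assumption, not an exhaustive list, and the function names are made up for illustration. A plain `libopenblas` is ambiguous and is not flagged, and statically linked libraries do not show up in `ldd` at all, so the arch file remains the authoritative source.

```python
import subprocess

# Heuristic name prefixes of multithreaded BLAS builds (assumption, not exhaustive).
THREADED_HINTS = (
    "libmkl_intel_thread",
    "libmkl_gnu_thread",
    "libmkl_tbb_thread",
    "libblis-mt",
    "libopenblasp",
    "libopenblaso",
)


def threaded_blas_candidates(ldd_text):
    """Return library names in `ldd` output that look like a threaded BLAS."""
    hits = []
    for line in ldd_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field is the library name, sometimes given as a full path.
        name = fields[0].rsplit("/", 1)[-1]
        if name.startswith(THREADED_HINTS):
            hits.append(name)
    return hits


def ldd_output(binary):
    """Run `ldd` on a binary and return its text output (Linux only)."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, `threaded_blas_candidates(ldd_output("../../../exe/Linux-x86-64-aocc/cp2k.psmp"))` would list suspicious libraries for the executable from the report. An empty list does not prove that the BLAS is sequential.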
Alfio
On Monday, 13 April 2020 at 07:03:42 UTC+2, Shivarama Rao wrote:
>
> Hi,
>
> I am trying to debug an issue with CP2K 6.1. The popt run is fine, but
> the psmp runs fail with errors.
>
> command line used:
>
> mpirun -np 2 -x OMP_NUM_THREADS=4 ../../../exe/Linux-x86-64-aocc/cp2k.psmp ./H2O-32.inp
>
> The psmp executable runs fine with 1 MPI rank and 4 OpenMP threads, but it
> fails with 2 MPI ranks and 4 OpenMP threads.
>
> The following error is generated with 2 MPI ranks and 4 OpenMP threads. The
> behavior is the same for the other input sets, such as H2O-64.inp,
> H2O-128.inp, H2O-256.inp, and H2O-1024.inp.
>
>
>
> *******************************************************************************
> * [ABORT] Cholesky decomposition failed. Matrix ill conditioned ?
> * /home/amd/cp2k_aocc/cp2k-6.1/src/cp_dbcsr_cholesky.F:121
> *******************************************************************************
>
>
> ===== Routine Calling Stack =====
>
> 12 cp_dbcsr_cholesky_decompose
> 11 qs_ot_get_derivative
> 10 ot_mini
> 9 ot_scf_mini
> 8 qs_scf_loop_do_ot
> 7 qs_scf_new_mos
> 6 scf_env_do_scf_inner_loop
> 5 scf_env_do_scf
> 4 qs_energies
> 3 qs_forces
> 2 qs_mol_dyn_low
> 1 CP2K
>
> The following are the combinations for which the executable works or fails:
>
>
>
> MPI processes | OpenMP threads | Result
> --------------+----------------+-------
>             1 |              1 | works
>             1 |              2 | works
>             1 |              4 | works
>             1 |              8 | works
>             1 |             16 | works
>             1 |             32 | works
>             1 |             64 | works
>             2 |              1 | works
>             2 |              2 | works
>             2 |              4 | fails
>             2 |              8 | fails
>             2 |             16 | fails
>             2 |             32 | fails
>             2 |             64 | fails
>             4 |              1 | works
>             4 |              2 | works
>             4 |              4 | fails
>             8 |              1 | works
>             8 |              2 | works
>             8 |              3 | works
>             8 |              4 | fails
>            16 |              1 | works
>            16 |              2 | works
>            16 |              3 | fails
>
>
>
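
To make the pattern in the reported combinations easier to see, here is a small Python sketch that transcribes the results above and extracts, for each MPI process count, the smallest OpenMP thread count that failed. The tuples are copied from the report; the names `RESULTS` and `smallest_failing_threads` are only illustrative.

```python
# (MPI processes, OpenMP threads, run succeeded) as listed in the report.
RESULTS = [
    (1, 1, True), (1, 2, True), (1, 4, True), (1, 8, True),
    (1, 16, True), (1, 32, True), (1, 64, True),
    (2, 1, True), (2, 2, True), (2, 4, False), (2, 8, False),
    (2, 16, False), (2, 32, False), (2, 64, False),
    (4, 1, True), (4, 2, True), (4, 4, False),
    (8, 1, True), (8, 2, True), (8, 3, True), (8, 4, False),
    (16, 1, True), (16, 2, True), (16, 3, False),
]


def smallest_failing_threads(results):
    """Map each MPI process count to the smallest thread count that failed."""
    first_fail = {}
    for ranks, threads, ok in results:
        if not ok:
            first_fail[ranks] = min(threads, first_fail.get(ranks, threads))
    return first_fail


print(smallest_failing_threads(RESULTS))
```

In this data, runs with a single MPI process never fail, and no run with one or two threads per process fails. The failures also do not track the total core count: 2 processes with 4 threads fail, while 8 processes with 3 threads work.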
> What could be the reason for this behavior, and what would be the right way
> to debug it? I tried both Open MPI and MPICH, and both give similar
> results. The compiler is an in-house compiler.
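
One possible way to narrow this down is to re-run the same matrix automatically after each change to the build (for example, after switching the BLAS). The Python sketch below is only an illustration under stated assumptions: it builds the Open MPI style command quoted in the report (`-x` is Open MPI syntax; MPICH's `mpiexec` passes environment variables differently, e.g. with `-genv`), and it treats a non-zero exit code or the abort message in the output as a failure. The function names and the `runner` parameter are hypothetical.

```python
import subprocess

ABORT_TEXT = "Cholesky decomposition failed"


def build_command(ranks, threads, exe, inp):
    """Open MPI style launch line, mirroring the command in the report."""
    return ["mpirun", "-np", str(ranks),
            "-x", f"OMP_NUM_THREADS={threads}", exe, inp]


def run_case(ranks, threads, exe, inp, runner=subprocess.run):
    """Run one combination; True if it exits cleanly without the abort text."""
    proc = runner(build_command(ranks, threads, exe, inp),
                  capture_output=True, text=True)
    output = (proc.stdout or "") + (proc.stderr or "")
    return proc.returncode == 0 and ABORT_TEXT not in output


def sweep(rank_counts, thread_counts, exe, inp, runner=subprocess.run):
    """Run every (ranks, threads) pair and collect pass/fail results."""
    return {
        (r, t): run_case(r, t, exe, inp, runner)
        for r in rank_counts
        for t in thread_counts
    }
```

A call such as `sweep([1, 2, 4], [1, 2, 4], "../../../exe/Linux-x86-64-aocc/cp2k.psmp", "./H2O-32.inp")` would reproduce part of the table. The `runner` argument exists only so the launcher can be swapped out, for example for a dry run.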
>
> Thanks for your help,
> Shivarama Rao
>