[CP2K-user] psmp runtime error with hybrid run combinations (mpi + openmp)

Shivarama Rao shivar... at gmail.com
Mon Apr 13 05:03:42 UTC 2020


Hi,

I am trying to debug an issue with CP2K 6.1. The popt run is fine, but the 
psmp runs produce errors.

Command line used:

    mpirun -np 2 -x OMP_NUM_THREADS=4 ../../../exe/Linux-x86-64-aocc/cp2k.psmp ./H2O-32.inp

The psmp executable runs fine with 1 MPI rank and 4 OpenMP threads, but it 
fails with 2 MPI ranks and 4 OpenMP threads.

The following error is generated with 2 MPI ranks and 4 OpenMP threads. The 
behavior is the same for the other input sets: H2O-64.inp, H2O-128.inp, 
H2O-256.inp, and H2O-1024.inp.


 *******************************************************************************
 *   ___                                                                       *
 *  /                                                                          *
 *[ABORT]                                                                      *
 * ___/           Cholesky decomposition failed. Matrix ill conditioned ?      *
 *  |                                                                          *
 *O/|     /                                                                    *
 *| |     /                                                                    *
 *                 /home/amd/cp2k_aocc/cp2k-6.1/src/cp_dbcsr_cholesky.F:121    *
 *******************************************************************************


 ===== Routine Calling Stack =====

           12 cp_dbcsr_cholesky_decompose
           11 qs_ot_get_derivative
           10 ot_mini
            9 ot_scf_mini
            8 qs_scf_loop_do_ot
            7 qs_scf_new_mos
            6 scf_env_do_scf_inner_loop
            5 scf_env_do_scf
            4 qs_energies
            3 qs_forces
            2 qs_mol_dyn_low
            1 CP2K

The following table lists the combinations for which the executable works 
or fails:

MPI processes   OpenMP threads   Result
-------------   --------------   ------
1               1                works
1               2                works
1               4                works
1               8                works
1               16               works
1               32               works
1               64               works
2               1                works
2               2                works
2               4                fails
2               8                fails
2               16               fails
2               32               fails
2               64               fails
4               1                works
4               2                works
4               4                fails
8               1                works
8               2                works
8               3                works
8               4                fails
16              1                works
16              2                works
16              3                fails
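
For anyone who wants to reproduce the sweep above, it could be scripted 
roughly as follows. This is only a minimal sketch: the helper names 
(build_command, classify, run_sweep) are made up for illustration, the 
executable and input paths are the ones from the command line earlier in 
this message, and the check for "[ABORT]" in the output is an assumption 
based on the error banner shown above. The -x flag is the Open MPI form 
used in that command line; other launchers pass environment variables 
differently.

```python
import shlex
import subprocess

# Paths taken from the command line quoted in this message.
EXE = "../../../exe/Linux-x86-64-aocc/cp2k.psmp"
INP = "./H2O-32.inp"


def build_command(ranks, threads, exe=EXE, inp=INP):
    """Build the mpirun argument list for one rank/thread combination."""
    return ["mpirun", "-np", str(ranks),
            "-x", f"OMP_NUM_THREADS={threads}", exe, inp]


def classify(returncode, output):
    """Label a run as failed on a non-zero exit code or an [ABORT] banner.

    Assumption: the abort banner always contains the text "[ABORT]".
    """
    if returncode != 0 or "[ABORT]" in output:
        return "fails"
    return "works"


def run_sweep(rank_counts, thread_counts):
    """Run every combination and return {(ranks, threads): "works"/"fails"}."""
    results = {}
    for ranks in rank_counts:
        for threads in thread_counts:
            cmd = build_command(ranks, threads)
            proc = subprocess.run(cmd, capture_output=True, text=True)
            outcome = classify(proc.returncode, proc.stdout + proc.stderr)
            results[(ranks, threads)] = outcome
            print(f"{ranks:<6}{threads:<6}{outcome:<8}{shlex.join(cmd)}")
    return results
```

Calling run_sweep([1, 2, 4, 8, 16], [1, 2, 3, 4]) would launch one job per 
combination and print one line per run, which makes it easier to see where 
the failures start.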

 

What could be causing this behavior, and what would be the right way to 
debug it? I tried both Open MPI and MPICH, and both give similar results. 
The compiler is an in-house compiler.

Thanks for your help,
Shivarama Rao