[CP2K-user] [CP2K:20806] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types

Frederick Stein f.stein at hzdr.de
Wed Oct 23 07:15:33 UTC 2024


Dear Bartosz,
My fix is merged. Can you switch to the CP2K master and try it again? We 
are still working on a few issues with the Intel compilers so that we can 
eventually migrate from ifort to ifx.
Best,
Frederick

bartosz mazur wrote on Tuesday, 22 October 2024 at 17:45:21 UTC+2:

> Great! Thank you for your help. 
>
> Best
> Bartosz
>
> On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:
>
>> I have a fix for it. Contrary to my first thought, it is a case of 
>> invalid type conversion from real to complex numbers (yes, Fortran is 
>> rather strict about it) in pw_derive. The same problem may also be present 
>> in a few other spots. I am currently running more tests and will open a 
>> pull request within the next few days.
>> Best,
>> Frederick
>>
>> Frederick Stein wrote on Tuesday, 22 October 2024 at 13:12:49 UTC+2:
>>
>>> I can reproduce the error locally. I am investigating it now.
>>>
>>> bartosz mazur wrote on Tuesday, 22 October 2024 at 11:58:57 UTC+2:
>>>
>>>> I was loading it as it was needed for compilation. I have unloaded the 
>>>> module, but the error still occurs: 
>>>>
>>>> ```
>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>> CLX/DP      TRY    JIT    STA    COL
>>>>    0..13      2      2      0      0
>>>>   14..23      0      0      0      0
>>>>   24..64      0      0      0      0
>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>> Command (PID=15485): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>> Uptime: 1.757102 s
>>>>
>>>> ===================================================================================
>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>> =   RANK 0 PID 15485 RUNNING AT r30c01b01
>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>> ===================================================================================
>>>>
>>>> ===================================================================================
>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>> =   RANK 1 PID 15486 RUNNING AT r30c01b01
>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>> ===================================================================================
>>>> ```
>>>>
>>>>
>>>> and the last 100 lines:
>>>>
>>>> ```
>>>>  000000:000002>>                            11     37 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                            11     37 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                         10     64 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                         10     25 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                         10     25 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                         10     17 pw_axpy start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                         10     17 pw_axpy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                         10     19 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                         10     19 mp_sum_d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                         10      3 pw_poisson_solve start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                            11      3 pw_poisson_rebuild start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                            11      3 pw_poisson_rebuild 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                            11     65 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12     38 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12     38 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                            11     65 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                            11     26 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                            11     26 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                            11      3 pw_multiply_with start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                            11      3 pw_multiply_with 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                            11     27 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                            11     27 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                            11      3 pw_integral_ab start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12     20 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12     20 mp_sum_d 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                            11      3 pw_integral_ab 0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                            11      4 pw_poisson_set start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12     66 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                                  13     39 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                                  13     39 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12     66 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12     28 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12     28 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12      7 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12      7 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12     67 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                                  13     40 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                                  13     40 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12     67 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12     29 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12     29 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12      8 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12      8 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12     68 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                                  13     41 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                                  13     41 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12     68 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12     30 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002<<                               12     30 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>  000000:000002>>                               12      9 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>  ```
>>>>
>>>> This is the list of currently loaded modules (all come with intel):
>>>>
>>>> ```
>>>> Currently Loaded Modulefiles:
>>>>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>>>>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>>>>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>>>>  6) UCX/1.16.0-GCCcore-13.3.0
>>>> ```
>>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein wrote:
>>>>
>>>>> Dear Bartosz,
>>>>> I am currently running some tests with the latest Intel compiler 
>>>>> myself. What bothers me about your setup is the module GCC13/13.3.0. Why 
>>>>> is it loaded? Can you unload it? This would at least reduce potential 
>>>>> interference between the Intel and the GCC compilers.
>>>>> Best,
>>>>> Frederick
>>>>>
>>>>> bartosz mazur wrote on Monday, 21 October 2024 at 16:33:45 UTC+2:
>>>>>
>>>>>> The error for ssmp is:
>>>>>>
>>>>>> ```
>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>    0..13      4      4      0      0 
>>>>>>   14..23      0      0      0      0 
>>>>>>   24..64      0      0      0      0 
>>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>>> Command (PID=54845): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>>> Uptime: 2.861583 s
>>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36: 54845 Segmentation fault      (core dumped) /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>>> ```
>>>>>>
>>>>>> and the last 100 lines of output:
>>>>>>
>>>>>> ```
>>>>>>  000000:000001>>                               12     20 mp_sum_d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12     20 mp_sum_d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                            11     13 dbcsr_dot_sd 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                         10     12 calculate_ptrace_kp 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                       9      6 evaluate_core_matrix_traces 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                       9      6 rebuild_ks_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                         10      6 qs_ks_build_kohn_sham_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                            11    140 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12     79 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12     79 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                            11    140 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                            11    141 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12     80 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12     80 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                            11    141 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                            11     61 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                            11     61 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                            11     35 pw_axpy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                            11     35 pw_axpy 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                            11      6 pw_poisson_solve start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12      6 pw_poisson_rebuild start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12      6 pw_poisson_rebuild 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12    142 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                  13     81 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                                  13     81 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12    142 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12     62 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12     62 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12      6 pw_multiply_with start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12      6 pw_multiply_with 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12     63 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12     63 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12      6 pw_integral_ab start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                               12      6 pw_integral_ab 0.005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                               12      7 pw_poisson_set start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                  13    143 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                     14     82 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                                     14     82 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                                  13    143 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                  13     64 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                                  13     64 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                  13     16 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                                  13     16 pw_derive 0.006 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                  13    144 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                     14     83 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                                     14     83 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                                  13    144 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                  13     65 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001<<                                  13     65 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>  000000:000001>>                                  13     17 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>>>> ```
>>>>>>
>>>>>> For psmp, the last 100 lines are:
>>>>>>
>>>>>> ```
>>>>>>  000000:000002<<                       9      7 evaluate_core_matrix_traces 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                       9      7 rebuild_ks_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                         10      7 qs_ks_build_kohn_sham_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11    164 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     93 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     93 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11    164 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11    165 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     94 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     94 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11    165 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11     73 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11     73 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11     41 pw_axpy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11     41 pw_axpy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11     52 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11     52 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11      7 pw_poisson_solve start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12      7 pw_poisson_rebuild start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12      7 pw_poisson_rebuild 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12    166 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     95 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13     95 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12    166 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     74 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     74 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12      7 pw_multiply_with start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12      7 pw_multiply_with 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     75 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     75 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12      7 pw_integral_ab start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     53 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13     53 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12      7 pw_integral_ab 0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12      8 pw_poisson_set start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13    167 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                     14     96 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                     14     96 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13    167 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     76 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13     76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13     19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                     14     97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                     14     97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13     77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>> ```
>>>>>>
>>>>>> Thanks
>>>>>> Bartosz
>>>>>>
>>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick Stein wrote:
>>>>>>
>>>>>>> Dear Bartosz,
>>>>>>> I have no idea about the issue with LibXSMM.
>>>>>>> Regarding the trace, I do not know either, as there is not much that 
>>>>>>> could break in pw_derive (it just performs multiplications) and the 
>>>>>>> sequence of operations is too unspecific. It may be that the code actually 
>>>>>>> breaks somewhere else. Can you do the same with the ssmp and post the last 
>>>>>>> 100 lines? This way, we remove the asynchronicity issues for backtraces 
>>>>>>> with the psmp version.
>>>>>>> Best,
>>>>>>> Frederick
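
The trace tails quoted in this thread mark routine entry with `>>` and exit with `<<`, so the routines that were entered but never left can be listed mechanically instead of by eye. A minimal sketch, assuming the output contains a single process:thread id and that mail line wrapping never splits a routine name; the function name `open_routines` is hypothetical and not part of CP2K:

```python
import re

# One trace record: "pppppp:tttttt", then ">>" (entry) or "<<" (exit),
# then the call depth, the call count and the routine name.
RECORD = re.compile(r"\d{6}:\d{6}(>>|<<)\s+(\d+)\s+\d+\s+(\S+)")


def open_routines(trace_text):
    """Return (depth, routine) pairs that were entered but never exited."""
    # Undo mail wrapping: drop quote markers, then join everything on spaces.
    flat = " ".join(line.lstrip("> ").strip() for line in trace_text.splitlines())
    stack = []
    for marker, depth, name in RECORD.findall(flat):
        call = (int(depth), name)
        if marker == ">>":
            stack.append(call)
        elif stack and stack[-1] == call:
            # Exits whose entry lies before the quoted tail are ignored.
            stack.pop()
    return stack
```

The last element of the returned list is the innermost routine that was still running when the output stopped. Applied to the tails posted in this thread, that innermost entry should be pw_derive, with pw_poisson_set and pw_poisson_solve above it.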
>>>>>>>
>>>>>>> bartosz mazur wrote on Sunday, 20 October 2024 at 16:47:15 UTC+2:
>>>>>>>
>>>>>>>> The error is:
>>>>>>>>
>>>>>>>> ```
>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>    0..13      2      2      0      0
>>>>>>>>   14..23      0      0      0      0
>>>>>>>>   24..64      0      0      0      0
>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>> Command (PID=2607388): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>>>> Uptime: 5.288243 s
>>>>>>>>
>>>>>>>> ===================================================================================
>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>> =   RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>> ===================================================================================
>>>>>>>>
>>>>>>>> ===================================================================================
>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>> =   RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>>>> ===================================================================================
>>>>>>>> ```
>>>>>>>>
>>>>>>>> and the last 20 lines:
>>>>>>>>
>>>>>>>> ```
>>>>>>>>  000000:000002<<                                  13     76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13     19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                     14     97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                     14     97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13     77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> ```
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick Stein wrote:
>>>>>>>>
>>>>>>>>> Please pick one of the failing tests. Then add the TRACE keyword 
>>>>>>>>> to the &GLOBAL section and run the test manually. This increases the 
>>>>>>>>> size of the output file dramatically (to a few million lines). Can you send 
>>>>>>>>> me the last ~20 lines of the output?
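
The two steps described here (switch on tracing, then keep only the end of a very large output file) can be sketched in Python. This is only an illustration: both helper names are hypothetical, and the sketch assumes that `&GLOBAL` stands alone on its own line in the input file.

```python
from collections import deque


def add_trace_keyword(input_text):
    """Return the input text with TRACE added right after the &GLOBAL line."""
    lines = input_text.splitlines()
    # Leave the input untouched if a TRACE keyword is already present.
    if any(line.upper().split()[:1] == ["TRACE"] for line in lines):
        return input_text
    patched = []
    for line in lines:
        patched.append(line)
        if line.strip().upper() == "&GLOBAL":
            patched.append("  TRACE")
    return "\n".join(patched) + "\n"


def last_lines(path, count=20):
    """Return the final `count` lines of a file without holding all of it."""
    with open(path, errors="replace") as handle:
        # A bounded deque keeps only the newest `count` lines while reading.
        return list(deque(handle, maxlen=count))
```

Reading through a bounded `deque` keeps memory use flat even when the trace output grows to millions of lines, which is the reason for not calling `readlines()` here.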
>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at 17:09:40 UTC+2:
>>>>>>>>>
>>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I 
>>>>>>>>>> assume it makes no difference. As I mentioned in my previous message, with 
>>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp 
>>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the 
>>>>>>>>>> same setting; I attach example output.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Bartosz
>>>>>>>>>>
>>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick Stein wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>> What happens if you set the number of OpenMP threads to 1 (add 
>>>>>>>>>>> '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of the 
>>>>>>>>>>> ssmp?
>>>>>>>>>>> Best,
>>>>>>>>>>> Frederick
>>>>>>>>>>>
>>>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at 15:37:43 UTC+2:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>>
>>>>>>>>>>>> thanks again for your help. I have tested different simulation 
>>>>>>>>>>>> variants and found that the problem occurs when using OMP. For MPI 
>>>>>>>>>>>> calculations without OMP all tests pass. I have also tested the effect of 
>>>>>>>>>>>> the `OMP_PROC_BIND` and `OMP_PLACES` parameters; apart from 
>>>>>>>>>>>> the effect on simulation time, they have no significant effect on the 
>>>>>>>>>>>> presence of errors. Below are the results for ssmp:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time 
>>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>>> ```
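
To compare the ssmp rows more easily, the table above can be turned into failure rates. A small sketch using only the numbers quoted above; the function name `failure_rates` is hypothetical, and counting both "wrong" and "failed" tests as failures is an assumption of this sketch:

```python
import csv
import io

# The ssmp table from this message, copied verbatim.
SSMP_RESULTS = """\
OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
spread, threads, 3850, 4144, 4, 290, 186min
spread, cores, 3831, 4144, 3, 310, 183min
spread, sockets, 3864, 4144, 3, 277, 104min
close, threads, 3879, 4144, 3, 262, 171min
close, cores, 3854, 4144, 0, 290, 168min
close, sockets, 3865, 4144, 3, 276, 104min
master, threads, 4121, 4144, 0, 23, 1002min
master, cores, 4121, 4144, 0, 23, 986min
master, sockets, 3942, 4144, 3, 199, 219min
false, threads, 3918, 4144, 0, 226, 178min
false, cores, 3919, 4144, 3, 222, 176min
false, sockets, 3856, 4144, 4, 284, 104min
"""


def failure_rates(table_text):
    """Return (bind, places, rate) tuples sorted by rate of non-passing tests."""
    reader = csv.DictReader(io.StringIO(table_text), skipinitialspace=True)
    rows = []
    for row in reader:
        bad = int(row["wrong"]) + int(row["failed"])
        rows.append((row["OMP_PROC_BIND"], row["OMP_PLACES"], bad / int(row["total"])))
    return sorted(rows, key=lambda item: item[2])


for bind, places, rate in failure_rates(SSMP_RESULTS):
    print(f"{bind:>7} {places:<8} {100 * rate:5.2f} %")
```

The two `master` rows with `threads` and `cores` come out lowest, at 23 of 4144 tests (about 0.6 %), but they are also the slowest runs in the table at roughly 1000 minutes.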
>>>>>>>>>>>>
>>>>>>>>>>>> and psmp:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>>> ```
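
A sweep over the twelve `OMP_PROC_BIND` / `OMP_PLACES` combinations listed in these tables could be scripted roughly as follows. This is a sketch under assumptions, not the procedure actually used in this thread: the helper names are hypothetical, and the command to run (for example a regtest invocation) is left to the caller.

```python
import itertools
import os
import subprocess

PROC_BIND = ("spread", "close", "master", "false")
PLACES = ("threads", "cores", "sockets")


def binding_environments(base_env=None):
    """Yield one environment dict per OMP_PROC_BIND / OMP_PLACES pair."""
    base = dict(os.environ if base_env is None else base_env)
    for bind, places in itertools.product(PROC_BIND, PLACES):
        env = dict(base)
        env["OMP_PROC_BIND"] = bind
        env["OMP_PLACES"] = places
        yield env


def run_sweep(command):
    """Run `command` once per binding and collect the exit codes."""
    results = {}
    for env in binding_environments():
        finished = subprocess.run(command, env=env)
        key = (env["OMP_PROC_BIND"], env["OMP_PLACES"])
        results[key] = finished.returncode
    return results
```

`run_sweep` takes the command as an argument list, for instance the interpreter followed by the path to the test script and its options, so the sketch itself does not hard-code any CP2K paths or flags.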
>>>>>>>>>>>>
>>>>>>>>>>>> Any ideas what I could do next to have more information about 
>>>>>>>>>>>> the source of the problem or maybe you see a potential solution at this 
>>>>>>>>>>>> stage? I would appreciate any further help. 
>>>>>>>>>>>>
>>>>>>>>>>>> Best
>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The tests do 
>>>>>>>>>>>>> not run that efficiently with such a large number of threads. 2 should be 
>>>>>>>>>>>>> sufficient.
>>>>>>>>>>>>> The test result suggests that most of the functionality may 
>>>>>>>>>>>>> work but due to a missing backtrace (or similar information), it is hard to 
>>>>>>>>>>>>> tell why they fail. You could also try to run some of the single-node tests 
>>>>>>>>>>>>> to assess the stability of CP2K.
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>
>>>>>>>>>>>>> bartosz mazur wrote on Friday, 11 October 2024 at 13:48:42 UTC+2:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
