[CP2K-user] [CP2K:20803] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types

bartosz mazur bamaz.97 at gmail.com
Tue Oct 22 15:39:35 UTC 2024


Great! Thank you for your help. 

Best
Bartosz

On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:

> I have a fix for it. In contrast to my first thought, it is a case of 
> invalid type conversion from real to complex numbers (yes, Fortran is 
> rather strict about it) in pw_derive. This may also be present in a few 
> other spots. I am currently running more tests and I will open a pull 
> request within the next few days.
> Best,
> Frederick
>
> On Tuesday, 22 October 2024 at 13:12:49 UTC+2, Frederick Stein wrote:
>
>> I can reproduce the error locally. I am investigating it now.
>>
>> On Tuesday, 22 October 2024 at 11:58:57 UTC+2, bartosz mazur wrote:
>>
>>> I was loading it as it was needed for compilation. I have unloaded the 
>>> module, but the error still occurs: 
>>>
>>> ```
>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>> CLX/DP      TRY    JIT    STA    COL
>>>    0..13      2      2      0      0 
>>>   14..23      0      0      0      0 
>>>   24..64      0      0      0      0 
>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>> Command (PID=15485): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>> Uptime: 1.757102 s
>>>
>>>
>>>
>>> ===================================================================================
>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>> =   RANK 0 PID 15485 RUNNING AT r30c01b01
>>>
>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>
>>> ===================================================================================
>>>
>>>
>>> ===================================================================================
>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>> =   RANK 1 PID 15486 RUNNING AT r30c01b01
>>>
>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>
>>> ===================================================================================
>>> ```
>>>
>>>
>>> and the last 100 lines:
>>>
>>> ```
>>>  000000:000002>>                            11     37 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                            11     37 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                         10     64 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                         10     25 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                         10     25 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                         10     17 pw_axpy  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                         10     17 pw_axpy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                         10     19 mp_sum_d  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                         10     19 mp_sum_d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                         10      3 pw_poisson_solve  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                            11      3 pw_poisson_rebuild  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                            11      3 pw_poisson_rebuild  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                            11     65 pw_pool_create_pw  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12     38 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12     38 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                            11     65 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                            11     26 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                            11     26 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                            11      3 pw_multiply_with  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                            11      3 pw_multiply_with  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                            11     27 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                            11     27 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                            11      3 pw_integral_ab  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12     20 mp_sum_d  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12     20 mp_sum_d  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                            11      3 pw_integral_ab  0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                            11      4 pw_poisson_set  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12     66 pw_pool_create_pw  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                                  13     39 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                                  13     39 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12     66 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12     28 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12     28 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12      7 pw_derive  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12      7 pw_derive  0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12     67 pw_pool_create_pw  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                                  13     40 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                                  13     40 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12     67 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12     29 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12     29 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12      8 pw_derive  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12      8 pw_derive  0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12     68 pw_pool_create_pw  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                                  13     41 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                                  13     41 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12     68 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12     30 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002<<                               12     30 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>  000000:000002>>                               12      9 pw_derive  start Hostmem: 697 MB GPUmem: 0 MB
>>>  ```
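A trace tail like the one above can be reduced to the routines that were entered (`>>`) but never left (`<<`) before the crash. The following is only a minimal sketch: it assumes each trace record has been joined onto a single line, and the regular expression and the name `open_routines` are illustrative, not part of CP2K.

```python
import re

# One unwrapped trace record, e.g.
# " 000000:000002>>   12      9 pw_derive  start Hostmem: 697 MB GPUmem: 0 MB"
TRACE_RE = re.compile(
    r"^\s*(\d+):(\d+)(>>|<<)\s+(\d+)\s+(\d+)\s+(\S+)\s+(start|[\d.]+)\b"
)


def open_routines(lines):
    """Return (depth, routine) pairs entered but not yet left, outermost first."""
    stack = []
    for line in lines:
        match = TRACE_RE.match(line)
        if match is None:
            continue  # skip lines that are not trace records
        direction = match.group(3)
        entry = (int(match.group(4)), match.group(6))
        if direction == ">>":
            stack.append(entry)
        elif stack and stack[-1] == entry:
            stack.pop()  # matching exit for the innermost open routine
    return stack


sample = """\
 000000:000002>>   12     30 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
 000000:000002<<   12     30 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
 000000:000002>>   12      9 pw_derive  start Hostmem: 697 MB GPUmem: 0 MB
""".splitlines()

print(open_routines(sample))  # [(12, 'pw_derive')]
```

Because only the tail of the file is inspected, routines entered before the excerpt starts do not appear in the result; the innermost entry is the one of interest here.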
>>>
>>> This is the list of currently loaded modules (all come with intel):
>>>
>>> ```
>>> Currently Loaded Modulefiles:
>>>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>>>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>>>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>>>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>>>  6) UCX/1.16.0-GCCcore-13.3.0
>>> ```
>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein wrote:
>>>
>>>> Dear Bartosz,
>>>> I am currently running some tests with the latest Intel compiler 
>>>> myself. What bothers me about your setup is the module GCC13/13.3.0. Why 
>>>> is it loaded? Can you unload it? This would at least reduce potential 
>>>> interference between the Intel and the GCC compilers.
>>>> Best,
>>>> Frederick
>>>>
>>>> On Monday, 21 October 2024 at 16:33:45 UTC+2, bartosz mazur wrote:
>>>>
>>>>> The error for ssmp is:
>>>>>
>>>>> ```
>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>    0..13      4      4      0      0 
>>>>>   14..23      0      0      0      0 
>>>>>   24..64      0      0      0      0 
>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>> Command (PID=54845): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>> Uptime: 2.861583 s
>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36: 54845 Segmentation fault      (core dumped) /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>> ```
>>>>>
>>>>> and the last 100 lines of output:
>>>>>
>>>>> ```
>>>>>  000000:000001>>                               12     20 mp_sum_d  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12     20 mp_sum_d  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                            11     13 dbcsr_dot_sd  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                         10     12 calculate_ptrace_kp  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                       9      6 evaluate_core_matrix_traces  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                       9      6 rebuild_ks_matrix  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                         10      6 qs_ks_build_kohn_sham_matrix  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                            11    140 pw_pool_create_pw  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12     79 pw_create_c1d  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12     79 pw_create_c1d  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                            11    140 pw_pool_create_pw  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                            11    141 pw_pool_create_pw  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12     80 pw_create_c1d  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12     80 pw_create_c1d  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                            11    141 pw_pool_create_pw  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                            11     61 pw_copy  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                            11     61 pw_copy  0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                            11     35 pw_axpy  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                            11     35 pw_axpy  0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                            11      6 pw_poisson_solve  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12      6 pw_poisson_rebuild  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12      6 pw_poisson_rebuild  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12    142 pw_pool_create_pw  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                  13     81 pw_create_c1d  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                                  13     81 pw_create_c1d  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12    142 pw_pool_create_pw  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12     62 pw_copy  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12     62 pw_copy  0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12      6 pw_multiply_with  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12      6 pw_multiply_with  0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12     63 pw_copy  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12     63 pw_copy  0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12      6 pw_integral_ab  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                               12      6 pw_integral_ab  0.005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                               12      7 pw_poisson_set  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                  13    143 pw_pool_create_pw  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                     14     82 pw_create_c1d  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                                     14     82 pw_create_c1d  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                                  13    143 pw_pool_create_pw  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                  13     64 pw_copy  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                                  13     64 pw_copy  0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                  13     16 pw_derive  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                                  13     16 pw_derive  0.006 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                  13    144 pw_pool_create_pw  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                     14     83 pw_create_c1d  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                                     14     83 pw_create_c1d  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                                  13    144 pw_pool_create_pw  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                  13     65 pw_copy  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001<<                                  13     65 pw_copy  0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>  000000:000001>>                                  13     17 pw_derive  start Hostmem: 380 MB GPUmem: 0 MB
>>>>> ```
>>>>>
>>>>> For psmp, the last 100 lines are:
>>>>>
>>>>> ```
>>>>>  000000:000002<<                       9      7 evaluate_core_matrix_traces  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                       9      7 rebuild_ks_matrix  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                         10      7 qs_ks_build_kohn_sham_matrix  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11    164 pw_pool_create_pw  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12     93 pw_create_c1d  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12     93 pw_create_c1d  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11    164 pw_pool_create_pw  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11    165 pw_pool_create_pw  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12     94 pw_create_c1d  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12     94 pw_create_c1d  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11    165 pw_pool_create_pw  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11     73 pw_copy  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11     73 pw_copy  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11     41 pw_axpy  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11     41 pw_axpy  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11     52 mp_sum_d  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11     52 mp_sum_d  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11      7 pw_poisson_solve  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12      7 pw_poisson_rebuild  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12      7 pw_poisson_rebuild  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12    166 pw_pool_create_pw  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13     95 pw_create_c1d  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13     95 pw_create_c1d  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12    166 pw_pool_create_pw  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12     74 pw_copy  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12     74 pw_copy  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12      7 pw_multiply_with  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12      7 pw_multiply_with  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12     75 pw_copy  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12     75 pw_copy  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12      7 pw_integral_ab  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13     53 mp_sum_d  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13     53 mp_sum_d  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12      7 pw_integral_ab  0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12      8 pw_poisson_set  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13    167 pw_pool_create_pw  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                     14     96 pw_create_c1d  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                     14     96 pw_create_c1d  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13    167 pw_pool_create_pw  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13     76 pw_copy  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13     76 pw_copy  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13     19 pw_derive  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13     19 pw_derive  0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                     14     97 pw_create_c1d  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                     14     97 pw_create_c1d  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13     77 pw_copy  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13     77 pw_copy  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13     20 pw_derive  start Hostmem: 693 MB GPUmem: 0 MB
>>>>> ```
>>>>>
>>>>> Thanks
>>>>> Bartosz
>>>>>
>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick Stein wrote:
>>>>>
>>>>>> Dear Bartosz,
>>>>>> I have no idea about the issue with LibXSMM.
>>>>>> Regarding the trace, I do not know either, as there is not much that 
>>>>>> could break in pw_derive (it just performs multiplications) and the 
>>>>>> sequence of operations is too unspecific. It may be that the code actually 
>>>>>> breaks somewhere else. Can you do the same with the ssmp version and post the last 
>>>>>> 100 lines? This way, we remove the asynchronicity issues for backtraces 
>>>>>> with the psmp version.
>>>>>> Best,
>>>>>> Frederick
>>>>>>
>>>>>> On Sunday, 20 October 2024 at 16:47:15 UTC+2, bartosz mazur wrote:
>>>>>>
>>>>>>> The error is:
>>>>>>>
>>>>>>> ```
>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>    0..13      2      2      0      0
>>>>>>>   14..23      0      0      0      0
>>>>>>>   24..64      0      0      0      0
>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>> Command (PID=2607388): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>>> Uptime: 5.288243 s
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>> =   RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>
>>>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>>
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>> =   RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> ```
>>>>>>>
>>>>>>> and the last 20 lines:
>>>>>>>
>>>>>>> ```
>>>>>>>  000000:000002<<                                  13     76 pw_copy  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                                  13     19 pw_derive  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                                  13     19 pw_derive  0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                                     14     97 pw_create_c1d  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                                     14     97 pw_create_c1d  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw  0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                                  13     77 pw_copy  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                                  13     77 pw_copy  0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                                  13     20 pw_derive  start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>> ```
>>>>>>>
>>>>>>> Thanks!
>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick Stein wrote:
>>>>>>>
>>>>>>>> Please pick one of the failing tests. Then, add the TRACE keyword 
>>>>>>>> to the &GLOBAL section and then run the test manually. This increases the 
>>>>>>>> size of the output file dramatically (to several million lines). Can you send 
>>>>>>>> me the last ~20 lines of the output?
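The two steps requested above (enable TRACE in &GLOBAL, then keep only the end of the output) could be scripted roughly as follows. This is a sketch under assumptions: the helper names `add_trace_keyword` and `last_lines` are hypothetical, and the example input is invented for illustration rather than taken from the failing test.

```python
def add_trace_keyword(input_text):
    """Return the input with a TRACE line placed right after the &GLOBAL header."""
    lines = input_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip().upper() == "&GLOBAL":
            indent = line[: len(line) - len(line.lstrip())]
            lines.insert(index + 1, indent + "  TRACE")
            return "\n".join(lines) + "\n"
    raise ValueError("input has no &GLOBAL section")


def last_lines(text, count=20):
    """Return the final `count` lines of a (possibly very long) output."""
    return text.splitlines()[-count:]


# Hypothetical minimal input, for illustration only.
example = "&GLOBAL\n  PROJECT demo\n  RUN_TYPE ENERGY\n&END GLOBAL\n"
print(add_trace_keyword(example))
```

The sketch does not check whether TRACE is already present, so it should be applied to a fresh copy of the input file.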
>>>>>>>> On Friday, 18 October 2024 at 17:09:40 UTC+2, bartosz mazur wrote:
>>>>>>>>
>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I assume 
>>>>>>>>> it makes no difference. As I mentioned in my previous message, with 
>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp 
>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the same 
>>>>>>>>> setting; I attach an example output.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Bartosz
>>>>>>>>>
>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick Stein wrote:
>>>>>>>>>
>>>>>>>>>> Dear Bartosz,
>>>>>>>>>> What happens if you set the number of OpenMP threads to 1 (add 
>>>>>>>>>> '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of the 
>>>>>>>>>> ssmp?
>>>>>>>>>> Best,
>>>>>>>>>> Frederick
>>>>>>>>>>
>>>>>>>>>> On Friday, 18 October 2024 at 15:37:43 UTC+2, bartosz mazur wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>
>>>>>>>>>>> thanks again for your help. I have tested different simulation 
>>>>>>>>>>> variants, and the problem occurs only when OpenMP (OMP) is used. For MPI 
>>>>>>>>>>> calculations without OMP, all tests pass. I have also tested the 
>>>>>>>>>>> `OMP_PROC_BIND` and `OMP_PLACES` settings: apart from 
>>>>>>>>>>> their effect on run time, they have no significant effect on the 
>>>>>>>>>>> presence of errors. Below are the results for ssmp:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time 
>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>> ```
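To compare the ssmp rows above at a glance, the table can be turned into a failure fraction per binding. This is a minimal sketch: the numbers are copied from the table, while the function name `failure_rates` is only illustrative.

```python
import csv
import io

# ssmp results copied from the table above.
SSMP_RESULTS = """\
OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
spread, threads, 3850, 4144, 4, 290, 186min
spread, cores, 3831, 4144, 3, 310, 183min
spread, sockets, 3864, 4144, 3, 277, 104min
close, threads, 3879, 4144, 3, 262, 171min
close, cores, 3854, 4144, 0, 290, 168min
close, sockets, 3865, 4144, 3, 276, 104min
master, threads, 4121, 4144, 0, 23, 1002min
master, cores, 4121, 4144, 0, 23, 986min
master, sockets, 3942, 4144, 3, 199, 219min
false, threads, 3918, 4144, 0, 226, 178min
false, cores, 3919, 4144, 3, 222, 176min
false, sockets, 3856, 4144, 4, 284, 104min
"""


def failure_rates(table_text):
    """Map (OMP_PROC_BIND, OMP_PLACES) to the fraction of failed tests."""
    reader = csv.DictReader(io.StringIO(table_text), skipinitialspace=True)
    rates = {}
    for row in reader:
        key = (row["OMP_PROC_BIND"], row["OMP_PLACES"])
        rates[key] = int(row["failed"]) / int(row["total"])
    return rates


rates = failure_rates(SSMP_RESULTS)
for key, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{key[0]:>7} {key[1]:<8} {rate:.1%}")
```

Sorting by rate puts the two `master` bindings with 23 failures first and `spread, cores` with 310 failures last.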
>>>>>>>>>>>
>>>>>>>>>>> and psmp:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>> ```
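The sweep over `OMP_PROC_BIND` and `OMP_PLACES` reported above could be scripted roughly like this. It is a sketch under assumptions: the value lists come from the tables, but `binding_environments` and `run_sweep` are hypothetical helpers, and the command to run (for example a do_regtests.py call) has to be supplied by the user.

```python
import itertools
import os
import subprocess

# Values taken from the result tables above.
PROC_BIND_VALUES = ("spread", "close", "master", "false")
PLACES_VALUES = ("threads", "cores", "sockets")


def binding_environments(base_env=None):
    """Yield one environment dict per OMP_PROC_BIND / OMP_PLACES combination."""
    base = dict(os.environ if base_env is None else base_env)
    for bind, places in itertools.product(PROC_BIND_VALUES, PLACES_VALUES):
        env = dict(base)
        env["OMP_PROC_BIND"] = bind
        env["OMP_PLACES"] = places
        yield env


def run_sweep(command, run=subprocess.run):
    """Run `command` (a list of arguments, supplied by the caller) once per binding."""
    return [run(command, env=env) for env in binding_environments()]


for env in binding_environments({}):
    print(env["OMP_PROC_BIND"], env["OMP_PLACES"])
```

Keeping the environment construction separate from the launch makes it easy to list the twelve combinations before committing to runs that take hours each.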
>>>>>>>>>>>
>>>>>>>>>>> Any ideas what I could do next to have more information about 
>>>>>>>>>>> the source of the problem or maybe you see a potential solution at this 
>>>>>>>>>>> stage? I would appreciate any further help. 
>>>>>>>>>>>
>>>>>>>>>>> Best
>>>>>>>>>>> Bartosz
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The tests do 
>>>>>>>>>>>> not run that efficiently with such a large number of threads; 2 should be 
>>>>>>>>>>>> sufficient.
>>>>>>>>>>>> The test result suggests that most of the functionality may 
>>>>>>>>>>>> work, but due to a missing backtrace (or similar information), it is hard to 
>>>>>>>>>>>> tell why the tests fail. You could also try to run some of the single-node tests 
>>>>>>>>>>>> to assess the stability of CP2K.
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Frederick
>>>>>>>>>>>>
>>>>>>>>>>>> On Friday, 11 October 2024 at 13:48:42 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>
>>>>>>>>>>>>>

-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/cp2k/f4d5e2d2-12d0-49b0-b7cc-12de19399b7dn%40googlegroups.com.