[CP2K-user] [CP2K:20803] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types
Frederick Stein
f.stein at hzdr.de
Tue Oct 22 13:24:03 UTC 2024
I have a fix for it. In contrast to my first thought, it is a case of
invalid type conversion from real to complex numbers (yes, Fortran is
rather strict about it) in pw_derive. This may also be present in a few
other spots. I am currently running more tests and I will open a pull
request within the next few days.
Best,
Frederick
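
The TRACE excerpts quoted below are hard to scan because each entry was wrapped over several lines in the mail. As a rough aid, the following Python sketch unwraps such an excerpt and lists the routines that were entered (`>>`) but never left (`<<`). The function name `unfinished_routines` and the regular expression are assumptions based on the line layout seen in this thread; they are not part of CP2K.

```python
import re

# One TRACE entry once the wrapping is undone, e.g.
# "000000:000002>> 12 9 pw_derive start Hostmem: 697 MB GPUmem: 0 MB"
# (layout assumed from the excerpts in this thread)
ENTRY = re.compile(r"(\d+:\d+)(>>|<<)\s+(\d+)\s+(\d+)\s+(\w+)")


def unfinished_routines(excerpt):
    """List (depth, call_number, name) for routines entered but not exited."""
    # Drop mail quote markers at the start of each line, then join the
    # pieces so that entries wrapped over several lines become one stream.
    pieces = [re.sub(r"^>+ ?", "", line) for line in excerpt.splitlines()]
    stream = " ".join(pieces)

    open_calls = {}  # insertion-ordered: (rank, name, call_number) -> depth
    for rank, direction, depth, number, name in ENTRY.findall(stream):
        key = (rank, name, int(number))
        if direction == ">>":
            open_calls[key] = int(depth)
        else:
            # Exit line; the matching entry may lie before the excerpt.
            open_calls.pop(key, None)
    return [(depth, key[2], key[1]) for key, depth in open_calls.items()]
```

The last element of the returned list is the innermost routine that was still running when the excerpt ends, which is a reasonable first place to look after a segmentation fault.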
Frederick Stein wrote on Tuesday, 22 October 2024 at 13:12:49 UTC+2:
> I can reproduce the error locally. I am investigating it now.
>
> bartosz mazur wrote on Tuesday, 22 October 2024 at 11:58:57 UTC+2:
>
>> I was loading it as it was needed for compilation. I have unloaded the
>> module, but the error still occurs:
>>
>> ```
>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>> CLX/DP TRY JIT STA COL
>> 0..13 2 2 0 0
>> 14..23 0 0 0 0
>> 24..64 0 0 0 0
>> Registry and code: 13 MB + 16 KB (gemm=2)
>> Command (PID=15485):
>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>> H2O-9.inp -o H2O-9.out
>> Uptime: 1.757102 s
>>
>>
>>
>> ===================================================================================
>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> = RANK 0 PID 15485 RUNNING AT r30c01b01
>>
>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>
>> ===================================================================================
>>
>>
>> ===================================================================================
>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> = RANK 1 PID 15486 RUNNING AT r30c01b01
>>
>> = KILLED BY SIGNAL: 9 (Killed)
>>
>> ===================================================================================
>> ```
>>
>>
>> and the last 100 lines:
>>
>> ```
>> 000000:000002>> 11 37 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 11 37 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 10 64 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 10 25 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 10 25 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 10 17 pw_axpy start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 10 17 pw_axpy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 10 19 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 10 19 mp_sum_d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 10 3 pw_poisson_solve start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 11 3 pw_poisson_rebuild start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 11 3 pw_poisson_rebuild 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 11 65 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 38 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 38 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 11 65 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 11 26 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 11 26 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 11 3 pw_multiply_with start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 11 3 pw_multiply_with 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 11 27 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 11 27 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 11 3 pw_integral_ab start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 20 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 20 mp_sum_d 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 11 3 pw_integral_ab 0.004 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 11 4 pw_poisson_set start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 66 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 13 39 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 13 39 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 66 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 28 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 28 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 7 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 7 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 67 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 13 40 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 13 40 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 67 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 29 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 29 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 8 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 8 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 68 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 13 41 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 13 41 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 68 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 30 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002<< 12 30 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>> 000000:000002>> 12 9 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>> ```
>>
>> This is the list of currently loaded modules (all come with intel):
>>
>> ```
>> Currently Loaded Modulefiles:
>>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>>  6) UCX/1.16.0-GCCcore-13.3.0
>> ```
>> Frederick Stein wrote on Tuesday, 22 October 2024 at 11:12:57 UTC+2:
>>
>>> Dear Bartosz,
>>> I am currently running some tests with the latest Intel compiler myself.
>>> What bothers me about your setup is the module GCC13/13.3.0 . Why is it
>>> loaded? Can you unload it? This would at least reduce potential
>>> interference between the Intel and the GCC compilers.
>>> Best,
>>> Frederick
>>>
>>> bartosz mazur wrote on Monday, 21 October 2024 at 16:33:45 UTC+2:
>>>
>>>> The error for ssmp is:
>>>>
>>>> ```
>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>> CLX/DP TRY JIT STA COL
>>>> 0..13 4 4 0 0
>>>> 14..23 0 0 0 0
>>>> 24..64 0 0 0 0
>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>> Command (PID=54845):
>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>> H2O-9.inp -o H2O-9.out
>>>> Uptime: 2.861583 s
>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36: 54845
>>>> Segmentation fault (core dumped)
>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>> H2O-9.inp -o H2O-9.out
>>>> ```
>>>>
>>>> and the last 100 lines of output:
>>>>
>>>> ```
>>>> 000000:000001>> 12 20 mp_sum_d start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 20 mp_sum_d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 11 13 dbcsr_dot_sd 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 10 12 calculate_ptrace_kp 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 9 6 evaluate_core_matrix_traces 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 9 6 rebuild_ks_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 10 6 qs_ks_build_kohn_sham_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 11 140 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 79 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 79 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 11 140 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 11 141 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 80 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 80 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 11 141 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 11 61 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 11 61 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 11 35 pw_axpy start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 11 35 pw_axpy 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 11 6 pw_poisson_solve start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 6 pw_poisson_rebuild start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 6 pw_poisson_rebuild 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 142 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 13 81 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 13 81 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 142 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 62 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 62 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 6 pw_multiply_with start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 6 pw_multiply_with 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 63 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 63 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 6 pw_integral_ab start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 12 6 pw_integral_ab 0.005 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 12 7 pw_poisson_set start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 13 143 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 14 82 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 14 82 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 13 143 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 13 64 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 13 64 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 13 16 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 13 16 pw_derive 0.006 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 13 144 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 14 83 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 14 83 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 13 144 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 13 65 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001<< 13 65 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>> 000000:000001>> 13 17 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>> ```
>>>>
>>>> For psmp, the last 100 lines are:
>>>>
>>>> ```
>>>> 000000:000002<< 9 7 evaluate_core_matrix_traces 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 9 7 rebuild_ks_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 10 7 qs_ks_build_kohn_sham_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 11 164 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 93 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 12 93 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 11 164 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 11 165 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 94 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 12 94 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 11 165 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 11 73 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 11 73 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 11 41 pw_axpy start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 11 41 pw_axpy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 11 52 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 11 52 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 11 7 pw_poisson_solve start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 7 pw_poisson_rebuild start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 12 7 pw_poisson_rebuild 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 166 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 13 95 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 13 95 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 12 166 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 74 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 12 74 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 7 pw_multiply_with start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 12 7 pw_multiply_with 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 75 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 12 75 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 7 pw_integral_ab start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 13 53 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 13 53 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 12 7 pw_integral_ab 0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 12 8 pw_poisson_set start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 13 167 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 14 96 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 14 96 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 13 167 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 13 76 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>> ```
>>>>
>>>> Thanks
>>>> Bartosz
>>>>
>>>> Frederick Stein wrote on Monday, 21 October 2024 at 08:58:34 UTC+2:
>>>>
>>>>> Dear Bartosz,
>>>>> I have no idea about the issue with LibXSMM.
>>>>> Regarding the trace, I do not know either, as there is not much that
>>>>> could break in pw_derive (it just performs multiplications) and the
>>>>> sequence of operations is too unspecific. It may be that the code actually
>>>>> breaks somewhere else. Can you do the same with the ssmp and post the last
>>>>> 100 lines? This way, we remove the asynchronicity issues for backtraces
>>>>> with the psmp version.
>>>>> Best,
>>>>> Frederick
>>>>>
>>>>> bartosz mazur wrote on Sunday, 20 October 2024 at 16:47:15 UTC+2:
>>>>>
>>>>>> The error is:
>>>>>>
>>>>>> ```
>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>> CLX/DP TRY JIT STA COL
>>>>>> 0..13 2 2 0 0
>>>>>> 14..23 0 0 0 0
>>>>>>
>>>>>> 24..64 0 0 0 0
>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>> Command (PID=2607388):
>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>>> H2O-9.inp -o H2O-9.out
>>>>>> Uptime: 5.288243 s
>>>>>>
>>>>>>
>>>>>>
>>>>>> ===================================================================================
>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>> = RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>
>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>
>>>>>> ===================================================================================
>>>>>>
>>>>>>
>>>>>> ===================================================================================
>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>> = RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>
>>>>>> ===================================================================================
>>>>>> ```
>>>>>>
>>>>>> and the last 20 lines:
>>>>>>
>>>>>> ```
>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>> ```
>>>>>>
>>>>>> Thanks!
>>>>>> Frederick Stein wrote on Friday, 18 October 2024 at 17:18:39 UTC+2:
>>>>>>
>>>>>>> Please pick one of the failing tests. Then add the TRACE keyword to
>>>>>>> the &GLOBAL section and run the test manually. This increases the size
>>>>>>> of the output file dramatically (to several million lines). Can you send me
>>>>>>> the last ~20 lines of the output?
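
As an illustration of that step, the following Python sketch inserts a TRACE line into the &GLOBAL section of an input file's text and picks the last lines of an output. The helper names `add_trace_keyword` and `last_lines` and the sample input are hypothetical; only the TRACE keyword and the &GLOBAL section come from the message above.

```python
import re

# Hypothetical minimal input; a real test input contains many more sections.
SAMPLE_INPUT = """&GLOBAL
  PROJECT H2O-9
&END GLOBAL
"""


def add_trace_keyword(input_text):
    """Return the input text with a TRACE line added right after &GLOBAL."""
    out = []
    inserted = False
    for line in input_text.splitlines():
        out.append(line)
        # Section names are matched case-insensitively; "&END GLOBAL" does
        # not match because the line must start with "&GLOBAL".
        if not inserted and re.match(r"\s*&GLOBAL\b", line, re.IGNORECASE):
            indent = re.match(r"\s*", line).group(0)
            out.append(indent + "  TRACE")
            inserted = True
    if not inserted:
        raise ValueError("no &GLOBAL section found")
    return "\n".join(out) + "\n"


def last_lines(output_text, n=20):
    """Return the last n lines of a (possibly very long) output."""
    return output_text.splitlines()[-n:]
```

For output files with millions of lines it is usually more practical to read only the tail of the file instead of loading the whole text as done here.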
>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at 17:09:40 UTC+2:
>>>>>>>
>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I assume
>>>>>>>> it makes no difference. As I mentioned in my previous message, with
>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp
>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the
>>>>>>>> same setting; I provide example output as an attachment.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Bartosz
>>>>>>>>
>>>>>>>> Frederick Stein wrote on Friday, 18 October 2024 at 16:24:16 UTC+2:
>>>>>>>>
>>>>>>>>> Dear Bartosz,
>>>>>>>>> What happens if you set the number of OpenMP threads to 1 (add
>>>>>>>>> '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of the
>>>>>>>>> ssmp?
>>>>>>>>> Best,
>>>>>>>>> Frederick
>>>>>>>>>
>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at 15:37:43 UTC+2:
>>>>>>>>>
>>>>>>>>>> Hi Frederick,
>>>>>>>>>>
>>>>>>>>>> thanks again for your help. I have tested different simulation
>>>>>>>>>> variants and I know that the problem occurs when using OMP. For MPI
>>>>>>>>>> calculations without OMP all tests pass. I have also tested the effect of
>>>>>>>>>> the `OMP_PROC_BIND` and `OMP_PLACES` parameters and, apart from
>>>>>>>>>> their effect on simulation time, they have no significant effect on the
>>>>>>>>>> presence of errors. Below are the results for ssmp:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>> ```
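
The ssmp table above can be ranked by failure rate with a few lines of Python. This is only a sketch: the numbers are copied from the table, while the function name `failure_rates` and the choice to count both `wrong` and `failed` as failures are assumptions for illustration.

```python
import csv
import io

# Columns: OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
SSMP_RESULTS = """\
spread, threads, 3850, 4144, 4, 290, 186min
spread, cores, 3831, 4144, 3, 310, 183min
spread, sockets, 3864, 4144, 3, 277, 104min
close, threads, 3879, 4144, 3, 262, 171min
close, cores, 3854, 4144, 0, 290, 168min
close, sockets, 3865, 4144, 3, 276, 104min
master, threads, 4121, 4144, 0, 23, 1002min
master, cores, 4121, 4144, 0, 23, 986min
master, sockets, 3942, 4144, 3, 199, 219min
false, threads, 3918, 4144, 0, 226, 178min
false, cores, 3919, 4144, 3, 222, 176min
false, sockets, 3856, 4144, 4, 284, 104min
"""


def failure_rates(table):
    """Return (proc_bind, places, rate) tuples sorted by ascending rate."""
    rows = []
    for row in csv.reader(io.StringIO(table), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        bind, places, _correct, total, wrong, failed, _time = row
        # Assumption: "wrong" and "failed" both count as failures.
        rate = (int(wrong) + int(failed)) / int(total)
        rows.append((bind, places, rate))
    return sorted(rows, key=lambda r: r[2])


for bind, places, rate in failure_rates(SSMP_RESULTS):
    print(f"{bind:7s} {places:8s} {100 * rate:5.2f} %")
```

In this data the two `master` settings with the fewest failures (23 of 4144) are also by far the slowest runs, at roughly 1000 min each.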
>>>>>>>>>>
>>>>>>>>>> and psmp:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> Do you have any ideas what I could do next to get more information about
>>>>>>>>>> the source of the problem, or do you perhaps see a potential solution at
>>>>>>>>>> this stage? I would appreciate any further help.
>>>>>>>>>>
>>>>>>>>>> Best
>>>>>>>>>> Bartosz
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Frederick Stein wrote on Friday, 11 October 2024 at 14:30:25 UTC+2:
>>>>>>>>>>
>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The tests do not
>>>>>>>>>>> run that efficiently with such a large number of threads. 2 should be
>>>>>>>>>>> sufficient.
>>>>>>>>>>> The test results suggest that most of the functionality may work,
>>>>>>>>>>> but without a backtrace (or similar information) it is hard to tell
>>>>>>>>>>> why the failing tests fail. You could also try to run some of the
>>>>>>>>>>> single-node tests to assess the stability of CP2K.
>>>>>>>>>>> Best,
>>>>>>>>>>> Frederick
>>>>>>>>>>>
>>>>>>>>>>> bartosz mazur wrote on Friday, 11 October 2024 at 13:48:42 UTC+2:
>>>>>>>>>>>
>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>
>>>>>>>>>>>>