[CP2K-user] [CP2K:20903] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types
bartosz mazur
bamaz.97 at gmail.com
Wed Nov 20 14:58:36 UTC 2024
Hi Frederick,
I am writing this as a follow-up to our previous discussions. I am currently
seeing a recurring problem with CP2K: jobs are killed after about 10 days
with errors like those in the attached outputs. This is not particularly
annoying, as a restart is sufficient and the simulation can continue.
Unfortunately, I don't think you will be able to reproduce this error,
given the very long simulation time. However, if there is anything else I
can provide to help understand the source of these problems, let me know.
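
In case it helps with triage, here is a minimal Python sketch for pulling the run time and the failed memory registrations out of such an output. It assumes the log contains lines like the `Uptime:` and `ibv_reg_mr(...) failed` lines quoted further down in this thread; `summarize_crash` is a hypothetical helper name, not part of CP2K, LIBXSMM or UCX.

```python
import re

# Patterns modelled on the log lines quoted later in this thread (an assumption,
# not an official log format specification).
UPTIME_RE = re.compile(r"Uptime:\s*([0-9]+(?:\.[0-9]+)?)\s*s")
REG_FAIL_RE = re.compile(
    r"ibv_reg_mr\(address=(0x[0-9a-fA-F]+), length=(\d+), access=0x[0-9a-fA-F]+\) failed"
)


def summarize_crash(log_text):
    """Return the uptime in days and the sizes of failed registrations in log_text."""
    uptime = UPTIME_RE.search(log_text)
    sizes = [int(length) for _addr, length in REG_FAIL_RE.findall(log_text)]
    return {
        "uptime_days": float(uptime.group(1)) / 86400.0 if uptime else None,
        "failed_registrations": len(sizes),
        "failed_bytes": sizes,
    }


if __name__ == "__main__":
    sample = (
        "UCX ERROR ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) "
        "failed: Cannot allocate memory\n"
        "Uptime: 407633.177169 s\n"
    )
    # For this sample the uptime comes out at roughly 4.7 days.
    print(summarize_crash(sample))
```

Running this over the outputs of several killed jobs would show whether the failures cluster around a similar uptime.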
Best
Bartosz
On Monday, 28 October 2024 at 09:34:45 UTC+1, bartosz mazur wrote:
> Many thanks Frederick for your help!
>
> On Friday, 25 October 2024 at 14:27:36 UTC+2, Frederick Stein wrote:
>
>> Regarding the other issues:
>> I can confirm them but cannot provide fixes for all of them because they
>> probably trigger bugs in ifort. Because ifort is already deprecated, these
>> bugs will probably not be fixed. Furthermore, we do not see any issues on
>> our Intel CI. I will fix what I can, but some of them will be left as is
>> because we will focus our efforts on supporting the new ifx compiler.
>>
>> On Friday, 25 October 2024 at 11:46:00 UTC+2, Frederick Stein wrote:
>>
>>> Dear Bartosz,
>>> I will check the other issues with your regtests.
>>> Regarding your latest issue, please provide more information such as an
>>> output file or a hint on the context. If I am supposed to retry the
>>> calculation on my local machine, I need all additional input files such as
>>> your plumed file. I can run your input file up to the point that CP2K needs
>>> plumed.
>>> Best,
>>> Frederick
>>> On Friday, 25 October 2024 at 10:15:19 UTC+2, bartosz mazur wrote:
>>>
>>>> I just got another error with LibXSMM, now in my regular simulation and
>>>> without using OpenMP. This is the error:
>>>>
>>>> ```
>>>> [1729843139.920274] [r23c01b04:2913 :0] ib_md.c:295 UCX ERROR ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed: Cannot allocate memory
>>>> [1729843139.920290] [r23c01b04:2913 :0] ucp_mm.c:70 UCX ERROR failed to register address 0x14f0b46fc080 (host) length 7424 on md[4]=mlx5_0: Input/output error (md supports: host)
>>>>
>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)[1729843139.932647] [r23c01b04:2945 :0] ib_md.c:295 UCX ERROR ibv_reg_mr(address=0x1491f069e040, length=8128, access=0xf) failed: Cannot allocate memory
>>>> [1729843139.932660] [r23c01b04:2945 :0] ucp_mm.c:70 UCX ERROR failed to register address 0x1491f069e040 (host) length 8128 on md[4]=mlx5_0: Input/output error (md supports: host)
>>>>
>>>> CLX/DP TRY JIT STA COL
>>>> 0..13 4 4 0 0
>>>> 14..23 4 4 0 0
>>>> 24..64 0 0 0 0
>>>> Registry and code: 13 MB + 80 KB (gemm=8)
>>>> Command (PID=2913): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i cp2k.inp -o cp2k.out
>>>> Uptime: 407633.177169 s
>>>> ```
>>>>
>>>> and this is the simulation input I'm using:
>>>>
>>>> ```
>>>> &GLOBAL
>>>> PROJECT uam1o_npt_rms
>>>> RUN_TYPE MD
>>>> PRINT_LEVEL LOW
>>>> PREFERRED_DIAG_LIBRARY SCALAPACK
>>>> &END GLOBAL
>>>>
>>>> &FORCE_EVAL
>>>> METHOD QUICKSTEP
>>>> STRESS_TENSOR ANALYTICAL
>>>> &DFT
>>>> BASIS_SET_FILE_NAME BASIS_MOLOPT_UZH
>>>> POTENTIAL_FILE_NAME POTENTIAL_UZH
>>>> &MGRID
>>>> CUTOFF 500
>>>> &END MGRID
>>>> &XC
>>>> &XC_FUNCTIONAL PBE
>>>> &END XC_FUNCTIONAL
>>>> &VDW_POTENTIAL
>>>> POTENTIAL_TYPE PAIR_POTENTIAL
>>>> &PAIR_POTENTIAL
>>>> TYPE DFTD3(BJ)
>>>> PARAMETER_FILE_NAME dftd3.dat
>>>> REFERENCE_FUNCTIONAL PBE
>>>> R_CUTOFF 25.0
>>>> &END PAIR_POTENTIAL
>>>> &END VDW_POTENTIAL
>>>> &END XC
>>>> &END DFT
>>>>
>>>> &SUBSYS
>>>> &CELL
>>>> A 12.2807999 0.0000000 0.0000000
>>>> B 7.6258602 9.6257200 0.0000000
>>>> C -2.1557724 -1.0420258 18.0042801
>>>> &END CELL
>>>> &COORD
>>>> Zn 11.37811 4.60286 0.24515
>>>> Zn 8.15435 3.05288 8.74518
>>>> Zn 6.37590 3.97311 17.74650
>>>> Zn 9.59842 5.54014 9.24747
>>>> S 11.79344 6.72692 17.10850
>>>> S 4.06825 3.00573 9.90358
>>>> S 5.95830 1.84422 0.90027
>>>> S 13.67407 5.58944 8.10767
>>>> O 10.72408 3.58291 1.89315
>>>> O 8.51986 4.01962 1.53085
>>>> O 6.60135 3.91587 7.68572
>>>> O 7.74637 5.79259 8.21600
>>>> O 15.32810 8.58246 5.10041
>>>> O 9.35608 2.93551 7.09500
>>>> O 10.38999 4.93007 7.45977
>>>> O 11.66491 6.35111 1.31266
>>>> O 9.48582 6.62478 0.77364
>>>> O 2.59062 2.40094 3.91496
>>>> O 7.03031 4.99173 16.09885
>>>> O 9.23544 4.56122 16.46252
>>>> O 11.14602 4.67776 10.31440
>>>> O 10.00982 2.79915 9.77218
>>>> O 2.41388 0.01898 12.91899
>>>> O 8.39375 5.66143 10.89628
>>>> O 7.36998 3.66087 10.53589
>>>> O 6.08863 2.22161 16.68336
>>>> O 8.26988 1.95313 17.21650
>>>> O 15.16937 6.16381 14.09906
>>>> N 13.25907 3.80728 0.04001
>>>> N 2.36335 -0.74130 17.33402
>>>> N 7.60676 1.08576 8.95623
>>>> N 15.77729 5.75974 9.67861
>>>> N 4.49430 4.76652 17.95756
>>>> N 15.38873 9.31230 0.67467
>>>> N 10.14308 7.50848 9.04236
>>>> N 1.96529 2.83557 8.33233
>>>> C 6.76554 5.18292 7.68414
>>>> C 14.28210 4.11624 0.86006
>>>> C 9.47998 3.39622 2.09658
>>>> C 3.20112 3.42080 0.84626
>>>> C 9.91466 1.18589 3.17244
>>>> C 9.08210 2.29987 3.02657
>>>> C 5.74710 6.04945 7.01821
>>>> C 7.83265 2.30920 3.66005
>>>> C 3.35793 2.34328 -0.04029
>>>> C 4.51663 1.46385 -0.02755
>>>> C 16.24194 7.75266 5.73606
>>>> C 4.78940 5.52817 6.14198
>>>> C 7.40810 1.21174 4.39947
>>>> C 16.18016 6.38244 5.49010
>>>> C 9.48869 0.06986 3.88005
>>>> C 11.27238 1.77457 17.14330
>>>> C 5.77166 7.43009 7.27236
>>>> C 11.14819 8.24901 17.58588
>>>> C 8.22170 0.08058 4.47135
>>>> C 0.15087 1.02286 17.07544
>>>> C 17.16180 8.28565 6.64351
>>>> C 10.57067 7.01060 1.31282
>>>> C 6.72654 0.47459 8.14002
>>>> C 10.27972 3.79035 6.89470
>>>> C 14.15006 8.72843 8.15880
>>>> C 11.73751 2.06868 5.82537
>>>> C 11.38838 3.41515 5.96966
>>>> C 10.52304 8.34339 1.98566
>>>> C 12.16584 4.39562 5.33967
>>>> C 14.89762 7.93801 9.04648
>>>> C 14.86698 6.48365 9.03575
>>>> C 2.67167 1.17044 3.27681
>>>> C 11.52468 8.76552 2.86608
>>>> C 13.29140 4.04007 4.60622
>>>> C 3.78230 0.36534 3.52266
>>>> C 12.87823 1.70260 5.12344
>>>> C 8.27761 0.34001 9.85941
>>>> C 9.42677 9.18364 1.73295
>>>> C 3.27553 4.45658 9.42657
>>>> C 13.66559 2.69775 4.53650
>>>> C 15.77023 8.59069 9.93240
>>>> C 1.68356 0.78491 2.36643
>>>> C 10.98451 3.41041 10.31327
>>>> C 3.46873 4.45681 17.14097
>>>> C 8.27403 5.18373 15.89814
>>>> C 14.54907 5.15099 17.15930
>>>> C 7.83119 7.39584 14.82858
>>>> C 8.66916 6.28563 14.97331
>>>> C 11.99928 2.54577 10.98702
>>>> C 9.92072 6.28547 14.34388
>>>> C 16.54982 7.26986 0.04271
>>>> C 15.39103 8.14919 0.03189
>>>> C 1.50023 0.84646 12.27989
>>>> C 12.95126 3.06908 11.86817
>>>> C 10.34198 7.38826 13.61070
>>>> C 1.55836 2.21699 12.52561
>>>> C 8.25354 8.51697 14.12666
>>>> C 6.48249 6.79770 0.85630
>>>> C 11.97760 1.16465 10.73446
>>>> C 6.60385 0.32218 0.42301
>>>> C 9.52282 8.51550 13.54043
>>>> C 17.60321 7.54791 0.92891
>>>> C 0.58530 0.31102 11.36884
>>>> C 7.18362 1.56332 16.68291
>>>> C 11.01926 8.11905 9.86341
>>>> C 7.47582 4.80132 11.10039
>>>> C 3.59282 -0.13430 9.84955
>>>> C 6.01179 6.51430 12.17471
>>>> C 6.36853 5.17005 12.02942
>>>> C 7.23131 0.22715 16.01652
>>>> C 5.59963 4.18477 12.66234
>>>> C 2.84614 0.65728 8.96213
>>>> C 2.87561 2.11161 8.97508
>>>> C 15.08536 7.39548 14.73440
>>>> C 6.23001 -0.19920 15.13769
>>>> C 4.47482 4.53325 13.40042
>>>> C 13.97400 8.19851 14.48576
>>>> C 4.87173 6.87322 12.88120
>>>> C 9.47231 8.25578 8.14046
>>>> C 8.32790 -0.61137 16.27301
>>>> C 14.46698 4.13864 8.58475
>>>> C 4.09294 5.87331 13.47165
>>>> C 1.97640 0.00563 8.07267
>>>> C 16.07240 7.78504 15.64417
>>>> H 14.10215 4.93465 1.55678
>>>> H 3.98110 3.68721 1.55899
>>>> H 10.89072 1.19647 2.69205
>>>> H 7.19958 3.19021 3.56839
>>>> H 4.75923 4.45384 5.96230
>>>> H 6.45299 1.21835 4.92062
>>>> H 15.44211 6.00062 4.78824
>>>> H 17.75043 8.81610 3.97156
>>>> H 10.41563 1.57993 16.49923
>>>> H 6.49332 7.81303 7.99143
>>>> H 0.24800 0.19739 16.37425
>>>> H 9.53586 -0.26872 6.84508
>>>> H 6.19685 1.12218 7.44173
>>>> H 13.45550 8.28133 7.44815
>>>> H 11.11633 1.31384 6.30260
>>>> H 11.87413 5.44074 5.42962
>>>> H 12.38442 8.12016 3.04474
>>>> H 13.88694 4.78876 4.08791
>>>> H 4.53915 0.70283 4.22717
>>>> H 0.88557 0.65625 5.03328
>>>> H 8.96418 0.89159 10.50060
>>>> H 8.67994 8.85961 1.01083
>>>> H 16.35704 8.00331 10.63471
>>>> H 13.12606 1.45212 2.16563
>>>> H 3.64702 3.63930 16.44281
>>>> H 13.76743 4.88477 16.44833
>>>> H 6.85355 7.37827 15.30535
>>>> H 10.55820 5.40745 14.43410
>>>> H 12.97886 4.14375 12.04672
>>>> H 11.29905 7.38966 13.09313
>>>> H 2.29216 2.60091 13.23073
>>>> H -0.01303 -0.23279 14.03603
>>>> H 7.34113 6.99275 1.49776
>>>> H 11.26049 0.78023 10.01184
>>>> H 17.50743 8.37258 1.63130
>>>> H 8.21398 8.86531 11.16822
>>>> H 11.54834 7.47018 10.56097
>>>> H 4.28503 0.31205 10.56295
>>>> H 6.62643 7.27289 11.69479
>>>> H 5.89748 3.14154 12.57118
>>>> H 5.36986 0.44461 14.95599
>>>> H 3.88656 3.78035 13.92095
>>>> H 13.21826 7.85764 13.78163
>>>> H 16.85773 7.91771 12.97237
>>>> H 8.78884 7.70469 7.49554
>>>> H 9.07452 -0.28399 16.99402
>>>> H 1.39009 0.59398 7.37083
>>>> H 4.63062 7.11938 15.84758
>>>> &END COORD
>>>> &KIND Zn
>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q12
>>>> POTENTIAL GTH-PBE-q12
>>>> &END KIND
>>>> &KIND S
>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
>>>> POTENTIAL GTH-PBE-q6
>>>> &END KIND
>>>> &KIND O
>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
>>>> POTENTIAL GTH-PBE-q6
>>>> &END KIND
>>>> &KIND N
>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q5
>>>> POTENTIAL GTH-PBE-q5
>>>> &END KIND
>>>> &KIND C
>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q4
>>>> POTENTIAL GTH-PBE-q4
>>>> &END KIND
>>>> &KIND H
>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q1
>>>> POTENTIAL GTH-PBE-q1
>>>> &END KIND
>>>> &END SUBSYS
>>>> &END FORCE_EVAL
>>>>
>>>> &MOTION
>>>> &MD
>>>> ENSEMBLE NPT_I
>>>> TEMPERATURE 298
>>>> TIMESTEP 1.0
>>>> STEPS 50000
>>>> &THERMOSTAT
>>>> TYPE NOSE
>>>> &NOSE
>>>> LENGTH 3
>>>> YOSHIDA 3
>>>> TIMECON 1000
>>>> &END NOSE
>>>> &END THERMOSTAT
>>>> &BAROSTAT
>>>> PRESSURE 1.0
>>>> TIMECON 4000
>>>> &END BAROSTAT
>>>> &END MD
>>>> &FREE_ENERGY
>>>> METHOD METADYN
>>>> &METADYN
>>>> USE_PLUMED .TRUE.
>>>> PLUMED_INPUT_FILE plumed.dat
>>>> &END METADYN
>>>> &END FREE_ENERGY
>>>> &PRINT
>>>> &TRAJECTORY
>>>> &EACH
>>>> MD 5
>>>> &END EACH
>>>> &END TRAJECTORY
>>>> &FORCES
>>>> UNIT eV*angstrom^-1
>>>> &EACH
>>>> MD 5
>>>> &END EACH
>>>> &END FORCES
>>>> &CELL
>>>> &EACH
>>>> MD 5
>>>> &END EACH
>>>> &END CELL
>>>> &END PRINT
>>>> &END MOTION
>>>> ```
>>>>
>>>> This simulation was performed with the previous version of CP2K (so without
>>>> your fix).
>>>> On Friday, 25 October 2024 at 09:50:47 UTC+2, bartosz mazur wrote:
>>>>
>>>>> Hi Frederick,
>>>>>
>>>>> it helped with most of the tests! Now only 13 fail. In the
>>>>> attachments you will find the full output from the regtests, and here is
>>>>> the output from a single job with TRACE enabled:
>>>>>
>>>>> ```
>>>>> Loading intel/2024a
>>>>> Loading requirement: GCCcore/13.3.0 zlib/1.3.1-GCCcore-13.3.0
>>>>> binutils/2.42-GCCcore-13.3.0 intel-compilers/2024.2.0
>>>>> numactl/2.0.18-GCCcore-13.3.0 UCX/1.16.0-GCCcore-13.3.0
>>>>> impi/2021.13.0-intel-compilers-2024.2.0 imkl/2024.2.0 iimpi/2024a
>>>>> imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>
>>>>> Currently Loaded Modulefiles:
>>>>> 1) GCCcore/13.3.0
>>>>> 2) zlib/1.3.1-GCCcore-13.3.0
>>>>> 3) binutils/2.42-GCCcore-13.3.0
>>>>> 4) intel-compilers/2024.2.0
>>>>> 5) numactl/2.0.18-GCCcore-13.3.0
>>>>> 6) UCX/1.16.0-GCCcore-13.3.0
>>>>> 7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>> 8) imkl/2024.2.0
>>>>> 9) iimpi/2024a
>>>>> 10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>> 11) intel/2024a
>>>>> 2 MPI processes with 2 OpenMP threads each
>>>>> started at Fri Oct 25 09:34:34 CEST 2024 in /lustre/tmp/slurm/3127182
>>>>> SIRIUS 7.6.1, git hash:
>>>>> https://api.github.com/repos/electronic-structure/SIRIUS/git/ref/tags/v7.6.1
>>>>> Warning! Compiled in 'debug' mode with assert statements enabled!
>>>>>
>>>>>
>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>> CLX/DP TRY JIT STA COL
>>>>> 0..13 8 8 0 0
>>>>> 14..23 0 0 0 0
>>>>> 24..64 0 0 0 0
>>>>> Registry and code: 13 MB + 64 KB (gemm=8)
>>>>> Command (PID=423503):
>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>> dftd3src1.inp -o dftd3src1.out
>>>>> Uptime: 2.752513 s
>>>>>
>>>>>
>>>>>
>>>>> ===================================================================================
>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>> = RANK 0 PID 423503 RUNNING AT r21c01b03
>>>>>
>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>
>>>>> ===================================================================================
>>>>>
>>>>>
>>>>> ===================================================================================
>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>> = RANK 1 PID 423504 RUNNING AT r21c01b03
>>>>>
>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>
>>>>> ===================================================================================
>>>>> finished at Fri Oct 25 09:34:39 CEST 2024
>>>>> ```
>>>>>
>>>>> and the last lines:
>>>>>
>>>>> ```
>>>>> 000000:000002<< 13 3 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002>> 13 4 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 13 4 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 12 2 pw_nn_compose_r 0.003 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 1 xc_pw_derive 0.003 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 5 pw_zero start Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 5 pw_zero 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 2 xc_pw_derive start Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002>> 12 3 pw_nn_compose_r start Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002>> 13 5 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 13 5 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002>> 13 6 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 13 6 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 12 3 pw_nn_compose_r 0.002 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 2 xc_pw_derive 0.002 Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 6 pw_zero start Hostmem: 955 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 6 pw_zero 0.001 Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 3 xc_pw_derive start Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002>> 12 4 pw_nn_compose_r start Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002>> 13 7 mp_sendrecv_dm2 start Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002<< 13 7 mp_sendrecv_dm2 0.000 Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002>> 13 8 mp_sendrecv_dm2 start Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002<< 13 8 mp_sendrecv_dm2 0.000 Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002<< 12 4 pw_nn_compose_r 0.002 Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 3 xc_pw_derive 0.002 Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 1 pw_spline_scale_deriv start Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 1 pw_spline_scale_deriv 0.001 Hostmem: 960 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 20 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 20 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 21 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 21 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 22 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 22 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 23 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 23 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002>> 11 1 xc_functional_eval start Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002>> 12 1 b97_lda_eval start Hostmem: 965 MB GPUmem: 0 MB
>>>>> 000000:000002<< 12 1 b97_lda_eval 0.103 Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002<< 11 1 xc_functional_eval 0.103 Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002<< 10 1 xc_rho_set_and_dset_create 0.120 Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002>> 10 1 check_for_derivatives start Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002<< 10 1 check_for_derivatives 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002>> 10 14 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002<< 10 14 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002>> 10 15 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002<< 10 15 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002>> 10 16 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002<< 10 16 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002>> 10 17 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>>>>> 000000:000002<< 10 17 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>> ```
>>>>>
>>>>> Best
>>>>> Bartosz
>>>>>
>>>>> On Wednesday, 23 October 2024 at 09:15:33 UTC+2, Frederick Stein wrote:
>>>>>
>>>>>> Dear Bartosz,
>>>>>> My fix is merged. Can you switch to the CP2K master and try it again?
>>>>>> We are still working on a few issues with the Intel compilers so that we
>>>>>> may eventually migrate from ifort to ifx.
>>>>>> Best,
>>>>>> Frederick
>>>>>>
>>>>>> On Tuesday, 22 October 2024 at 17:45:21 UTC+2, bartosz mazur wrote:
>>>>>>
>>>>>>> Great! Thank you for your help.
>>>>>>>
>>>>>>> Best
>>>>>>> Bartosz
>>>>>>>
>>>>>>> On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:
>>>>>>>
>>>>>>>> I have a fix for it. In contrast to my first thought, it is a case
>>>>>>>> of invalid type conversion from real to complex numbers (yes, Fortran is
>>>>>>>> rather strict about it) in pw_derive. This may also be present in a few
>>>>>>>> other spots. I am currently running more tests and I will open a pull
>>>>>>>> request within the next few days.
>>>>>>>> Best,
>>>>>>>> Frederick
>>>>>>>>
>>>>>>>> On Tuesday, 22 October 2024 at 13:12:49 UTC+2, Frederick Stein wrote:
>>>>>>>>
>>>>>>>>> I can reproduce the error locally. I am investigating it now.
>>>>>>>>>
>>>>>>>>> On Tuesday, 22 October 2024 at 11:58:57 UTC+2, bartosz mazur wrote:
>>>>>>>>>
>>>>>>>>>> I was loading it as it was needed for compilation. I have
>>>>>>>>>> unloaded the module, but the error still occurs:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>>>> 0..13 2 2 0 0
>>>>>>>>>> 14..23 0 0 0 0
>>>>>>>>>> 24..64 0 0 0 0
>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>> Command (PID=15485):
>>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>>> Uptime: 1.757102 s
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>> = RANK 0 PID 15485 RUNNING AT r30c01b01
>>>>>>>>>>
>>>>>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>> = RANK 1 PID 15486 RUNNING AT r30c01b01
>>>>>>>>>>
>>>>>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> and the last 100 lines:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> 000000:000002>> 11 37 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 11 37 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 10 64 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 10 25 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 10 25 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 10 17 pw_axpy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 10 17 pw_axpy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 10 19 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 10 19 mp_sum_d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 10 3 pw_poisson_solve start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 11 3 pw_poisson_rebuild start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 11 3 pw_poisson_rebuild 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 11 65 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 38 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 38 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 11 65 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 11 26 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 11 26 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 11 3 pw_multiply_with start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 11 3 pw_multiply_with 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 11 27 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 11 27 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 11 3 pw_integral_ab start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 20 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 20 mp_sum_d 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 11 3 pw_integral_ab 0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 11 4 pw_poisson_set start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 66 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 13 39 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 13 39 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 66 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 28 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 28 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 7 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 7 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 67 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 13 40 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 13 40 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 67 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 29 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 29 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 8 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 8 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 68 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 13 41 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 13 41 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 68 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 30 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 12 30 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 12 9 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>> ```
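
The trace tail above ends with an entry record for `pw_derive` that has no matching exit record. A minimal Python sketch for listing such unfinished calls automatically is given below. It is only an illustration under stated assumptions: each record is taken to sit on a single line of the original output file, and the field layout (rank:ranks, `>>` for entry or `<<` for exit, stack depth, call number, routine name, then `start` or a time) is inferred from the excerpts in this thread, not from the CP2K documentation. `open_routines` is a hypothetical helper name.

```python
import re

# Assumed record layout, inferred from the trace excerpts in this thread:
#   <rank>:<nranks>>> <depth> <call-no> <routine> start   Hostmem: ...
#   <rank>:<nranks><< <depth> <call-no> <routine> <time>  Hostmem: ...
TRACE_RE = re.compile(r"^\s*(\d+):(\d+)(>>|<<)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\S+)")


def open_routines(lines):
    """Return routines that were entered but never left, outermost first."""
    stack = []
    for line in lines:
        match = TRACE_RE.match(line)
        if match is None:
            continue  # not a trace record
        direction, routine = match.group(3), match.group(6)
        if direction == ">>":
            stack.append(routine)
        elif stack and stack[-1] == routine:
            stack.pop()
        # Exit records for calls entered before the excerpt starts are ignored.
    return stack


if __name__ == "__main__":
    tail = [
        "000000:000002>> 11 4 pw_poisson_set start Hostmem: 697 MB GPUmem: 0 MB",
        "000000:000002>> 12 30 pw_copy start Hostmem: 697 MB GPUmem: 0 MB",
        "000000:000002<< 12 30 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB",
        "000000:000002>> 12 9 pw_derive start Hostmem: 697 MB GPUmem: 0 MB",
    ]
    print(open_routines(tail))
```

The last element of the returned list is the innermost routine that was still running when the output stopped, which is the first place to look after a segmentation fault.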
>>>>>>>>>>
>>>>>>>>>> This is the list of currently loaded modules (all come with
>>>>>>>>>> intel):
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> Currently Loaded Modulefiles:
>>>>>>>>>> 1) GCCcore/13.3.0
>>>>>>>>>> 2) zlib/1.3.1-GCCcore-13.3.0
>>>>>>>>>> 3) binutils/2.42-GCCcore-13.3.0
>>>>>>>>>> 4) intel-compilers/2024.2.0
>>>>>>>>>> 5) numactl/2.0.18-GCCcore-13.3.0
>>>>>>>>>> 6) UCX/1.16.0-GCCcore-13.3.0
>>>>>>>>>> 7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>>>>>> 8) imkl/2024.2.0
>>>>>>>>>> 9) iimpi/2024a
>>>>>>>>>> 10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>>>>>> 11) intel/2024a
>>>>>>>>>> ```
>>>>>>>>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>> I am currently running some tests with the latest Intel compiler
>>>>>>>>>>> myself. What bothers me about your setup is the module GCCcore/13.3.0. Why
>>>>>>>>>>> is it loaded? Can you unload it? This would at least reduce potential
>>>>>>>>>>> interference between the Intel and the GCC compilers.
>>>>>>>>>>> Best,
>>>>>>>>>>> Frederick
>>>>>>>>>>>
>>>>>>>>>>> On Monday, 21 October 2024 at 16:33:45 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The error for ssmp is:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>>>>>> 0..13 4 4 0 0
>>>>>>>>>>>> 14..23 0 0 0 0
>>>>>>>>>>>> 24..64 0 0 0 0
>>>>>>>>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>>>>>>>>> Command (PID=54845):
>>>>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>>>>> Uptime: 2.861583 s
>>>>>>>>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36:
>>>>>>>>>>>> 54845 Segmentation fault (core dumped)
>>>>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>>>>> ```
>>>>>>>>>>>>
>>>>>>>>>>>> and the last 100 lines of output:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> 000000:000001>> 12 20 mp_sum_d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 20 mp_sum_d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 11 13 dbcsr_dot_sd 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 10 12 calculate_ptrace_kp 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 9 6 evaluate_core_matrix_traces 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 9 6 rebuild_ks_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 10 6 qs_ks_build_kohn_sham_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 11 140 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 79 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 79 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 11 140 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 11 141 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 80 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 80 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 11 141 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 11 61 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 11 61 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 11 35 pw_axpy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 11 35 pw_axpy 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 11 6 pw_poisson_solve start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 6 pw_poisson_rebuild start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 6 pw_poisson_rebuild 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 142 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 13 81 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 13 81 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 142 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 62 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 62 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 6 pw_multiply_with start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 6 pw_multiply_with 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 63 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 63 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 6 pw_integral_ab start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 12 6 pw_integral_ab 0.005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 12 7 pw_poisson_set start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 13 143 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 14 82 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 14 82 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 13 143 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 13 64 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 13 64 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 13 16 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 13 16 pw_derive 0.006 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 13 144 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 14 83 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 14 83 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 13 144 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 13 65 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001<< 13 65 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000001>> 13 17 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> ```
>>>>>>>>>>>>
>>>>>>>>>>>> For psmp, the last 100 lines are:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> 000000:000002<< 9 7 evaluate_core_matrix_traces 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 9 7 rebuild_ks_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 10 7 qs_ks_build_kohn_sham_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 164 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 93 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 93 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 164 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 165 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 94 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 94 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 165 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 73 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 73 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 41 pw_axpy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 41 pw_axpy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 52 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 52 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 7 pw_poisson_solve start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 7 pw_poisson_rebuild start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 7 pw_poisson_rebuild 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 166 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 95 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 95 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 166 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 74 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 74 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 7 pw_multiply_with start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 7 pw_multiply_with 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 75 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 75 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 7 pw_integral_ab start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 53 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 53 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 7 pw_integral_ab 0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 8 pw_poisson_set start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 167 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 14 96 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 14 96 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 167 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 76 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> ```
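
One way to read a trace tail like the one above is to replay it as a call stack and list the routines that were entered (`>>`) but never left (`<<`). The following is a minimal Python sketch, not a CP2K tool. It assumes each trace record has been rejoined onto a single line and that the fields are, in order, rank:ranks, direction, stack depth, call count, routine name, and either `start` or an elapsed time. That layout is inferred from the excerpts in this thread. The name `open_calls` and the shortened sample are illustrative only.

```python
import re

# Assumed record layout (inferred from the excerpts, not from CP2K docs):
#   <rank>:<ranks><dir> <depth> <call count> <routine> <start | seconds> Hostmem: ...
# where <dir> is ">>" on entry and "<<" on exit.
RECORD = re.compile(r"^\s*(\d+):(\d+)(>>|<<)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\S+)")


def open_calls(lines):
    """Return (depth, routine) pairs entered but not exited, outermost first."""
    stack = []
    for line in lines:
        m = RECORD.match(line)
        if m is None:
            continue  # skip anything that is not a trace record
        direction, depth, routine = m.group(3), int(m.group(4)), m.group(6)
        if direction == ">>":
            stack.append((depth, routine))
        elif stack and stack[-1] == (depth, routine):
            stack.pop()  # matching exit for the innermost open call
    return stack


# Shortened sample in the same shape as the psmp tail.
sample = """\
000000:000002>> 12 8 pw_poisson_set start Hostmem: 693 MB GPUmem: 0 MB
000000:000002>> 13 76 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
""".splitlines()

for depth, routine in open_calls(sample):
    print(depth, routine)
```

For this sample the open calls are `pw_poisson_set` at depth 12 and the second `pw_derive` at depth 13. This only shows which routines had not returned when the output stopped; with buffered or asynchronous output the crash may still have happened elsewhere.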
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>
>>>>>>>>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick Stein
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>> I have no idea about the issue with LibXSMM.
>>>>>>>>>>>>> Regarding the trace, I do not know either, as there is not much
>>>>>>>>>>>>> that could break in pw_derive (it just performs multiplications) and the
>>>>>>>>>>>>> sequence of operations is too unspecific. It may be that the code actually
>>>>>>>>>>>>> breaks somewhere else. Can you do the same with the ssmp version and post
>>>>>>>>>>>>> the last 100 lines? This way, we remove the asynchronicity issues for
>>>>>>>>>>>>> backtraces with the psmp version.
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>
>>>>>>>>>>>>> bartosz mazur wrote on Sunday, 20 October 2024 at 16:47:15
>>>>>>>>>>>>> UTC+2:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The error is:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>>>>>>>    0..13      2      2      0      0
>>>>>>>>>>>>>>   14..23      0      0      0      0
>>>>>>>>>>>>>>   24..64      0      0      0      0
>>>>>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>>>>>> Command (PID=2607388): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>>>>>>> Uptime: 5.288243 s
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>>>> = RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>>>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>>>> = RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>>>>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and the last 20 lines:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick Stein
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please pick one of the failing tests. Then add the TRACE
>>>>>>>>>>>>>>> keyword to the &GLOBAL section and run the test manually. This
>>>>>>>>>>>>>>> increases the size of the output file dramatically (to several million
>>>>>>>>>>>>>>> lines). Can you send me the last ~20 lines of the output?
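
For anyone repeating this step on several test inputs, the edit can be scripted. The following is a minimal Python sketch, assuming the input uses plain `&GLOBAL` ... `&END GLOBAL` section markers and that a lone `TRACE` keyword switches tracing on. The function name `enable_trace` and the tiny sample input are made up for illustration; preprocessor directives and other input features are not handled.

```python
def enable_trace(input_text):
    """Return the input text with a TRACE line added to the &GLOBAL section."""
    out = []
    depth = 0          # nesting depth inside &GLOBAL (0 means outside it)
    has_trace = False  # whether &GLOBAL already sets TRACE itself
    for line in input_text.splitlines():
        word = line.strip().upper()
        if depth == 0:
            if word.split()[:1] == ["&GLOBAL"]:
                depth = 1
                has_trace = False
        elif word.startswith("&END"):
            depth -= 1
            if depth == 0 and not has_trace:
                out.append("  TRACE")  # add just before "&END GLOBAL"
        elif word.startswith("&"):
            depth += 1  # a subsection nested inside &GLOBAL
        elif depth == 1 and word.split()[:1] == ["TRACE"]:
            has_trace = True
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical, heavily shortened input used only to exercise the function.
sample_input = """\
&GLOBAL
  PROJECT H2O-9
&END GLOBAL
&FORCE_EVAL
&END FORCE_EVAL
"""

print(enable_trace(sample_input))
```

Running the function twice leaves the text unchanged the second time, because an existing `TRACE` line in `&GLOBAL` is detected and not duplicated.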
>>>>>>>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at
>>>>>>>>>>>>>>> 17:09:40 UTC+2:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I
>>>>>>>>>>>>>>>> assume it makes no difference. As I mentioned in my previous message, with
>>>>>>>>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp
>>>>>>>>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the
>>>>>>>>>>>>>>>> same setting; I attach an example output.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick
>>>>>>>>>>>>>>>> Stein wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>>>> What happens if you set the number of OpenMP threads to 1
>>>>>>>>>>>>>>>>> (add '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of
>>>>>>>>>>>>>>>>> the ssmp?
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at
>>>>>>>>>>>>>>>>> 15:37:43 UTC+2:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks again for your help. I have tested different
>>>>>>>>>>>>>>>>>> simulation variants and I know that the problem occurs when using OpenMP.
>>>>>>>>>>>>>>>>>> For MPI calculations without OpenMP, all tests pass. I have also tested
>>>>>>>>>>>>>>>>>> the `OMP_PROC_BIND` and `OMP_PLACES` parameters;
>>>>>>>>>>>>>>>>>> apart from their effect on simulation time, they have no significant effect
>>>>>>>>>>>>>>>>>> on the presence of errors. Below are the results for ssmp:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
>>>>>>>>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> and psmp:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>>>>>>>>> ```
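
The ssmp table above is complete enough to summarise numerically. The following is a minimal Python sketch that parses the posted rows, checks that correct + wrong + failed adds up to the total, and ranks the settings by failure count. The numbers are copied from the table; the names `SSMP_RESULTS` and `load_rows` are illustrative. The psmp table is left out because several of its rows are only partial counts.

```python
import csv
import io

# ssmp results as posted in this thread (header rejoined onto one line).
SSMP_RESULTS = """\
OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
spread, threads, 3850, 4144, 4, 290, 186min
spread, cores, 3831, 4144, 3, 310, 183min
spread, sockets, 3864, 4144, 3, 277, 104min
close, threads, 3879, 4144, 3, 262, 171min
close, cores, 3854, 4144, 0, 290, 168min
close, sockets, 3865, 4144, 3, 276, 104min
master, threads, 4121, 4144, 0, 23, 1002min
master, cores, 4121, 4144, 0, 23, 986min
master, sockets, 3942, 4144, 3, 199, 219min
false, threads, 3918, 4144, 0, 226, 178min
false, cores, 3919, 4144, 3, 222, 176min
false, sockets, 3856, 4144, 4, 284, 104min
"""


def load_rows(text):
    """Parse the comma-separated table into dicts with integer counts."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text), skipinitialspace=True):
        row = {
            "bind": raw["OMP_PROC_BIND"],
            "places": raw["OMP_PLACES"],
            "minutes": int(raw["time"].replace("min", "")),
        }
        for key in ("correct", "total", "wrong", "failed"):
            row[key] = int(raw[key])
        row["failed_pct"] = 100.0 * row["failed"] / row["total"]
        rows.append(row)
    return rows


rows = load_rows(SSMP_RESULTS)
for row in sorted(rows, key=lambda r: r["failed"]):
    print(f'{row["bind"]:>7} {row["places"]:<8} {row["failed"]:4d} failed '
          f'({row["failed_pct"]:4.1f}%) {row["minutes"]:5d} min')
```

In these numbers the `master` runs with `threads` and `cores` have the fewest failures (23 of 4144) but also by far the longest run times (1002 and 986 min).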
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Any ideas what I could do next to have more information
>>>>>>>>>>>>>>>>>> about the source of the problem or maybe you see a potential solution at
>>>>>>>>>>>>>>>>>> this stage? I would appreciate any further help.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick
>>>>>>>>>>>>>>>>>> Stein wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The
>>>>>>>>>>>>>>>>>>> tests do not run that efficiently with such a large number of threads. 2
>>>>>>>>>>>>>>>>>>> should be sufficient.
>>>>>>>>>>>>>>>>>>> The test result suggests that most of the functionality
>>>>>>>>>>>>>>>>>>> may work but due to a missing backtrace (or similar information), it is
>>>>>>>>>>>>>>>>>>> hard to tell why they fail. You could also try to run some of the
>>>>>>>>>>>>>>>>>>> single-node tests to assess the stability of CP2K.
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> bartosz mazur wrote on Friday, 11 October 2024 at
>>>>>>>>>>>>>>>>>>> 13:48:42 UTC+2:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3144902.out
Type: application/octet-stream
Size: 9226 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0005.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3127239.out
Type: application/octet-stream
Size: 8643 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0006.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3164366.out
Type: application/octet-stream
Size: 8697 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0007.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3117616.out
Type: application/octet-stream
Size: 23776 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0008.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3098731.out
Type: application/octet-stream
Size: 8453 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0009.obj>