[CP2K-user] [CP2K:20813] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types
bartosz mazur
bamaz.97 at gmail.com
Fri Oct 25 08:07:03 UTC 2024
I just got another error with LibXSMM, now in my regular simulation and
without using OpenMP. This is the error:
```
[1729843139.920274] [r23c01b04:2913 :0] ib_md.c:295 UCX ERROR ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed: Cannot allocate memory
[1729843139.920290] [r23c01b04:2913 :0] ucp_mm.c:70 UCX ERROR failed to register address 0x14f0b46fc080 (host) length 7424 on md[4]=mlx5_0: Input/output error (md supports: host)
LIBXSMM_VERSION: develop-1.17-3834 (25693946)
[1729843139.932647] [r23c01b04:2945 :0] ib_md.c:295 UCX ERROR ibv_reg_mr(address=0x1491f069e040, length=8128, access=0xf) failed: Cannot allocate memory
[1729843139.932660] [r23c01b04:2945 :0] ucp_mm.c:70 UCX ERROR failed to register address 0x1491f069e040 (host) length 8128 on md[4]=mlx5_0: Input/output error (md supports: host)
CLX/DP TRY JIT STA COL
0..13 4 4 0 0
14..23 4 4 0 0
24..64 0 0 0 0
Registry and code: 13 MB + 80 KB (gemm=8)
Command (PID=2913): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i cp2k.inp -o cp2k.out
Uptime: 407633.177169 s
```
and this is the simulation input I'm using:
```
&GLOBAL
PROJECT uam1o_npt_rms
RUN_TYPE MD
PRINT_LEVEL LOW
PREFERRED_DIAG_LIBRARY SCALAPACK
&END GLOBAL
&FORCE_EVAL
METHOD QUICKSTEP
STRESS_TENSOR ANALYTICAL
&DFT
BASIS_SET_FILE_NAME BASIS_MOLOPT_UZH
POTENTIAL_FILE_NAME POTENTIAL_UZH
&MGRID
CUTOFF 500
&END MGRID
&XC
&XC_FUNCTIONAL PBE
&END XC_FUNCTIONAL
&VDW_POTENTIAL
POTENTIAL_TYPE PAIR_POTENTIAL
&PAIR_POTENTIAL
TYPE DFTD3(BJ)
PARAMETER_FILE_NAME dftd3.dat
REFERENCE_FUNCTIONAL PBE
R_CUTOFF 25.0
&END PAIR_POTENTIAL
&END VDW_POTENTIAL
&END XC
&END DFT
&SUBSYS
&CELL
A 12.2807999 0.0000000 0.0000000
B 7.6258602 9.6257200 0.0000000
C -2.1557724 -1.0420258 18.0042801
&END CELL
&COORD
Zn 11.37811 4.60286 0.24515
Zn 8.15435 3.05288 8.74518
Zn 6.37590 3.97311 17.74650
Zn 9.59842 5.54014 9.24747
S 11.79344 6.72692 17.10850
S 4.06825 3.00573 9.90358
S 5.95830 1.84422 0.90027
S 13.67407 5.58944 8.10767
O 10.72408 3.58291 1.89315
O 8.51986 4.01962 1.53085
O 6.60135 3.91587 7.68572
O 7.74637 5.79259 8.21600
O 15.32810 8.58246 5.10041
O 9.35608 2.93551 7.09500
O 10.38999 4.93007 7.45977
O 11.66491 6.35111 1.31266
O 9.48582 6.62478 0.77364
O 2.59062 2.40094 3.91496
O 7.03031 4.99173 16.09885
O 9.23544 4.56122 16.46252
O 11.14602 4.67776 10.31440
O 10.00982 2.79915 9.77218
O 2.41388 0.01898 12.91899
O 8.39375 5.66143 10.89628
O 7.36998 3.66087 10.53589
O 6.08863 2.22161 16.68336
O 8.26988 1.95313 17.21650
O 15.16937 6.16381 14.09906
N 13.25907 3.80728 0.04001
N 2.36335 -0.74130 17.33402
N 7.60676 1.08576 8.95623
N 15.77729 5.75974 9.67861
N 4.49430 4.76652 17.95756
N 15.38873 9.31230 0.67467
N 10.14308 7.50848 9.04236
N 1.96529 2.83557 8.33233
C 6.76554 5.18292 7.68414
C 14.28210 4.11624 0.86006
C 9.47998 3.39622 2.09658
C 3.20112 3.42080 0.84626
C 9.91466 1.18589 3.17244
C 9.08210 2.29987 3.02657
C 5.74710 6.04945 7.01821
C 7.83265 2.30920 3.66005
C 3.35793 2.34328 -0.04029
C 4.51663 1.46385 -0.02755
C 16.24194 7.75266 5.73606
C 4.78940 5.52817 6.14198
C 7.40810 1.21174 4.39947
C 16.18016 6.38244 5.49010
C 9.48869 0.06986 3.88005
C 11.27238 1.77457 17.14330
C 5.77166 7.43009 7.27236
C 11.14819 8.24901 17.58588
C 8.22170 0.08058 4.47135
C 0.15087 1.02286 17.07544
C 17.16180 8.28565 6.64351
C 10.57067 7.01060 1.31282
C 6.72654 0.47459 8.14002
C 10.27972 3.79035 6.89470
C 14.15006 8.72843 8.15880
C 11.73751 2.06868 5.82537
C 11.38838 3.41515 5.96966
C 10.52304 8.34339 1.98566
C 12.16584 4.39562 5.33967
C 14.89762 7.93801 9.04648
C 14.86698 6.48365 9.03575
C 2.67167 1.17044 3.27681
C 11.52468 8.76552 2.86608
C 13.29140 4.04007 4.60622
C 3.78230 0.36534 3.52266
C 12.87823 1.70260 5.12344
C 8.27761 0.34001 9.85941
C 9.42677 9.18364 1.73295
C 3.27553 4.45658 9.42657
C 13.66559 2.69775 4.53650
C 15.77023 8.59069 9.93240
C 1.68356 0.78491 2.36643
C 10.98451 3.41041 10.31327
C 3.46873 4.45681 17.14097
C 8.27403 5.18373 15.89814
C 14.54907 5.15099 17.15930
C 7.83119 7.39584 14.82858
C 8.66916 6.28563 14.97331
C 11.99928 2.54577 10.98702
C 9.92072 6.28547 14.34388
C 16.54982 7.26986 0.04271
C 15.39103 8.14919 0.03189
C 1.50023 0.84646 12.27989
C 12.95126 3.06908 11.86817
C 10.34198 7.38826 13.61070
C 1.55836 2.21699 12.52561
C 8.25354 8.51697 14.12666
C 6.48249 6.79770 0.85630
C 11.97760 1.16465 10.73446
C 6.60385 0.32218 0.42301
C 9.52282 8.51550 13.54043
C 17.60321 7.54791 0.92891
C 0.58530 0.31102 11.36884
C 7.18362 1.56332 16.68291
C 11.01926 8.11905 9.86341
C 7.47582 4.80132 11.10039
C 3.59282 -0.13430 9.84955
C 6.01179 6.51430 12.17471
C 6.36853 5.17005 12.02942
C 7.23131 0.22715 16.01652
C 5.59963 4.18477 12.66234
C 2.84614 0.65728 8.96213
C 2.87561 2.11161 8.97508
C 15.08536 7.39548 14.73440
C 6.23001 -0.19920 15.13769
C 4.47482 4.53325 13.40042
C 13.97400 8.19851 14.48576
C 4.87173 6.87322 12.88120
C 9.47231 8.25578 8.14046
C 8.32790 -0.61137 16.27301
C 14.46698 4.13864 8.58475
C 4.09294 5.87331 13.47165
C 1.97640 0.00563 8.07267
C 16.07240 7.78504 15.64417
H 14.10215 4.93465 1.55678
H 3.98110 3.68721 1.55899
H 10.89072 1.19647 2.69205
H 7.19958 3.19021 3.56839
H 4.75923 4.45384 5.96230
H 6.45299 1.21835 4.92062
H 15.44211 6.00062 4.78824
H 17.75043 8.81610 3.97156
H 10.41563 1.57993 16.49923
H 6.49332 7.81303 7.99143
H 0.24800 0.19739 16.37425
H 9.53586 -0.26872 6.84508
H 6.19685 1.12218 7.44173
H 13.45550 8.28133 7.44815
H 11.11633 1.31384 6.30260
H 11.87413 5.44074 5.42962
H 12.38442 8.12016 3.04474
H 13.88694 4.78876 4.08791
H 4.53915 0.70283 4.22717
H 0.88557 0.65625 5.03328
H 8.96418 0.89159 10.50060
H 8.67994 8.85961 1.01083
H 16.35704 8.00331 10.63471
H 13.12606 1.45212 2.16563
H 3.64702 3.63930 16.44281
H 13.76743 4.88477 16.44833
H 6.85355 7.37827 15.30535
H 10.55820 5.40745 14.43410
H 12.97886 4.14375 12.04672
H 11.29905 7.38966 13.09313
H 2.29216 2.60091 13.23073
H -0.01303 -0.23279 14.03603
H 7.34113 6.99275 1.49776
H 11.26049 0.78023 10.01184
H 17.50743 8.37258 1.63130
H 8.21398 8.86531 11.16822
H 11.54834 7.47018 10.56097
H 4.28503 0.31205 10.56295
H 6.62643 7.27289 11.69479
H 5.89748 3.14154 12.57118
H 5.36986 0.44461 14.95599
H 3.88656 3.78035 13.92095
H 13.21826 7.85764 13.78163
H 16.85773 7.91771 12.97237
H 8.78884 7.70469 7.49554
H 9.07452 -0.28399 16.99402
H 1.39009 0.59398 7.37083
H 4.63062 7.11938 15.84758
&END COORD
&KIND Zn
BASIS_SET TZVP-MOLOPT-PBE-GTH-q12
POTENTIAL GTH-PBE-q12
&END KIND
&KIND S
BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
POTENTIAL GTH-PBE-q6
&END KIND
&KIND O
BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
POTENTIAL GTH-PBE-q6
&END KIND
&KIND N
BASIS_SET TZVP-MOLOPT-PBE-GTH-q5
POTENTIAL GTH-PBE-q5
&END KIND
&KIND C
BASIS_SET TZVP-MOLOPT-PBE-GTH-q4
POTENTIAL GTH-PBE-q4
&END KIND
&KIND H
BASIS_SET TZVP-MOLOPT-PBE-GTH-q1
POTENTIAL GTH-PBE-q1
&END KIND
&END SUBSYS
&END FORCE_EVAL
&MOTION
&MD
ENSEMBLE NPT_I
TEMPERATURE 298
TIMESTEP 1.0
STEPS 50000
&THERMOSTAT
TYPE NOSE
&NOSE
LENGTH 3
YOSHIDA 3
TIMECON 1000
&END NOSE
&END THERMOSTAT
&BAROSTAT
PRESSURE 1.0
TIMECON 4000
&END BAROSTAT
&END MD
&FREE_ENERGY
METHOD METADYN
&METADYN
USE_PLUMED .TRUE.
PLUMED_INPUT_FILE plumed.dat
&END METADYN
&END FREE_ENERGY
&PRINT
&TRAJECTORY
&EACH
MD 5
&END EACH
&END TRAJECTORY
&FORCES
UNIT eV*angstrom^-1
&EACH
MD 5
&END EACH
&END FORCES
&CELL
&EACH
MD 5
&END EACH
&END CELL
&END PRINT
&END MOTION
```
This simulation was performed with the previous version of CP2K (so without
your fix).
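
The lines tagged `UCX ERROR` in the log above report that `ibv_reg_mr` could not register memory. One common cause of this kind of failure is a low locked-memory limit (`ulimit -l`) on the compute node. That is only an assumption here, since nothing in the log confirms it. A minimal sketch for checking the limit from Python follows; the helper names `memlock_limits` and `format_limit` are hypothetical and not part of CP2K, UCX, or LIBXSMM.

```python
import resource


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of this process in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def format_limit(limit_bytes):
    """Render a limit as 'unlimited' or as a whole number of KiB."""
    if limit_bytes == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit_bytes // 1024} KiB"


if __name__ == "__main__":
    soft, hard = memlock_limits()
    # A finite, small value here would be consistent with failed memory registration.
    print(f"RLIMIT_MEMLOCK soft={format_limit(soft)} hard={format_limit(hard)}")
```

The check is only meaningful when run inside the batch job on the compute node, because the limits there may differ from those on a login node.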
On Friday, 25 October 2024 at 09:50:47 UTC+2, bartosz mazur wrote:
> Hi Frederick,
>
> it helped with most of the tests! Now only 13 fail. In the
> attachments you will find the full output from the regtests, and here is the
> output from a single job with TRACE enabled:
>
> ```
> Loading intel/2024a
> Loading requirement: GCCcore/13.3.0 zlib/1.3.1-GCCcore-13.3.0
> binutils/2.42-GCCcore-13.3.0 intel-compilers/2024.2.0
> numactl/2.0.18-GCCcore-13.3.0 UCX/1.16.0-GCCcore-13.3.0
> impi/2021.13.0-intel-compilers-2024.2.0 imkl/2024.2.0 iimpi/2024a
> imkl-FFTW/2024.2.0-iimpi-2024a
>
> Currently Loaded Modulefiles:
> 1) GCCcore/13.3.0
> 2) zlib/1.3.1-GCCcore-13.3.0
> 3) binutils/2.42-GCCcore-13.3.0
> 4) intel-compilers/2024.2.0
> 5) numactl/2.0.18-GCCcore-13.3.0
> 6) UCX/1.16.0-GCCcore-13.3.0
> 7) impi/2021.13.0-intel-compilers-2024.2.0
> 8) imkl/2024.2.0
> 9) iimpi/2024a
> 10) imkl-FFTW/2024.2.0-iimpi-2024a
> 11) intel/2024a
> 2 MPI processes with 2 OpenMP threads each
> started at Fri Oct 25 09:34:34 CEST 2024 in /lustre/tmp/slurm/3127182
> SIRIUS 7.6.1, git hash:
> https://api.github.com/repos/electronic-structure/SIRIUS/git/ref/tags/v7.6.1
> Warning! Compiled in 'debug' mode with assert statements enabled!
>
>
> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
> CLX/DP TRY JIT STA COL
> 0..13 8 8 0 0
> 14..23 0 0 0 0
> 24..64 0 0 0 0
> Registry and code: 13 MB + 64 KB (gemm=8)
> Command (PID=423503):
> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
> dftd3src1.inp -o dftd3src1.out
> Uptime: 2.752513 s
>
>
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = RANK 0 PID 423503 RUNNING AT r21c01b03
>
> = KILLED BY SIGNAL: 11 (Segmentation fault)
>
> ===================================================================================
>
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = RANK 1 PID 423504 RUNNING AT r21c01b03
>
> = KILLED BY SIGNAL: 9 (Killed)
>
> ===================================================================================
> finished at Fri Oct 25 09:34:39 CEST 2024
> ```
>
> and the last lines:
>
> ```
> 000000:000002<< 13 3 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002>> 13 4 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 13 4 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 12 2 pw_nn_compose_r 0.003 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 11 1 xc_pw_derive 0.003 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002>> 11 5 pw_zero start Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 11 5 pw_zero 0.000 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002>> 11 2 xc_pw_derive start Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002>> 12 3 pw_nn_compose_r start Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002>> 13 5 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 13 5 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002>> 13 6 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 13 6 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 12 3 pw_nn_compose_r 0.002 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 11 2 xc_pw_derive 0.002 Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002>> 11 6 pw_zero start Hostmem: 955 MB GPUmem: 0 MB
> 000000:000002<< 11 6 pw_zero 0.001 Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002>> 11 3 xc_pw_derive start Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002>> 12 4 pw_nn_compose_r start Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002>> 13 7 mp_sendrecv_dm2 start Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002<< 13 7 mp_sendrecv_dm2 0.000 Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002>> 13 8 mp_sendrecv_dm2 start Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002<< 13 8 mp_sendrecv_dm2 0.000 Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002<< 12 4 pw_nn_compose_r 0.002 Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002<< 11 3 xc_pw_derive 0.002 Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002>> 11 1 pw_spline_scale_deriv start Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002<< 11 1 pw_spline_scale_deriv 0.001 Hostmem: 960 MB GPUmem: 0 MB
> 000000:000002>> 11 20 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002<< 11 20 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002>> 11 21 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002<< 11 21 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002>> 11 22 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002<< 11 22 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002>> 11 23 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002<< 11 23 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002>> 11 1 xc_functional_eval start Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002>> 12 1 b97_lda_eval start Hostmem: 965 MB GPUmem: 0 MB
> 000000:000002<< 12 1 b97_lda_eval 0.103 Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002<< 11 1 xc_functional_eval 0.103 Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002<< 10 1 xc_rho_set_and_dset_create 0.120 Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002>> 10 1 check_for_derivatives start Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002<< 10 1 check_for_derivatives 0.000 Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002>> 10 14 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002<< 10 14 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002>> 10 15 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002<< 10 15 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002>> 10 16 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002<< 10 16 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002>> 10 17 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
> 000000:000002<< 10 17 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
> ```
>
> Best
> Bartosz
>
> On Wednesday, 23 October 2024 at 09:15:33 UTC+2, Frederick Stein wrote:
>
>> Dear Bartosz,
>> My fix is merged. Can you switch to the CP2K master and try it again? We
>> are still working on a few issues with the Intel compilers so that we can
>> eventually migrate from ifort to ifx.
>> Best,
>> Frederick
>>
>> On Tuesday, 22 October 2024 at 17:45:21 UTC+2, bartosz mazur wrote:
>>
>>> Great! Thank you for your help.
>>>
>>> Best
>>> Bartosz
>>>
>>> On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:
>>>
>>>> I have a fix for it. In contrast to my first thought, it is a case of
>>>> invalid type conversion from real to complex numbers (yes, Fortran is
>>>> rather strict about it) in pw_derive. This may also be present in a few
>>>> other spots. I am currently running more tests and I will open a pull
>>>> request within the next few days.
>>>> Best,
>>>> Frederick
>>>>
>>>> On Tuesday, 22 October 2024 at 13:12:49 UTC+2, Frederick Stein wrote:
>>>>
>>>>> I can reproduce the error locally. I am investigating it now.
>>>>>
>>>>> On Tuesday, 22 October 2024 at 11:58:57 UTC+2, bartosz mazur wrote:
>>>>>
>>>>>> I was loading it as it was needed for compilation. I have unloaded
>>>>>> the module, but the error still occurs:
>>>>>>
>>>>>> ```
>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>> CLX/DP TRY JIT STA COL
>>>>>> 0..13 2 2 0 0
>>>>>> 14..23 0 0 0 0
>>>>>> 24..64 0 0 0 0
>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>> Command (PID=15485):
>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>>> H2O-9.inp -o H2O-9.out
>>>>>> Uptime: 1.757102 s
>>>>>>
>>>>>>
>>>>>>
>>>>>> ===================================================================================
>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>> = RANK 0 PID 15485 RUNNING AT r30c01b01
>>>>>>
>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>
>>>>>> ===================================================================================
>>>>>>
>>>>>>
>>>>>> ===================================================================================
>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>> = RANK 1 PID 15486 RUNNING AT r30c01b01
>>>>>>
>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>
>>>>>> ===================================================================================
>>>>>> ```
>>>>>>
>>>>>>
>>>>>> and the last 100 lines:
>>>>>>
>>>>>> ```
>>>>>> 000000:000002>> 11 37 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 11 37 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 10 64 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 10 25 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 10 25 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 10 17 pw_axpy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 10 17 pw_axpy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 10 19 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 10 19 mp_sum_d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 10 3 pw_poisson_solve start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 11 3 pw_poisson_rebuild start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 11 3 pw_poisson_rebuild 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 11 65 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 38 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 38 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 11 65 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 11 26 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 11 26 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 11 3 pw_multiply_with start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 11 3 pw_multiply_with 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 11 27 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 11 27 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 11 3 pw_integral_ab start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 20 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 20 mp_sum_d 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 11 3 pw_integral_ab 0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 11 4 pw_poisson_set start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 66 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 13 39 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 13 39 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 66 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 28 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 28 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 7 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 7 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 67 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 13 40 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 13 40 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 67 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 29 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 29 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 8 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 8 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 68 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 13 41 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 13 41 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 68 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 30 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002<< 12 30 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>> 000000:000002>> 12 9 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>> ```
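
A trace tail like the one above can be read mechanically: every `>>` line marks a routine being entered and every `<<` line marks it returning, so the routines that were entered but never returned show where the run stopped. The sketch below assumes that the first integer after the marker is the call-stack depth and the second is a per-routine call counter. The function name `open_routines` is hypothetical and not part of CP2K. It expects the raw output file (without the e-mail quote markers) and tolerates entries wrapped over several lines.

```python
import re

# rank prefix, marker, stack depth, call counter, routine name
TRACE_RE = re.compile(r"\d+:\d+(>>|<<)\s+(\d+)\s+(\d+)\s+(\S+)")


def open_routines(trace_text):
    """Return (depth, name) for routines entered but not exited in the excerpt."""
    stack = []
    for match in TRACE_RE.finditer(trace_text):
        marker, depth, name = match.group(1), int(match.group(2)), match.group(4)
        if marker == ">>":
            stack.append((depth, name))
        elif stack and stack[-1] == (depth, name):
            stack.pop()
        # An exit with no matching entry was entered before the excerpt began.
    return stack


if __name__ == "__main__":
    sample = """\
000000:000002>> 10 3 pw_poisson_solve start Hostmem: 697 MB GPUmem: 0 MB
000000:000002>> 11 4 pw_poisson_set start Hostmem: 697 MB GPUmem: 0 MB
000000:000002>> 12 8 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
000000:000002<< 12 8 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
000000:000002>> 12 9
pw_derive start Hostmem: 697 MB GPUmem: 0 MB
"""
    for depth, name in open_routines(sample):
        print(depth, name)
```

For the psmp excerpt above, the calls still open at the end are `pw_poisson_solve`, `pw_poisson_set`, and `pw_derive`. Only the tail of the trace is available, so callers entered before the excerpt begins are not recovered.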
>>>>>>
>>>>>> This is the list of currently loaded modules (all come with intel):
>>>>>>
>>>>>> ```
>>>>>> Currently Loaded Modulefiles:
>>>>>> 1) GCCcore/13.3.0
>>>>>> 2) zlib/1.3.1-GCCcore-13.3.0
>>>>>> 3) binutils/2.42-GCCcore-13.3.0
>>>>>> 4) intel-compilers/2024.2.0
>>>>>> 5) numactl/2.0.18-GCCcore-13.3.0
>>>>>> 6) UCX/1.16.0-GCCcore-13.3.0
>>>>>> 7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>> 8) imkl/2024.2.0
>>>>>> 9) iimpi/2024a
>>>>>> 10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>> 11) intel/2024a
>>>>>> ```
>>>>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein wrote:
>>>>>>
>>>>>>> Dear Bartosz,
>>>>>>> I am currently running some tests with the latest Intel compiler
>>>>>>> myself. What bothers me about your setup is the module GCC13/13.3.0. Why
>>>>>>> is it loaded? Can you unload it? This would at least reduce potential
>>>>>>> interference between the Intel and the GCC compilers.
>>>>>>> Best,
>>>>>>> Frederick
>>>>>>>
>>>>>>> On Monday, 21 October 2024 at 16:33:45 UTC+2, bartosz mazur wrote:
>>>>>>>
>>>>>>>> The error for ssmp is:
>>>>>>>>
>>>>>>>> ```
>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>> 0..13 4 4 0 0
>>>>>>>> 14..23 0 0 0 0
>>>>>>>> 24..64 0 0 0 0
>>>>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>>>>> Command (PID=54845):
>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>> Uptime: 2.861583 s
>>>>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36: 54845
>>>>>>>> Segmentation fault (core dumped)
>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>> ```
>>>>>>>>
>>>>>>>> and the last 100 lines of output:
>>>>>>>>
>>>>>>>> ```
>>>>>>>> 000000:000001>> 12 20 mp_sum_d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 20 mp_sum_d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 11 13 dbcsr_dot_sd 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 10 12 calculate_ptrace_kp 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 9 6 evaluate_core_matrix_traces 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 9 6 rebuild_ks_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 10 6 qs_ks_build_kohn_sham_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 11 140 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 79 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 79 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 11 140 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 11 141 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 80 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 80 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 11 141 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 11 61 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 11 61 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 11 35 pw_axpy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 11 35 pw_axpy 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 11 6 pw_poisson_solve start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 6 pw_poisson_rebuild start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 6 pw_poisson_rebuild 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 142 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 13 81 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 13 81 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 142 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 62 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 62 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 6 pw_multiply_with start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 6 pw_multiply_with 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 63 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 63 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 6 pw_integral_ab start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 12 6 pw_integral_ab 0.005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 12 7 pw_poisson_set start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 13 143 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 14 82 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 14 82 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 13 143 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 13 64 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 13 64 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 13 16 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 13 16 pw_derive 0.006 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 13 144 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 14 83 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 14 83 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 13 144 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 13 65 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001<< 13 65 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> 000000:000001>> 13 17 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> ```
>>>>>>>>
>>>>>>>> For psmp, the last 100 lines are:
>>>>>>>>
>>>>>>>> ```
>>>>>>>> 000000:000002<< 9 7 evaluate_core_matrix_traces 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 9 7 rebuild_ks_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 10 7 qs_ks_build_kohn_sham_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 11 164 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 93 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 12 93 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 11 164 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 11 165 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 94 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 12 94 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 11 165 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 11 73 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 11 73 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 11 41 pw_axpy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 11 41 pw_axpy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 11 52 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 11 52 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 11 7 pw_poisson_solve start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 7 pw_poisson_rebuild start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 12 7 pw_poisson_rebuild 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 166 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 13 95 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 13 95 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 12 166 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 74 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 12 74 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 7 pw_multiply_with start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 12 7 pw_multiply_with 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 75 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 12 75 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 7 pw_integral_ab start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 13 53 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 13 53 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 12 7 pw_integral_ab 0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 12 8 pw_poisson_set start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 13 167 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 14 96 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 14 96 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 13 167 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 13 76 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> ```
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Bartosz
>>>>>>>>
>>>>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick Stein
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Dear Bartosz,
>>>>>>>>> I have no idea about the issue with LibXSMM.
>>>>>>>>> Regarding the trace, I do not know either, as there is not much
>>>>>>>>> that could break in pw_derive (it just performs multiplications) and the
>>>>>>>>> sequence of operations is too unspecific. It may be that the code actually
>>>>>>>>> breaks somewhere else. Can you do the same with the ssmp version and post
>>>>>>>>> the last 100 lines? This way, we avoid the asynchronicity issues that
>>>>>>>>> affect backtraces with the psmp version.
>>>>>>>>> Best,
>>>>>>>>> Frederick
>>>>>>>>>
>>>>>>>>> On Sunday, 20 October 2024 at 16:47:15 UTC+2, bartosz mazur
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> The error is:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>>>> 0..13 2 2 0 0
>>>>>>>>>> 14..23 0 0 0 0
>>>>>>>>>> 24..64 0 0 0 0
>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>> Command (PID=2607388): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>>> Uptime: 5.288243 s
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>> = RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>> ===================================================================================
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>> = RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>> ===================================================================================
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> and the last 20 lines:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick Stein
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Please pick one of the failing tests, add the TRACE keyword to
>>>>>>>>>>> the &GLOBAL section, and run the test manually. This increases the size
>>>>>>>>>>> of the output file dramatically (to several million lines).
>>>>>>>>>>> Can you send me the last ~20 lines of the output?
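
A minimal sketch of the change described above, assuming the test input already has a &GLOBAL section. The placeholder comment stands for whatever keywords that input already contains; it is illustrative and not taken from any actual test file:

```
&GLOBAL
  ! ... existing keywords of the test input stay unchanged ...
  TRACE    ! write a trace line on every routine entry and exit
&END GLOBAL
```

Because the trace can grow to millions of lines, only the tail of the output file is needed to see which routine was entered last before the crash.
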
>>>>>>>>>>> On Friday, 18 October 2024 at 17:09:40 UTC+2, bartosz mazur
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I
>>>>>>>>>>>> assume it makes no difference. As I mentioned in my previous message, with
>>>>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp
>>>>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the
>>>>>>>>>>>> same setting; I attach an example output.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>
>>>>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick Stein
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>> What happens if you set the number of OpenMP threads to 1 (add
>>>>>>>>>>>>> '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of the
>>>>>>>>>>>>> ssmp?
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Friday, 18 October 2024 at 15:37:43 UTC+2, bartosz mazur
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks again for your help. I have tested different simulation
>>>>>>>>>>>>>> variants and found that the problem occurs when using OMP. For MPI
>>>>>>>>>>>>>> calculations without OMP all tests pass. I have also tested the effect of
>>>>>>>>>>>>>> the `OMP_PROC_BIND` and `OMP_PLACES` parameters: apart
>>>>>>>>>>>>>> from the effect on simulation time, they have no significant effect on the
>>>>>>>>>>>>>> presence of errors. Below are the results for ssmp:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
>>>>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and psmp:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any ideas what I could do next to get more information about
>>>>>>>>>>>>>> the source of the problem? Or do you perhaps see a potential solution at
>>>>>>>>>>>>>> this stage? I would appreciate any further help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick Stein
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The tests do
>>>>>>>>>>>>>>> not run that efficiently with such a large number of threads; 2 should be
>>>>>>>>>>>>>>> sufficient.
>>>>>>>>>>>>>>> The test results suggest that most of the functionality may
>>>>>>>>>>>>>>> work, but due to a missing backtrace (or similar information), it is hard
>>>>>>>>>>>>>>> to tell why the tests fail. You could also try to run some of the
>>>>>>>>>>>>>>> single-node tests to assess the stability of CP2K.
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 13:48:42 UTC+2, bartosz mazur
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
--
To view this discussion visit https://groups.google.com/d/msgid/cp2k/7042b62f-62de-43ad-ad94-b940977c9e2an%40googlegroups.com.