[CP2K-user] [CP2K:20818] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types
Frederick Stein
f.stein at hzdr.de
Fri Oct 25 09:46:00 UTC 2024
Dear Bartosz,
I will check the other issues with your regtests.
Regarding your latest issue, please provide more information such as an
output file or a hint on the context. If I am supposed to retry the
calculation on my local machine, I need all additional input files such as
your plumed file. I can run your input file up to the point that CP2K needs
plumed.
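While collecting that information, it may be worth checking the locked-memory limit on the compute nodes. A failing `ibv_reg_mr` with "Cannot allocate memory" is often (not always) related to a low `RLIMIT_MEMLOCK`; whether that applies to this run is only an assumption. A minimal sketch, where the helper name `describe_memlock` is hypothetical and the script would have to run inside the same batch job environment as CP2K:

```python
import resource


def describe_memlock(limits=None):
    """Summarise the locked-memory limit (RLIMIT_MEMLOCK) of this process.

    `limits` can be passed as a (soft, hard) tuple in bytes for testing;
    by default the limits of the current process are queried.
    """
    if limits is None:
        limits = resource.getrlimit(resource.RLIMIT_MEMLOCK)

    def fmt(value):
        # RLIM_INFINITY is the sentinel for "no limit".
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} KiB"

    soft, hard = limits
    return f"memlock soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    print(describe_memlock())
```

If the reported soft limit is small rather than unlimited, that would be a hint to pass on to the cluster administrators; it does not by itself explain the error.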
Best,
Frederick
bartosz mazur wrote on Friday, 25 October 2024 at 10:15:19 UTC+2:
> I just got another error with LibXSMM, now in my regular simulation and
> without using OpenMP. This is the error:
>
> ```
> [1729843139.920274] [r23c01b04:2913 :0] ib_md.c:295 UCX ERROR
> ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed: Cannot
> allocate memory
> [1729843139.920290] [r23c01b04:2913 :0] ucp_mm.c:70 UCX ERROR
> failed to register address 0x14f0b46fc080 (host) length 7424 on
> md[4]=mlx5_0: Input/output error (md supports: host)
>
> LIBXSMM_VERSION: develop-1.17-3834 (25693946)[1729843139.932647]
> [r23c01b04:2945 :0] ib_md.c:295 UCX ERROR
> ibv_reg_mr(address=0x1491f069e040, length=8128, access=0xf) failed: Cannot
> allocate memory
> [1729843139.932660] [r23c01b04:2945 :0] ucp_mm.c:70 UCX ERROR
> failed to register address 0x1491f069e040 (host) length 8128 on
> md[4]=mlx5_0: Input/output error (md supports: host)
>
>
> CLX/DP TRY JIT STA COL
> 0..13 4 4 0 0
> 14..23 4 4 0 0
>
> 24..64 0 0 0 0
> Registry and code: 13 MB + 80 KB (gemm=8)
> Command (PID=2913):
> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
> cp2k.inp -o cp2k.out
> Uptime: 407633.177169 s
> ```
>
> and this is the simulation input I'm using:
>
> ```
> &GLOBAL
> PROJECT uam1o_npt_rms
> RUN_TYPE MD
> PRINT_LEVEL LOW
> PREFERRED_DIAG_LIBRARY SCALAPACK
> &END GLOBAL
>
> &FORCE_EVAL
> METHOD QUICKSTEP
> STRESS_TENSOR ANALYTICAL
> &DFT
> BASIS_SET_FILE_NAME BASIS_MOLOPT_UZH
> POTENTIAL_FILE_NAME POTENTIAL_UZH
> &MGRID
> CUTOFF 500
> &END MGRID
> &XC
> &XC_FUNCTIONAL PBE
> &END XC_FUNCTIONAL
> &VDW_POTENTIAL
> POTENTIAL_TYPE PAIR_POTENTIAL
> &PAIR_POTENTIAL
> TYPE DFTD3(BJ)
> PARAMETER_FILE_NAME dftd3.dat
> REFERENCE_FUNCTIONAL PBE
> R_CUTOFF 25.0
> &END PAIR_POTENTIAL
> &END VDW_POTENTIAL
> &END XC
> &END DFT
>
> &SUBSYS
> &CELL
> A 12.2807999 0.0000000 0.0000000
> B 7.6258602 9.6257200 0.0000000
> C -2.1557724 -1.0420258 18.0042801
> &END CELL
> &COORD
> Zn 11.37811 4.60286 0.24515
> Zn 8.15435 3.05288 8.74518
> Zn 6.37590 3.97311 17.74650
> Zn 9.59842 5.54014 9.24747
> S 11.79344 6.72692 17.10850
> S 4.06825 3.00573 9.90358
> S 5.95830 1.84422 0.90027
> S 13.67407 5.58944 8.10767
> O 10.72408 3.58291 1.89315
> O 8.51986 4.01962 1.53085
> O 6.60135 3.91587 7.68572
> O 7.74637 5.79259 8.21600
> O 15.32810 8.58246 5.10041
> O 9.35608 2.93551 7.09500
> O 10.38999 4.93007 7.45977
> O 11.66491 6.35111 1.31266
> O 9.48582 6.62478 0.77364
> O 2.59062 2.40094 3.91496
> O 7.03031 4.99173 16.09885
> O 9.23544 4.56122 16.46252
> O 11.14602 4.67776 10.31440
> O 10.00982 2.79915 9.77218
> O 2.41388 0.01898 12.91899
> O 8.39375 5.66143 10.89628
> O 7.36998 3.66087 10.53589
> O 6.08863 2.22161 16.68336
> O 8.26988 1.95313 17.21650
> O 15.16937 6.16381 14.09906
> N 13.25907 3.80728 0.04001
> N 2.36335 -0.74130 17.33402
> N 7.60676 1.08576 8.95623
> N 15.77729 5.75974 9.67861
> N 4.49430 4.76652 17.95756
> N 15.38873 9.31230 0.67467
> N 10.14308 7.50848 9.04236
> N 1.96529 2.83557 8.33233
> C 6.76554 5.18292 7.68414
> C 14.28210 4.11624 0.86006
> C 9.47998 3.39622 2.09658
> C 3.20112 3.42080 0.84626
> C 9.91466 1.18589 3.17244
> C 9.08210 2.29987 3.02657
> C 5.74710 6.04945 7.01821
> C 7.83265 2.30920 3.66005
> C 3.35793 2.34328 -0.04029
> C 4.51663 1.46385 -0.02755
> C 16.24194 7.75266 5.73606
> C 4.78940 5.52817 6.14198
> C 7.40810 1.21174 4.39947
> C 16.18016 6.38244 5.49010
> C 9.48869 0.06986 3.88005
> C 11.27238 1.77457 17.14330
> C 5.77166 7.43009 7.27236
> C 11.14819 8.24901 17.58588
> C 8.22170 0.08058 4.47135
> C 0.15087 1.02286 17.07544
> C 17.16180 8.28565 6.64351
> C 10.57067 7.01060 1.31282
> C 6.72654 0.47459 8.14002
> C 10.27972 3.79035 6.89470
> C 14.15006 8.72843 8.15880
> C 11.73751 2.06868 5.82537
> C 11.38838 3.41515 5.96966
> C 10.52304 8.34339 1.98566
> C 12.16584 4.39562 5.33967
> C 14.89762 7.93801 9.04648
> C 14.86698 6.48365 9.03575
> C 2.67167 1.17044 3.27681
> C 11.52468 8.76552 2.86608
> C 13.29140 4.04007 4.60622
> C 3.78230 0.36534 3.52266
> C 12.87823 1.70260 5.12344
> C 8.27761 0.34001 9.85941
> C 9.42677 9.18364 1.73295
> C 3.27553 4.45658 9.42657
> C 13.66559 2.69775 4.53650
> C 15.77023 8.59069 9.93240
> C 1.68356 0.78491 2.36643
> C 10.98451 3.41041 10.31327
> C 3.46873 4.45681 17.14097
> C 8.27403 5.18373 15.89814
> C 14.54907 5.15099 17.15930
> C 7.83119 7.39584 14.82858
> C 8.66916 6.28563 14.97331
> C 11.99928 2.54577 10.98702
> C 9.92072 6.28547 14.34388
> C 16.54982 7.26986 0.04271
> C 15.39103 8.14919 0.03189
> C 1.50023 0.84646 12.27989
> C 12.95126 3.06908 11.86817
> C 10.34198 7.38826 13.61070
> C 1.55836 2.21699 12.52561
> C 8.25354 8.51697 14.12666
> C 6.48249 6.79770 0.85630
> C 11.97760 1.16465 10.73446
> C 6.60385 0.32218 0.42301
> C 9.52282 8.51550 13.54043
> C 17.60321 7.54791 0.92891
> C 0.58530 0.31102 11.36884
> C 7.18362 1.56332 16.68291
> C 11.01926 8.11905 9.86341
> C 7.47582 4.80132 11.10039
> C 3.59282 -0.13430 9.84955
> C 6.01179 6.51430 12.17471
> C 6.36853 5.17005 12.02942
> C 7.23131 0.22715 16.01652
> C 5.59963 4.18477 12.66234
> C 2.84614 0.65728 8.96213
> C 2.87561 2.11161 8.97508
> C 15.08536 7.39548 14.73440
> C 6.23001 -0.19920 15.13769
> C 4.47482 4.53325 13.40042
> C 13.97400 8.19851 14.48576
> C 4.87173 6.87322 12.88120
> C 9.47231 8.25578 8.14046
> C 8.32790 -0.61137 16.27301
> C 14.46698 4.13864 8.58475
> C 4.09294 5.87331 13.47165
> C 1.97640 0.00563 8.07267
> C 16.07240 7.78504 15.64417
> H 14.10215 4.93465 1.55678
> H 3.98110 3.68721 1.55899
> H 10.89072 1.19647 2.69205
> H 7.19958 3.19021 3.56839
> H 4.75923 4.45384 5.96230
> H 6.45299 1.21835 4.92062
> H 15.44211 6.00062 4.78824
> H 17.75043 8.81610 3.97156
> H 10.41563 1.57993 16.49923
> H 6.49332 7.81303 7.99143
> H 0.24800 0.19739 16.37425
> H 9.53586 -0.26872 6.84508
> H 6.19685 1.12218 7.44173
> H 13.45550 8.28133 7.44815
> H 11.11633 1.31384 6.30260
> H 11.87413 5.44074 5.42962
> H 12.38442 8.12016 3.04474
> H 13.88694 4.78876 4.08791
> H 4.53915 0.70283 4.22717
> H 0.88557 0.65625 5.03328
> H 8.96418 0.89159 10.50060
> H 8.67994 8.85961 1.01083
> H 16.35704 8.00331 10.63471
> H 13.12606 1.45212 2.16563
> H 3.64702 3.63930 16.44281
> H 13.76743 4.88477 16.44833
> H 6.85355 7.37827 15.30535
> H 10.55820 5.40745 14.43410
> H 12.97886 4.14375 12.04672
> H 11.29905 7.38966 13.09313
> H 2.29216 2.60091 13.23073
> H -0.01303 -0.23279 14.03603
> H 7.34113 6.99275 1.49776
> H 11.26049 0.78023 10.01184
> H 17.50743 8.37258 1.63130
> H 8.21398 8.86531 11.16822
> H 11.54834 7.47018 10.56097
> H 4.28503 0.31205 10.56295
> H 6.62643 7.27289 11.69479
> H 5.89748 3.14154 12.57118
> H 5.36986 0.44461 14.95599
> H 3.88656 3.78035 13.92095
> H 13.21826 7.85764 13.78163
> H 16.85773 7.91771 12.97237
> H 8.78884 7.70469 7.49554
> H 9.07452 -0.28399 16.99402
> H 1.39009 0.59398 7.37083
> H 4.63062 7.11938 15.84758
> &END COORD
> &KIND Zn
> BASIS_SET TZVP-MOLOPT-PBE-GTH-q12
> POTENTIAL GTH-PBE-q12
> &END KIND
> &KIND S
> BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
> POTENTIAL GTH-PBE-q6
> &END KIND
> &KIND O
> BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
> POTENTIAL GTH-PBE-q6
> &END KIND
> &KIND N
> BASIS_SET TZVP-MOLOPT-PBE-GTH-q5
> POTENTIAL GTH-PBE-q5
> &END KIND
> &KIND C
> BASIS_SET TZVP-MOLOPT-PBE-GTH-q4
> POTENTIAL GTH-PBE-q4
> &END KIND
> &KIND H
> BASIS_SET TZVP-MOLOPT-PBE-GTH-q1
> POTENTIAL GTH-PBE-q1
> &END KIND
> &END SUBSYS
> &END FORCE_EVAL
>
> &MOTION
> &MD
> ENSEMBLE NPT_I
> TEMPERATURE 298
> TIMESTEP 1.0
> STEPS 50000
> &THERMOSTAT
> TYPE NOSE
> &NOSE
> LENGTH 3
> YOSHIDA 3
> TIMECON 1000
> &END NOSE
> &END THERMOSTAT
> &BAROSTAT
> PRESSURE 1.0
> TIMECON 4000
> &END BAROSTAT
> &END MD
> &FREE_ENERGY
> METHOD METADYN
> &METADYN
> USE_PLUMED .TRUE.
> PLUMED_INPUT_FILE plumed.dat
> &END METADYN
> &END FREE_ENERGY
> &PRINT
> &TRAJECTORY
> &EACH
> MD 5
> &END EACH
> &END TRAJECTORY
> &FORCES
> UNIT eV*angstrom^-1
> &EACH
> MD 5
> &END EACH
> &END FORCES
> &CELL
> &EACH
> MD 5
> &END EACH
> &END CELL
> &END PRINT
> &END MOTION
> ```
>
> This simulation was performed with a previous version of CP2K (so without
> your fix).
> On Friday, 25 October 2024 at 09:50:47 UTC+2, bartosz mazur wrote:
>
>> Hi Frederick,
>>
>> it helped with most of the tests! Now only 13 have failed. In the
>> attachments you will find the full output from the regtests, and here is
>> the output from a single job with TRACE enabled:
>>
>> ```
>> Loading intel/2024a
>> Loading requirement: GCCcore/13.3.0 zlib/1.3.1-GCCcore-13.3.0
>> binutils/2.42-GCCcore-13.3.0 intel-compilers/2024.2.0
>> numactl/2.0.18-GCCcore-13.3.0 UCX/1.16.0-GCCcore-13.3.0
>> impi/2021.13.0-intel-compilers-2024.2.0 imkl/2024.2.0 iimpi/2024a
>> imkl-FFTW/2024.2.0-iimpi-2024a
>>
>> Currently Loaded Modulefiles:
>> 1) GCCcore/13.3.0
>> 2) zlib/1.3.1-GCCcore-13.3.0
>> 3) binutils/2.42-GCCcore-13.3.0
>> 4) intel-compilers/2024.2.0
>> 5) numactl/2.0.18-GCCcore-13.3.0
>> 6) UCX/1.16.0-GCCcore-13.3.0
>> 7) impi/2021.13.0-intel-compilers-2024.2.0
>> 8) imkl/2024.2.0
>> 9) iimpi/2024a
>> 10) imkl-FFTW/2024.2.0-iimpi-2024a
>> 11) intel/2024a
>> 2 MPI processes with 2 OpenMP threads each
>> started at Fri Oct 25 09:34:34 CEST 2024 in /lustre/tmp/slurm/3127182
>> SIRIUS 7.6.1, git hash:
>> https://api.github.com/repos/electronic-structure/SIRIUS/git/ref/tags/v7.6.1
>> Warning! Compiled in 'debug' mode with assert statements enabled!
>>
>>
>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>> CLX/DP TRY JIT STA COL
>> 0..13 8 8 0 0
>> 14..23 0 0 0 0
>> 24..64 0 0 0 0
>> Registry and code: 13 MB + 64 KB (gemm=8)
>> Command (PID=423503):
>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>> dftd3src1.inp -o dftd3src1.out
>> Uptime: 2.752513 s
>>
>>
>>
>> ===================================================================================
>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> = RANK 0 PID 423503 RUNNING AT r21c01b03
>>
>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>
>> ===================================================================================
>>
>>
>> ===================================================================================
>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> = RANK 1 PID 423504 RUNNING AT r21c01b03
>>
>> = KILLED BY SIGNAL: 9 (Killed)
>>
>> ===================================================================================
>> finished at Fri Oct 25 09:34:39 CEST 2024
>> ```
>>
>> and the last lines:
>>
>> ```
>> 000000:000002<< 13 3 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002>> 13 4 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 13 4 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 12 2 pw_nn_compose_r 0.003 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 11 1 xc_pw_derive 0.003 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002>> 11 5 pw_zero start Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 11 5 pw_zero 0.000 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002>> 11 2 xc_pw_derive start Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002>> 12 3 pw_nn_compose_r start Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002>> 13 5 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 13 5 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002>> 13 6 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 13 6 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 12 3 pw_nn_compose_r 0.002 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 11 2 xc_pw_derive 0.002 Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002>> 11 6 pw_zero start Hostmem: 955 MB GPUmem: 0 MB
>> 000000:000002<< 11 6 pw_zero 0.001 Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002>> 11 3 xc_pw_derive start Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002>> 12 4 pw_nn_compose_r start Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002>> 13 7 mp_sendrecv_dm2 start Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002<< 13 7 mp_sendrecv_dm2 0.000 Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002>> 13 8 mp_sendrecv_dm2 start Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002<< 13 8 mp_sendrecv_dm2 0.000 Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002<< 12 4 pw_nn_compose_r 0.002 Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002<< 11 3 xc_pw_derive 0.002 Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002>> 11 1 pw_spline_scale_deriv start Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002<< 11 1 pw_spline_scale_deriv 0.001 Hostmem: 960 MB GPUmem: 0 MB
>> 000000:000002>> 11 20 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002<< 11 20 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002>> 11 21 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002<< 11 21 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002>> 11 22 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002<< 11 22 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002>> 11 23 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002<< 11 23 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002>> 11 1 xc_functional_eval start Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002>> 12 1 b97_lda_eval start Hostmem: 965 MB GPUmem: 0 MB
>> 000000:000002<< 12 1 b97_lda_eval 0.103 Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002<< 11 1 xc_functional_eval 0.103 Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002<< 10 1 xc_rho_set_and_dset_create 0.120 Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002>> 10 1 check_for_derivatives start Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002<< 10 1 check_for_derivatives 0.000 Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002>> 10 14 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002<< 10 14 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002>> 10 15 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002<< 10 15 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002>> 10 16 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002<< 10 16 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002>> 10 17 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>> 000000:000002<< 10 17 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>> ```
>>
>> Best
>> Bartosz
>>
>> On Wednesday, 23 October 2024 at 09:15:33 UTC+2, Frederick Stein wrote:
>>
>>> Dear Bartosz,
>>> My fix is merged. Can you switch to the CP2K master and try it again? We
>>> are still working on a few issues with the Intel compilers so that we can
>>> eventually migrate from ifort to ifx.
>>> Best,
>>> Frederick
>>>
>>> bartosz mazur wrote on Tuesday, 22 October 2024 at 17:45:21 UTC+2:
>>>
>>>> Great! Thank you for your help.
>>>>
>>>> Best
>>>> Bartosz
>>>>
>>>> On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:
>>>>
>>>>> I have a fix for it. In contrast to my first thought, it is a case of
>>>>> invalid type conversion from real to complex numbers (yes, Fortran is
>>>>> rather strict about it) in pw_derive. This may also be present in a few
>>>>> other spots. I am currently running more tests and I will open a pull
>>>>> request within the next few days.
>>>>> Best,
>>>>> Frederick
>>>>>
>>>>> Frederick Stein wrote on Tuesday, 22 October 2024 at 13:12:49 UTC+2:
>>>>>
>>>>>> I can reproduce the error locally. I am investigating it now.
>>>>>>
>>>>>> bartosz mazur wrote on Tuesday, 22 October 2024 at 11:58:57 UTC+2:
>>>>>>
>>>>>>> I was loading it as it was needed for compilation. I have unloaded
>>>>>>> the module, but the error still occurs:
>>>>>>>
>>>>>>> ```
>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>> 0..13 2 2 0 0
>>>>>>> 14..23 0 0 0 0
>>>>>>> 24..64 0 0 0 0
>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>> Command (PID=15485):
>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>> Uptime: 1.757102 s
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>> = RANK 0 PID 15485 RUNNING AT r30c01b01
>>>>>>>
>>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>>
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>> = RANK 1 PID 15486 RUNNING AT r30c01b01
>>>>>>>
>>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> ```
>>>>>>>
>>>>>>>
>>>>>>> and the last 100 lines:
>>>>>>>
>>>>>>> ```
>>>>>>> 000000:000002>> 11 37 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 37 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 64 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 25 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 25 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 17 pw_axpy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 17 pw_axpy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 19 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 19 mp_sum_d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 3 pw_poisson_solve start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 3 pw_poisson_rebuild start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 3 pw_poisson_rebuild 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 65 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 38 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 38 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 65 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 26 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 26 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 3 pw_multiply_with start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 3 pw_multiply_with 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 27 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 27 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 3 pw_integral_ab start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 20 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 20 mp_sum_d 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 3 pw_integral_ab 0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 4 pw_poisson_set start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 66 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 13 39 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 13 39 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 66 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 28 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 28 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 7 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 7 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 67 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 13 40 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 13 40 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 67 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 29 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 29 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 8 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 8 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 68 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 13 41 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 13 41 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 68 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 30 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 30 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 9 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> ```
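To read the tail of a trace like the one above, it helps to list the routines that were entered but never left, since the innermost one is where the process most likely died. The following is a minimal sketch, assuming (from the excerpts in this thread, not from the CP2K documentation) that `>>` marks entering a routine and `<<` marks leaving it, and that the numbers after the marker are the call depth and a call counter. The function name `open_routines` is hypothetical. It tolerates mail quote prefixes and the line wrapping seen in this thread.

```python
import re

# Leading mail quote markers such as ">>>>>>> " at the start of a line.
QUOTE_PREFIX = re.compile(r"^(?:>+ ?)")
# tag, direction, depth, call counter, routine name (assumed field order).
EVENT = re.compile(r"(\d{6}:\d{6})(>>|<<)\s+(\d+)\s+(\d+)\s+(\S+)")


def open_routines(trace_text):
    """Return {tag: [(depth, routine), ...]} for routines entered but not left."""
    lines = [QUOTE_PREFIX.sub("", line) for line in trace_text.splitlines()]
    text = "\n".join(lines)
    stacks = {}
    for tag, direction, depth, _count, name in EVENT.findall(text):
        stack = stacks.setdefault(tag, [])
        frame = (int(depth), name)
        if direction == ">>":
            stack.append(frame)
        elif stack and stack[-1] == frame:
            # Exits without a matching entry (routine entered before the
            # excerpt starts) are ignored.
            stack.pop()
    return stacks


if __name__ == "__main__":
    sample = """\
>>>>>>> 000000:000002>> 12 30 pw_copy
>>>>>>> start Hos
>>>>>>> tmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 30 pw_copy
>>>>>>> 0.001 Hos
>>>>>>> tmem: 697 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 9 pw_derive
>>>>>>> start H
>>>>>>> ostmem: 697 MB GPUmem: 0 MB
"""
    print(open_routines(sample))
```

Applied to the last three entries of the excerpt above, the only routine left open is `pw_derive` at depth 12. Because only the last 100 lines are available, callers that were entered earlier do not show up in the result.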
>>>>>>>
>>>>>>> This is the list of currently loaded modules (all come with intel):
>>>>>>>
>>>>>>> ```
>>>>>>> Currently Loaded Modulefiles:
>>>>>>> 1) GCCcore/13.3.0
>>>>>>> 2) zlib/1.3.1-GCCcore-13.3.0
>>>>>>> 3) binutils/2.42-GCCcore-13.3.0
>>>>>>> 4) intel-compilers/2024.2.0
>>>>>>> 5) numactl/2.0.18-GCCcore-13.3.0
>>>>>>> 6) UCX/1.16.0-GCCcore-13.3.0
>>>>>>> 7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>>> 8) imkl/2024.2.0
>>>>>>> 9) iimpi/2024a
>>>>>>> 10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>>> 11) intel/2024a
>>>>>>> ```
>>>>>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein wrote:
>>>>>>>
>>>>>>>> Dear Bartosz,
>>>>>>>> I am currently running some tests with the latest Intel compiler
>>>>>>>> myself. What bothers me about your setup is the module GCCcore/13.3.0. Why
>>>>>>>> is it loaded? Can you unload it? This would at least reduce potential
>>>>>>>> interference between the Intel and the GCC compilers.
>>>>>>>> Best,
>>>>>>>> Frederick
>>>>>>>>
>>>>>>>> bartosz mazur wrote on Monday, 21 October 2024 at 16:33:45 UTC+2:
>>>>>>>>
>>>>>>>>> The error for ssmp is:
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>>> 0..13 4 4 0 0
>>>>>>>>> 14..23 0 0 0 0
>>>>>>>>> 24..64 0 0 0 0
>>>>>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>>>>>> Command (PID=54845):
>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>> Uptime: 2.861583 s
>>>>>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36:
>>>>>>>>> 54845 Segmentation fault (core dumped)
>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> and the last 100 lines of output:
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> 000000:000001>> 12 20 mp_sum_d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 20 mp_sum_d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 11 13 dbcsr_dot_sd 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 10 12 calculate_ptrace_kp 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 9 6 evaluate_core_matrix_traces 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 9 6 rebuild_ks_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 10 6 qs_ks_build_kohn_sham_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 11 140 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 79 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 79 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 11 140 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 11 141 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 80 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 80 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 11 141 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 11 61 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 11 61 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 11 35 pw_axpy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 11 35 pw_axpy 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 11 6 pw_poisson_solve start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 6 pw_poisson_rebuild start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 6 pw_poisson_rebuild 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 142 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 13 81 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 13 81 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 142 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 62 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 62 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 6 pw_multiply_with start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 6 pw_multiply_with 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 63 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 63 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 6 pw_integral_ab start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 12 6 pw_integral_ab 0.005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 12 7 pw_poisson_set start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 13 143 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 14 82 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 14 82 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 13 143 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 13 64 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 13 64 pw_copy 0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 13 16 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 13 16 pw_derive 0.006 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 13 144 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 14 83 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 14 83 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 13 144 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 13 65 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001<< 13 65 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> 000000:000001>> 13 17 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> For psmp, the last 100 lines are:
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> 000000:000002<< 9 7
>>>>>>>>> evaluate_core_matrix_traces
>>>>>>>>> 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 9 7 rebuild_ks_matrix
>>>>>>>>> start Ho
>>>>>>>>>
>>>>>>>>> stmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 10 7
>>>>>>>>> qs_ks_build_kohn_sham_matrix
>>>>>>>>> start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 11 164
>>>>>>>>> pw_pool_create_pw st
>>>>>>>>> art Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 93
>>>>>>>>> pw_create_c1d sta
>>>>>>>>> rt Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 12 93
>>>>>>>>> pw_create_c1d 0.0
>>>>>>>>> 00 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 11 164 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 11 165 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 94 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 12 94 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 11 165 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 11 73 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 11 73 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 11 41 pw_axpy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 11 41 pw_axpy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 11 52 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 11 52 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 11 7 pw_poisson_solve start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 7 pw_poisson_rebuild start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 12 7 pw_poisson_rebuild 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 166 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 13 95 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 13 95 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 12 166 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 74 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 12 74 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 7 pw_multiply_with start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 12 7 pw_multiply_with 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 75 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 12 75 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 7 pw_integral_ab start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 13 53 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 13 53 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 12 7 pw_integral_ab 0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 12 8 pw_poisson_set start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 13 167 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 14 96 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 14 96 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 13 167 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 13 76 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Bartosz
>>>>>>>>>
>>>>>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick Stein wrote:
>>>>>>>>>
>>>>>>>>>> Dear Bartosz,
>>>>>>>>>> I have no idea about the issue with LibXSMM.
>>>>>>>>>> Regarding the trace, I do not know either, as there is not much
>>>>>>>>>> that could break in pw_derive (it just performs multiplications) and the
>>>>>>>>>> sequence of operations is too unspecific. It may be that the code actually
>>>>>>>>>> breaks somewhere else. Can you do the same with the ssmp and post the last
>>>>>>>>>> 100 lines? This way, we remove the asynchronicity issues for backtraces
>>>>>>>>>> with the psmp version.
>>>>>>>>>> Best,
>>>>>>>>>> Frederick
>>>>>>>>>>
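One way to read a trace tail like the one quoted above is to pair each `>>` (entry) record with its `<<` (exit) record and list the routines that were still open when the output stops. The following is only a sketch under stated assumptions: the record layout (rank prefix, marker, stack depth, call count, routine name) is inferred from the quoted excerpt, the helper name `open_routines` is hypothetical, and the code expects the unwrapped lines of the output file, not the email-wrapped ones. With several ranks writing to the same file, the records would additionally have to be grouped by their rank prefix.

```python
import re

# Assumed record layout, inferred from the quoted trace:
#   "<rank>:<n>>> <depth> <call#> <routine> start ..."    on entry
#   "<rank>:<n><< <depth> <call#> <routine> <seconds> ..." on exit
RECORD = re.compile(r"^\d+:\d+(>>|<<)\s+(\d+)\s+\d+\s+(\S+)")


def open_routines(lines):
    """Return routines that were entered but never exited, outermost first."""
    stack = []
    for line in lines:
        match = RECORD.match(line.strip())
        if not match:
            continue  # not a trace record
        marker, depth, name = match.group(1), int(match.group(2)), match.group(3)
        if marker == ">>":
            stack.append((depth, name))
        elif stack and stack[-1] == (depth, name):
            stack.pop()  # matching exit for the innermost open routine
        # Exits whose entry lies before the excerpt are ignored.
    return [name for _, name in stack]


if __name__ == "__main__":
    # Abridged records taken from the trace quoted in this thread.
    tail = [
        "000000:000002>> 12 8 pw_poisson_set start Hostmem: 693 MB GPUmem: 0 MB",
        "000000:000002>> 13 76 pw_copy start Hostmem: 693 MB GPUmem: 0 MB",
        "000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB",
        "000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB",
        "000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB",
        "000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB",
    ]
    print(open_routines(tail))
```

The list only shows what the last flushed records say; as noted above, the code may actually break somewhere else.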
>>>>>>>>>> On Sunday, 20 October 2024 at 16:47:15 UTC+2, bartosz mazur wrote:
>>>>>>>>>>
>>>>>>>>>>> The error is:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>>>>> 0..13 2 2 0 0
>>>>>>>>>>> 14..23 0 0 0 0
>>>>>>>>>>>
>>>>>>>>>>> 24..64 0 0 0 0
>>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>>> Command (PID=2607388):
>>>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>>>> Uptime: 5.288243 s
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================================
>>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>> = RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>>>>>
>>>>>>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================================
>>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>> = RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================================
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> and the last 20 lines:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Please pick one of the failing tests, add the TRACE keyword to the
>>>>>>>>>>>> &GLOBAL section, and run the test manually. This increases the size of
>>>>>>>>>>>> the output file dramatically (to several million lines).
>>>>>>>>>>>> Can you send me the last ~20 lines of the output?
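As a minimal sketch of the first step, the following Python helper adds a TRACE line to the &GLOBAL section of an input file if it is not already there. The function name `add_trace_keyword` and the sample input are illustrative assumptions and not part of CP2K or its regtest tooling; editing the input by hand works just as well.

```python
def add_trace_keyword(input_text):
    """Return the input text with TRACE added to the &GLOBAL section."""
    out = []
    depth = 0          # nesting depth inside &GLOBAL (0 means outside)
    has_trace = False
    for line in input_text.splitlines():
        tokens = line.split()
        first = tokens[0].upper() if tokens else ""
        if depth == 0:
            if first == "&GLOBAL":
                depth = 1
                has_trace = False
        elif first.startswith("&END"):
            depth -= 1
            if depth == 0 and not has_trace:
                out.append("  TRACE")  # insert just before "&END GLOBAL"
        elif first.startswith("&"):
            depth += 1                 # subsection nested in &GLOBAL
        elif depth == 1 and first == "TRACE":
            has_trace = True           # keyword already present
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical minimal input, for illustration only.
    sample = (
        "&GLOBAL\n"
        "  PROJECT H2O-9\n"
        "  RUN_TYPE MD\n"
        "&END GLOBAL\n"
    )
    print(add_trace_keyword(sample), end="")
```

Applying the helper twice leaves the input unchanged, so it is safe to run on a file that already contains the keyword.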
>>>>>>>>>>>> On Friday, 18 October 2024 at 17:09:40 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I
>>>>>>>>>>>>> assume it makes no difference. As I mentioned in my previous message, with
>>>>>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp
>>>>>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the
>>>>>>>>>>>>> same setting; I attach an example output.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>> What happens if you set the number of OpenMP threads to 1
>>>>>>>>>>>>>> (add '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of
>>>>>>>>>>>>>> the ssmp?
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Friday, 18 October 2024 at 15:37:43 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks again for your help. I have tested different simulation
>>>>>>>>>>>>>>> variants and found that the problem occurs when using OpenMP. For MPI
>>>>>>>>>>>>>>> calculations without OpenMP, all tests pass. I have also tested the
>>>>>>>>>>>>>>> `OMP_PROC_BIND` and `OMP_PLACES` parameters: apart from their effect
>>>>>>>>>>>>>>> on simulation time, they have no significant effect on the
>>>>>>>>>>>>>>> presence of errors. Below are the results for ssmp:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
>>>>>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and psmp:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do you have any ideas on what I could do next to get more information
>>>>>>>>>>>>>>> about the source of the problem, or do you perhaps see a potential
>>>>>>>>>>>>>>> solution at this stage? I would appreciate any further help.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
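To compare the ssmp runs quoted above more easily, the comma-separated table can be parsed and sorted by the number of failed tests. This is a small sketch; the rows are copied from the quoted table, while the names `parse_results` and `SSMP_TABLE` are made up for illustration.

```python
# Rows copied from the ssmp table quoted in this thread.
SSMP_TABLE = """\
spread, threads, 3850, 4144, 4, 290, 186min
spread, cores, 3831, 4144, 3, 310, 183min
spread, sockets, 3864, 4144, 3, 277, 104min
close, threads, 3879, 4144, 3, 262, 171min
close, cores, 3854, 4144, 0, 290, 168min
close, sockets, 3865, 4144, 3, 276, 104min
master, threads, 4121, 4144, 0, 23, 1002min
master, cores, 4121, 4144, 0, 23, 986min
master, sockets, 3942, 4144, 3, 199, 219min
false, threads, 3918, 4144, 0, 226, 178min
false, cores, 3919, 4144, 3, 222, 176min
false, sockets, 3856, 4144, 4, 284, 104min
"""


def parse_results(table):
    """Parse 'bind, places, correct, total, wrong, failed, time' rows."""
    rows = []
    for line in table.strip().splitlines():
        bind, places, correct, total, wrong, failed, time = (
            field.strip() for field in line.split(",")
        )
        rows.append({
            "bind": bind,
            "places": places,
            "correct": int(correct),
            "total": int(total),
            "wrong": int(wrong),
            "failed": int(failed),
            "minutes": int(time.removesuffix("min")),
        })
    return rows


if __name__ == "__main__":
    # Print the settings from fewest to most failed tests.
    for row in sorted(parse_results(SSMP_TABLE), key=lambda r: r["failed"]):
        share = 100.0 * row["failed"] / row["total"]
        print(f'{row["bind"]:>7} {row["places"]:>8} '
              f'failed={row["failed"]:4d} ({share:4.1f}%) '
              f'time={row["minutes"]}min')
```

Sorting by `minutes` instead shows the run-time cost of each binding choice next to its failure count.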
>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The tests
>>>>>>>>>>>>>>>> do not run that efficiently with such a large number of threads. 2 should
>>>>>>>>>>>>>>>> be sufficient.
>>>>>>>>>>>>>>>> The test results suggest that most of the functionality may
>>>>>>>>>>>>>>>> work, but due to a missing backtrace (or similar information), it is hard to
>>>>>>>>>>>>>>>> tell why the tests fail. You could also try to run some of the single-node
>>>>>>>>>>>>>>>> tests to assess the stability of CP2K.
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 13:48:42 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>