[CP2K-user] [CP2K:20905] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types
bartosz mazur
bamaz.97 at gmail.com
Wed Nov 20 16:01:28 UTC 2024
As for the available disk space, I checked it and at the time there was
enough (about 10 GB free), so I do not understand where the error came
from. As for RAM, the maximum usage for the last task was about 50 GB,
while 2x180 GB was allocated.
```
> sacct -j 3164366 --format=JobID,MaxRSS,AveRSS,MaxVMSize,AveVMSize --units=GB
JobID            MaxRSS     AveRSS  MaxVMSize  AveVMSize
------------ ---------- ---------- ---------- ----------
3164366
3164366.bat+      0.01G      0.01G      0.01G      0.01G
3164366.ext+      0.00G      0.00G      0.00G      0.00G
3164366.0        47.23G     47.06G     48.05G     47.71G
```
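For repeated checks, the peak value can be pulled out of the sacct table with a few lines of Python. This is only a rough sketch: `parse_gb` and `peak_rss` are helper names made up for this example (they are not part of Slurm or CP2K), the sample text is the table above, and the 360 GB total assumes that "2x180 GB" means two allocations of 180 GB each.

```python
# Sketch: find the job step with the largest MaxRSS in `sacct` text output.
# parse_gb and peak_rss are hypothetical helper names used for illustration.

SACCT_OUTPUT = """\
JobID MaxRSS AveRSS MaxVMSize AveVMSize
------------ ---------- ---------- ---------- ----------
3164366
3164366.bat+ 0.01G 0.01G 0.01G 0.01G
3164366.ext+ 0.00G 0.00G 0.00G 0.00G
3164366.0 47.23G 47.06G 48.05G 47.71G
"""


def parse_gb(field):
    """Convert a value such as '47.23G' to a float in GB."""
    return float(field.rstrip("G"))


def peak_rss(text):
    """Return (job_step, max_rss_gb) for the step with the largest MaxRSS."""
    best = None
    for line in text.splitlines()[2:]:  # skip the header and separator rows
        parts = line.split()
        if len(parts) < 2:  # the bare job line carries no memory values
            continue
        step, rss = parts[0], parse_gb(parts[1])
        if best is None or rss > best[1]:
            best = (step, rss)
    return best


step, rss = peak_rss(SACCT_OUTPUT)
allocated_gb = 2 * 180  # assumption: "2x180 GB" means 360 GB in total
print(f"peak MaxRSS: {rss:.2f} GB in step {step} "
      f"({100 * rss / allocated_gb:.1f}% of {allocated_gb} GB)")
```

As far as I understand the sacct documentation, MaxRSS is the maximum over the tasks of a step, so this number is a per-task peak rather than the sum over all ranks.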
Here you can see how memory usage changed over time:
https://hpc-info.kdm.wcss.pl/goto/dz19rInNR?orgId=1.
All input files are attached; the output is available under this link
(because of size limits): https://we.tl/t-g7ObcwaNXn.
I'm not sure about the output files for each rank. Are they created by default?
Best
Bartosz
On Wednesday, 20 November 2024 at 16:28:02 UTC+1, Frederick Stein wrote:
> Dear Bartosz,
> Without actual CP2K input or output files, I can only guess. The first
> Slurm output states "No space left on device", the others "Cannot allocate
> memory". The first suggests that there is not enough space available on
> the hard drive (do you have any additional CP2K output files from each
> respective rank?), the others that there is not enough RAM available.
> You can try to run CP2K with fewer MPI ranks and more OpenMP threads. This
> reduces the number of additional temporary output files and the memory
> footprint in RAM but increases the runtime.
> Best,
> Frederick
>
> On Wednesday, 20 November 2024 at 16:01:01 UTC+1, bartosz mazur wrote:
>
>> Hi Frederick,
>>
>> I am writing this as a follow-up to our previous discussions. I am
>> currently seeing a recurring problem with CP2K: tasks are killed after
>> about 10 days with errors like those in the attached outputs. This is not
>> particularly annoying, as a restart is sufficient and the simulation can
>> continue. Unfortunately, I don't think you will be able to reproduce this
>> error, given the very long simulation time. However, if there is anything
>> else I can provide to help understand the source of these problems, let me
>> know.
>>
>> Best
>> Bartosz
>>
>> On Monday, 28 October 2024 at 09:34:45 UTC+1, bartosz mazur wrote:
>>
>>> Many thanks Frederick for your help!
>>>
>>> On Friday, 25 October 2024 at 14:27:36 UTC+2, Frederick Stein wrote:
>>>
>>>> Regarding the other issues:
>>>> I can confirm them but cannot provide fixes for all of them because they
>>>> probably trigger bugs in ifort. Because ifort is already deprecated, these
>>>> bugs will probably not be fixed. Furthermore, we do not see any issues on
>>>> our Intel CI. I will fix what I can, but some of them will be left as we
>>>> focus our efforts on supporting the new ifx compiler.
>>>>
>>>> On Friday, 25 October 2024 at 11:46:00 UTC+2, Frederick Stein wrote:
>>>>
>>>>> Dear Bartosz,
>>>>> I will check the other issues with your regtests.
>>>>> Regarding your latest issue, please provide more information such as
>>>>> an output file or a hint on the context. If I am supposed to retry the
>>>>> calculation on my local machine, I need all additional input files such as
>>>>> your plumed file. I can run your input file up to the point that CP2K needs
>>>>> plumed.
>>>>> Best,
>>>>> Frederick
>>>>> On Friday, 25 October 2024 at 10:15:19 UTC+2, bartosz mazur wrote:
>>>>>
>>>>>> I just got another error with LibXSMM, now in my regular simulation
>>>>>> and without using OpenMP. This is the error:
>>>>>>
>>>>>> ```
>>>>>> [1729843139.920274] [r23c01b04:2913 :0] ib_md.c:295 UCX ERROR ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed: Cannot allocate memory
>>>>>> [1729843139.920290] [r23c01b04:2913 :0] ucp_mm.c:70 UCX ERROR failed to register address 0x14f0b46fc080 (host) length 7424 on md[4]=mlx5_0: Input/output error (md supports: host)
>>>>>>
>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)[1729843139.932647] [r23c01b04:2945 :0] ib_md.c:295 UCX ERROR ibv_reg_mr(address=0x1491f069e040, length=8128, access=0xf) failed: Cannot allocate memory
>>>>>> [1729843139.932660] [r23c01b04:2945 :0] ucp_mm.c:70 UCX ERROR failed to register address 0x1491f069e040 (host) length 8128 on md[4]=mlx5_0: Input/output error (md supports: host)
>>>>>>
>>>>>>
>>>>>> CLX/DP TRY JIT STA COL
>>>>>> 0..13 4 4 0 0
>>>>>> 14..23 4 4 0 0
>>>>>>
>>>>>> 24..64 0 0 0 0
>>>>>> Registry and code: 13 MB + 80 KB (gemm=8)
>>>>>> Command (PID=2913):
>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>>> cp2k.inp -o cp2k.out
>>>>>> Uptime: 407633.177169 s
>>>>>> ```
>>>>>>
>>>>>> and this is the simulation input I'm using:
>>>>>>
>>>>>> ```
>>>>>> &GLOBAL
>>>>>> PROJECT uam1o_npt_rms
>>>>>> RUN_TYPE MD
>>>>>> PRINT_LEVEL LOW
>>>>>> PREFERRED_DIAG_LIBRARY SCALAPACK
>>>>>> &END GLOBAL
>>>>>>
>>>>>> &FORCE_EVAL
>>>>>> METHOD QUICKSTEP
>>>>>> STRESS_TENSOR ANALYTICAL
>>>>>> &DFT
>>>>>> BASIS_SET_FILE_NAME BASIS_MOLOPT_UZH
>>>>>> POTENTIAL_FILE_NAME POTENTIAL_UZH
>>>>>> &MGRID
>>>>>> CUTOFF 500
>>>>>> &END MGRID
>>>>>> &XC
>>>>>> &XC_FUNCTIONAL PBE
>>>>>> &END XC_FUNCTIONAL
>>>>>> &VDW_POTENTIAL
>>>>>> POTENTIAL_TYPE PAIR_POTENTIAL
>>>>>> &PAIR_POTENTIAL
>>>>>> TYPE DFTD3(BJ)
>>>>>> PARAMETER_FILE_NAME dftd3.dat
>>>>>> REFERENCE_FUNCTIONAL PBE
>>>>>> R_CUTOFF 25.0
>>>>>> &END PAIR_POTENTIAL
>>>>>> &END VDW_POTENTIAL
>>>>>> &END XC
>>>>>> &END DFT
>>>>>>
>>>>>> &SUBSYS
>>>>>> &CELL
>>>>>> A 12.2807999 0.0000000 0.0000000
>>>>>> B 7.6258602 9.6257200 0.0000000
>>>>>> C -2.1557724 -1.0420258 18.0042801
>>>>>> &END CELL
>>>>>> &COORD
>>>>>> Zn 11.37811 4.60286 0.24515
>>>>>> Zn 8.15435 3.05288 8.74518
>>>>>> Zn 6.37590 3.97311 17.74650
>>>>>> Zn 9.59842 5.54014 9.24747
>>>>>> S 11.79344 6.72692 17.10850
>>>>>> S 4.06825 3.00573 9.90358
>>>>>> S 5.95830 1.84422 0.90027
>>>>>> S 13.67407 5.58944 8.10767
>>>>>> O 10.72408 3.58291 1.89315
>>>>>> O 8.51986 4.01962 1.53085
>>>>>> O 6.60135 3.91587 7.68572
>>>>>> O 7.74637 5.79259 8.21600
>>>>>> O 15.32810 8.58246 5.10041
>>>>>> O 9.35608 2.93551 7.09500
>>>>>> O 10.38999 4.93007 7.45977
>>>>>> O 11.66491 6.35111 1.31266
>>>>>> O 9.48582 6.62478 0.77364
>>>>>> O 2.59062 2.40094 3.91496
>>>>>> O 7.03031 4.99173 16.09885
>>>>>> O 9.23544 4.56122 16.46252
>>>>>> O 11.14602 4.67776 10.31440
>>>>>> O 10.00982 2.79915 9.77218
>>>>>> O 2.41388 0.01898 12.91899
>>>>>> O 8.39375 5.66143 10.89628
>>>>>> O 7.36998 3.66087 10.53589
>>>>>> O 6.08863 2.22161 16.68336
>>>>>> O 8.26988 1.95313 17.21650
>>>>>> O 15.16937 6.16381 14.09906
>>>>>> N 13.25907 3.80728 0.04001
>>>>>> N 2.36335 -0.74130 17.33402
>>>>>> N 7.60676 1.08576 8.95623
>>>>>> N 15.77729 5.75974 9.67861
>>>>>> N 4.49430 4.76652 17.95756
>>>>>> N 15.38873 9.31230 0.67467
>>>>>> N 10.14308 7.50848 9.04236
>>>>>> N 1.96529 2.83557 8.33233
>>>>>> C 6.76554 5.18292 7.68414
>>>>>> C 14.28210 4.11624 0.86006
>>>>>> C 9.47998 3.39622 2.09658
>>>>>> C 3.20112 3.42080 0.84626
>>>>>> C 9.91466 1.18589 3.17244
>>>>>> C 9.08210 2.29987 3.02657
>>>>>> C 5.74710 6.04945 7.01821
>>>>>> C 7.83265 2.30920 3.66005
>>>>>> C 3.35793 2.34328 -0.04029
>>>>>> C 4.51663 1.46385 -0.02755
>>>>>> C 16.24194 7.75266 5.73606
>>>>>> C 4.78940 5.52817 6.14198
>>>>>> C 7.40810 1.21174 4.39947
>>>>>> C 16.18016 6.38244 5.49010
>>>>>> C 9.48869 0.06986 3.88005
>>>>>> C 11.27238 1.77457 17.14330
>>>>>> C 5.77166 7.43009 7.27236
>>>>>> C 11.14819 8.24901 17.58588
>>>>>> C 8.22170 0.08058 4.47135
>>>>>> C 0.15087 1.02286 17.07544
>>>>>> C 17.16180 8.28565 6.64351
>>>>>> C 10.57067 7.01060 1.31282
>>>>>> C 6.72654 0.47459 8.14002
>>>>>> C 10.27972 3.79035 6.89470
>>>>>> C 14.15006 8.72843 8.15880
>>>>>> C 11.73751 2.06868 5.82537
>>>>>> C 11.38838 3.41515 5.96966
>>>>>> C 10.52304 8.34339 1.98566
>>>>>> C 12.16584 4.39562 5.33967
>>>>>> C 14.89762 7.93801 9.04648
>>>>>> C 14.86698 6.48365 9.03575
>>>>>> C 2.67167 1.17044 3.27681
>>>>>> C 11.52468 8.76552 2.86608
>>>>>> C 13.29140 4.04007 4.60622
>>>>>> C 3.78230 0.36534 3.52266
>>>>>> C 12.87823 1.70260 5.12344
>>>>>> C 8.27761 0.34001 9.85941
>>>>>> C 9.42677 9.18364 1.73295
>>>>>> C 3.27553 4.45658 9.42657
>>>>>> C 13.66559 2.69775 4.53650
>>>>>> C 15.77023 8.59069 9.93240
>>>>>> C 1.68356 0.78491 2.36643
>>>>>> C 10.98451 3.41041 10.31327
>>>>>> C 3.46873 4.45681 17.14097
>>>>>> C 8.27403 5.18373 15.89814
>>>>>> C 14.54907 5.15099 17.15930
>>>>>> C 7.83119 7.39584 14.82858
>>>>>> C 8.66916 6.28563 14.97331
>>>>>> C 11.99928 2.54577 10.98702
>>>>>> C 9.92072 6.28547 14.34388
>>>>>> C 16.54982 7.26986 0.04271
>>>>>> C 15.39103 8.14919 0.03189
>>>>>> C 1.50023 0.84646 12.27989
>>>>>> C 12.95126 3.06908 11.86817
>>>>>> C 10.34198 7.38826 13.61070
>>>>>> C 1.55836 2.21699 12.52561
>>>>>> C 8.25354 8.51697 14.12666
>>>>>> C 6.48249 6.79770 0.85630
>>>>>> C 11.97760 1.16465 10.73446
>>>>>> C 6.60385 0.32218 0.42301
>>>>>> C 9.52282 8.51550 13.54043
>>>>>> C 17.60321 7.54791 0.92891
>>>>>> C 0.58530 0.31102 11.36884
>>>>>> C 7.18362 1.56332 16.68291
>>>>>> C 11.01926 8.11905 9.86341
>>>>>> C 7.47582 4.80132 11.10039
>>>>>> C 3.59282 -0.13430 9.84955
>>>>>> C 6.01179 6.51430 12.17471
>>>>>> C 6.36853 5.17005 12.02942
>>>>>> C 7.23131 0.22715 16.01652
>>>>>> C 5.59963 4.18477 12.66234
>>>>>> C 2.84614 0.65728 8.96213
>>>>>> C 2.87561 2.11161 8.97508
>>>>>> C 15.08536 7.39548 14.73440
>>>>>> C 6.23001 -0.19920 15.13769
>>>>>> C 4.47482 4.53325 13.40042
>>>>>> C 13.97400 8.19851 14.48576
>>>>>> C 4.87173 6.87322 12.88120
>>>>>> C 9.47231 8.25578 8.14046
>>>>>> C 8.32790 -0.61137 16.27301
>>>>>> C 14.46698 4.13864 8.58475
>>>>>> C 4.09294 5.87331 13.47165
>>>>>> C 1.97640 0.00563 8.07267
>>>>>> C 16.07240 7.78504 15.64417
>>>>>> H 14.10215 4.93465 1.55678
>>>>>> H 3.98110 3.68721 1.55899
>>>>>> H 10.89072 1.19647 2.69205
>>>>>> H 7.19958 3.19021 3.56839
>>>>>> H 4.75923 4.45384 5.96230
>>>>>> H 6.45299 1.21835 4.92062
>>>>>> H 15.44211 6.00062 4.78824
>>>>>> H 17.75043 8.81610 3.97156
>>>>>> H 10.41563 1.57993 16.49923
>>>>>> H 6.49332 7.81303 7.99143
>>>>>> H 0.24800 0.19739 16.37425
>>>>>> H 9.53586 -0.26872 6.84508
>>>>>> H 6.19685 1.12218 7.44173
>>>>>> H 13.45550 8.28133 7.44815
>>>>>> H 11.11633 1.31384 6.30260
>>>>>> H 11.87413 5.44074 5.42962
>>>>>> H 12.38442 8.12016 3.04474
>>>>>> H 13.88694 4.78876 4.08791
>>>>>> H 4.53915 0.70283 4.22717
>>>>>> H 0.88557 0.65625 5.03328
>>>>>> H 8.96418 0.89159 10.50060
>>>>>> H 8.67994 8.85961 1.01083
>>>>>> H 16.35704 8.00331 10.63471
>>>>>> H 13.12606 1.45212 2.16563
>>>>>> H 3.64702 3.63930 16.44281
>>>>>> H 13.76743 4.88477 16.44833
>>>>>> H 6.85355 7.37827 15.30535
>>>>>> H 10.55820 5.40745 14.43410
>>>>>> H 12.97886 4.14375 12.04672
>>>>>> H 11.29905 7.38966 13.09313
>>>>>> H 2.29216 2.60091 13.23073
>>>>>> H -0.01303 -0.23279 14.03603
>>>>>> H 7.34113 6.99275 1.49776
>>>>>> H 11.26049 0.78023 10.01184
>>>>>> H 17.50743 8.37258 1.63130
>>>>>> H 8.21398 8.86531 11.16822
>>>>>> H 11.54834 7.47018 10.56097
>>>>>> H 4.28503 0.31205 10.56295
>>>>>> H 6.62643 7.27289 11.69479
>>>>>> H 5.89748 3.14154 12.57118
>>>>>> H 5.36986 0.44461 14.95599
>>>>>> H 3.88656 3.78035 13.92095
>>>>>> H 13.21826 7.85764 13.78163
>>>>>> H 16.85773 7.91771 12.97237
>>>>>> H 8.78884 7.70469 7.49554
>>>>>> H 9.07452 -0.28399 16.99402
>>>>>> H 1.39009 0.59398 7.37083
>>>>>> H 4.63062 7.11938 15.84758
>>>>>> &END COORD
>>>>>> &KIND Zn
>>>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q12
>>>>>> POTENTIAL GTH-PBE-q12
>>>>>> &END KIND
>>>>>> &KIND S
>>>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
>>>>>> POTENTIAL GTH-PBE-q6
>>>>>> &END KIND
>>>>>> &KIND O
>>>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
>>>>>> POTENTIAL GTH-PBE-q6
>>>>>> &END KIND
>>>>>> &KIND N
>>>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q5
>>>>>> POTENTIAL GTH-PBE-q5
>>>>>> &END KIND
>>>>>> &KIND C
>>>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q4
>>>>>> POTENTIAL GTH-PBE-q4
>>>>>> &END KIND
>>>>>> &KIND H
>>>>>> BASIS_SET TZVP-MOLOPT-PBE-GTH-q1
>>>>>> POTENTIAL GTH-PBE-q1
>>>>>> &END KIND
>>>>>> &END SUBSYS
>>>>>> &END FORCE_EVAL
>>>>>>
>>>>>> &MOTION
>>>>>> &MD
>>>>>> ENSEMBLE NPT_I
>>>>>> TEMPERATURE 298
>>>>>> TIMESTEP 1.0
>>>>>> STEPS 50000
>>>>>> &THERMOSTAT
>>>>>> TYPE NOSE
>>>>>> &NOSE
>>>>>> LENGTH 3
>>>>>> YOSHIDA 3
>>>>>> TIMECON 1000
>>>>>> &END NOSE
>>>>>> &END THERMOSTAT
>>>>>> &BAROSTAT
>>>>>> PRESSURE 1.0
>>>>>> TIMECON 4000
>>>>>> &END BAROSTAT
>>>>>> &END MD
>>>>>> &FREE_ENERGY
>>>>>> METHOD METADYN
>>>>>> &METADYN
>>>>>> USE_PLUMED .TRUE.
>>>>>> PLUMED_INPUT_FILE plumed.dat
>>>>>> &END METADYN
>>>>>> &END FREE_ENERGY
>>>>>> &PRINT
>>>>>> &TRAJECTORY
>>>>>> &EACH
>>>>>> MD 5
>>>>>> &END EACH
>>>>>> &END TRAJECTORY
>>>>>> &FORCES
>>>>>> UNIT eV*angstrom^-1
>>>>>> &EACH
>>>>>> MD 5
>>>>>> &END EACH
>>>>>> &END FORCES
>>>>>> &CELL
>>>>>> &EACH
>>>>>> MD 5
>>>>>> &END EACH
>>>>>> &END CELL
>>>>>> &END PRINT
>>>>>> &END MOTION
>>>>>> ```
>>>>>>
>>>>>> This simulation was performed with the previous version of CP2K (so
>>>>>> without your fix).
>>>>>> On Friday, 25 October 2024 at 09:50:47 UTC+2, bartosz mazur wrote:
>>>>>>
>>>>>>> Hi Frederick,
>>>>>>>
>>>>>>> it helped with most of the tests! Now only 13 fail. In the
>>>>>>> attachments you will find the full output from the regtests, and here
>>>>>>> is the output from a single job with TRACE enabled:
>>>>>>>
>>>>>>> ```
>>>>>>> Loading intel/2024a
>>>>>>> Loading requirement: GCCcore/13.3.0 zlib/1.3.1-GCCcore-13.3.0
>>>>>>> binutils/2.42-GCCcore-13.3.0 intel-compilers/2024.2.0
>>>>>>> numactl/2.0.18-GCCcore-13.3.0 UCX/1.16.0-GCCcore-13.3.0
>>>>>>> impi/2021.13.0-intel-compilers-2024.2.0 imkl/2024.2.0 iimpi/2024a
>>>>>>> imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>>>
>>>>>>> Currently Loaded Modulefiles:
>>>>>>> 1) GCCcore/13.3.0
>>>>>>> 2) zlib/1.3.1-GCCcore-13.3.0
>>>>>>> 3) binutils/2.42-GCCcore-13.3.0
>>>>>>> 4) intel-compilers/2024.2.0
>>>>>>> 5) numactl/2.0.18-GCCcore-13.3.0
>>>>>>> 6) UCX/1.16.0-GCCcore-13.3.0
>>>>>>> 7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>>> 8) imkl/2024.2.0
>>>>>>> 9) iimpi/2024a
>>>>>>> 10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>>> 11) intel/2024a
>>>>>>> 2 MPI processes with 2 OpenMP threads each
>>>>>>> started at Fri Oct 25 09:34:34 CEST 2024 in /lustre/tmp/slurm/3127182
>>>>>>> SIRIUS 7.6.1, git hash:
>>>>>>> https://api.github.com/repos/electronic-structure/SIRIUS/git/ref/tags/v7.6.1
>>>>>>> Warning! Compiled in 'debug' mode with assert statements enabled!
>>>>>>>
>>>>>>>
>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>> 0..13 8 8 0 0
>>>>>>> 14..23 0 0 0 0
>>>>>>> 24..64 0 0 0 0
>>>>>>> Registry and code: 13 MB + 64 KB (gemm=8)
>>>>>>> Command (PID=423503):
>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>>>> dftd3src1.inp -o dftd3src1.out
>>>>>>> Uptime: 2.752513 s
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>> = RANK 0 PID 423503 RUNNING AT r21c01b03
>>>>>>>
>>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>>
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>> = RANK 1 PID 423504 RUNNING AT r21c01b03
>>>>>>>
>>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> finished at Fri Oct 25 09:34:39 CEST 2024
>>>>>>> ```
>>>>>>>
>>>>>>> and the last lines:
>>>>>>>
>>>>>>> ```
>>>>>>> 000000:000002<< 13 3 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 13 4 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 13 4 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 2 pw_nn_compose_r 0.003 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 1 xc_pw_derive 0.003 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 5 pw_zero start Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 5 pw_zero 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 2 xc_pw_derive start Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 3 pw_nn_compose_r start Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 13 5 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 13 5 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 13 6 mp_sendrecv_dm2 start Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 13 6 mp_sendrecv_dm2 0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 3 pw_nn_compose_r 0.002 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 2 xc_pw_derive 0.002 Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 6 pw_zero start Hostmem: 955 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 6 pw_zero 0.001 Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 3 xc_pw_derive start Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 4 pw_nn_compose_r start Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 13 7 mp_sendrecv_dm2 start Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 13 7 mp_sendrecv_dm2 0.000 Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 13 8 mp_sendrecv_dm2 start Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 13 8 mp_sendrecv_dm2 0.000 Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 4 pw_nn_compose_r 0.002 Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 3 xc_pw_derive 0.002 Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 1 pw_spline_scale_deriv start Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 1 pw_spline_scale_deriv 0.001 Hostmem: 960 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 20 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 20 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 21 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 21 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 22 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 22 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 23 pw_pool_give_back_pw start Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 23 pw_pool_give_back_pw 0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 11 1 xc_functional_eval start Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 12 1 b97_lda_eval start Hostmem: 965 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 12 1 b97_lda_eval 0.103 Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 11 1 xc_functional_eval 0.103 Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 1 xc_rho_set_and_dset_create 0.120 Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 1 check_for_derivatives start Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 1 check_for_derivatives 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 14 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 14 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 15 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 15 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 16 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 16 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002>> 10 17 pw_create_r3d start Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> 000000:000002<< 10 17 pw_create_r3d 0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>>> ```
>>>>>>>
>>>>>>> Best
>>>>>>> Bartosz
>>>>>>>
>>>>>>> On Wednesday, 23 October 2024 at 09:15:33 UTC+2, Frederick Stein wrote:
>>>>>>>
>>>>>>>> Dear Bartosz,
>>>>>>>> My fix is merged. Can you switch to the CP2K master and try it
>>>>>>>> again? We are still working on a few issues with the Intel compilers so
>>>>>>>> that we can eventually migrate from ifort to ifx.
>>>>>>>> Best,
>>>>>>>> Frederick
>>>>>>>>
>>>>>>>> On Tuesday, 22 October 2024 at 17:45:21 UTC+2, bartosz mazur wrote:
>>>>>>>>
>>>>>>>>> Great! Thank you for your help.
>>>>>>>>>
>>>>>>>>> Best
>>>>>>>>> Bartosz
>>>>>>>>>
>>>>>>>>> On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:
>>>>>>>>>
>>>>>>>>>> I have a fix for it. In contrast to my first thought, it is a
>>>>>>>>>> case of invalid type conversion from real to complex numbers (yes, Fortran
>>>>>>>>>> is rather strict about it) in pw_derive. This may also be present in a few
>>>>>>>>>> other spots. I am currently running more tests and I will open a pull
>>>>>>>>>> request within the next few days.
>>>>>>>>>> Best,
>>>>>>>>>> Frederick
>>>>>>>>>>
>>>>>>>>>> On Tuesday, 22 October 2024 at 13:12:49 UTC+2, Frederick Stein wrote:
>>>>>>>>>>
>>>>>>>>>>> I can reproduce the error locally. I am investigating it now.
>>>>>>>>>>>
>>>>>>>>>>> On Tuesday, 22 October 2024 at 11:58:57 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I was loading it as it was needed for compilation. I have
>>>>>>>>>>>> unloaded the module, but the error still occurs:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>>>>>> 0..13 2 2 0 0
>>>>>>>>>>>> 14..23 0 0 0 0
>>>>>>>>>>>> 24..64 0 0 0 0
>>>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>>>> Command (PID=15485):
>>>>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i
>>>>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>>>>> Uptime: 1.757102 s
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>> = RANK 0 PID 15485 RUNNING AT r30c01b01
>>>>>>>>>>>>
>>>>>>>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>>>>
>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>> = RANK 1 PID 15486 RUNNING AT r30c01b01
>>>>>>>>>>>>
>>>>>>>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>>>>
>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>> ```
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> and the last 100 lines:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> 000000:000002>> 11 37 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 37 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 10 64 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 10 25 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 10 25 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 10 17 pw_axpy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 10 17 pw_axpy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 10 19 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 10 19 mp_sum_d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 10 3 pw_poisson_solve start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 3 pw_poisson_rebuild start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 3 pw_poisson_rebuild 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 65 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 38 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 38 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 65 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 26 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 26 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 3 pw_multiply_with start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 3 pw_multiply_with 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 27 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 27 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 3 pw_integral_ab start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 20 mp_sum_d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 20 mp_sum_d 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 11 3 pw_integral_ab 0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 11 4 pw_poisson_set start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 66 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 39 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 39 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 66 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 28 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 28 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 7 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 7 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 67 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 40 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 40 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 67 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 29 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 29 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 8 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 8 pw_derive 0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 68 pw_pool_create_pw start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 13 41 pw_create_c1d start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 13 41 pw_create_c1d 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 68 pw_pool_create_pw 0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 30 pw_copy start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002<< 12 30 pw_copy 0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> 000000:000002>> 12 9 pw_derive start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>>> ```
>>>>>>>>>>>>
>>>>>>>>>>>> This is the list of currently loaded modules (all come with
>>>>>>>>>>>> intel):
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> Currently Loaded Modulefiles:
>>>>>>>>>>>> 1) GCCcore/13.3.0
>>>>>>>>>>>> 2) zlib/1.3.1-GCCcore-13.3.0
>>>>>>>>>>>> 3) binutils/2.42-GCCcore-13.3.0
>>>>>>>>>>>> 4) intel-compilers/2024.2.0
>>>>>>>>>>>> 5) numactl/2.0.18-GCCcore-13.3.0
>>>>>>>>>>>> 6) UCX/1.16.0-GCCcore-13.3.0
>>>>>>>>>>>> 7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>>>>>>>> 8) imkl/2024.2.0
>>>>>>>>>>>> 9) iimpi/2024a
>>>>>>>>>>>> 10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>>>>>>>> 11) intel/2024a
>>>>>>>>>>>> ```
>>>>>>>>>>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>> I am currently running some tests with the latest Intel
>>>>>>>>>>>>> compiler myself. What bothers me about your setup is the module
>>>>>>>>>>>>> GCC13/13.3.0 . Why is it loaded? Can you unload it? This would at least
>>>>>>>>>>>>> reduce potential interference between the Intel and the GCC compilers.
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Monday, 21 October 2024 at 16:33:45 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The error for ssmp is:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>>>>>>>> 0..13 4 4 0 0
>>>>>>>>>>>>>> 14..23 0 0 0 0
>>>>>>>>>>>>>> 24..64 0 0 0 0
>>>>>>>>>>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>>>>>>>>>>> Command (PID=54845):
>>>>>>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>>>>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>>>>>>> Uptime: 2.861583 s
>>>>>>>>>>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36:
>>>>>>>>>>>>>> 54845 Segmentation fault (core dumped)
>>>>>>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i
>>>>>>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and the last 100 lines of output:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> 000000:000001>> 12 20 mp_sum_d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 20 mp_sum_d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 11 13 dbcsr_dot_sd 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 10 12 calculate_ptrace_kp 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 9 6 evaluate_core_matrix_traces 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 9 6 rebuild_ks_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 10 6 qs_ks_build_kohn_sham_matrix start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 11 140 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 79 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 79 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 11 140 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 11 141 pw_pool_create_pw start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 80 pw_create_c1d start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 80 pw_create_c1d 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 11 141 pw_pool_create_pw 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 11 61 pw_copy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 11 61 pw_copy 0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 11 35 pw_axpy start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 11 35 pw_axpy 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 11 6 pw_poisson_solve start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 6 pw_poisson_rebuild start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 6 pw_poisson_rebuild 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 142
>>>>>>>>>>>>>> pw_pool_create_pw
>>>>>>>>>>>>>> start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 13 81
>>>>>>>>>>>>>> pw_create_c1d
>>>>>>>>>>>>>> start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 13 81
>>>>>>>>>>>>>> pw_create_c1d
>>>>>>>>>>>>>> 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 142
>>>>>>>>>>>>>> pw_pool_create_pw
>>>>>>>>>>>>>> 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 62
>>>>>>>>>>>>>> pw_copy start Hos
>>>>>>>>>>>>>> tmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 62
>>>>>>>>>>>>>> pw_copy 0.003 Hos
>>>>>>>>>>>>>> tmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 6
>>>>>>>>>>>>>> pw_multiply_with
>>>>>>>>>>>>>> start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 6
>>>>>>>>>>>>>> pw_multiply_with
>>>>>>>>>>>>>> 0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 63
>>>>>>>>>>>>>> pw_copy start Hos
>>>>>>>>>>>>>> tmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 63
>>>>>>>>>>>>>> pw_copy 0.003 Hos
>>>>>>>>>>>>>> tmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 6
>>>>>>>>>>>>>> pw_integral_ab st
>>>>>>>>>>>>>> art Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 12 6
>>>>>>>>>>>>>> pw_integral_ab 0.
>>>>>>>>>>>>>> 005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 12 7
>>>>>>>>>>>>>> pw_poisson_set st
>>>>>>>>>>>>>> art Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 13 143
>>>>>>>>>>>>>> pw_pool_create_pw
>>>>>>>>>>>>>> start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 14
>>>>>>>>>>>>>> 82 pw_create_c1d
>>>>>>>>>>>>>> start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 14
>>>>>>>>>>>>>> 82 pw_create_c1d
>>>>>>>>>>>>>> 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 13 143
>>>>>>>>>>>>>> pw_pool_create_pw
>>>>>>>>>>>>>> 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 13 64
>>>>>>>>>>>>>> pw_copy start
>>>>>>>>>>>>>> Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 13 64
>>>>>>>>>>>>>> pw_copy 0.003
>>>>>>>>>>>>>> Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 13 16
>>>>>>>>>>>>>> pw_derive star
>>>>>>>>>>>>>> t Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 13 16
>>>>>>>>>>>>>> pw_derive 0.00
>>>>>>>>>>>>>> 6 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 13 144
>>>>>>>>>>>>>> pw_pool_create_pw
>>>>>>>>>>>>>> start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 14
>>>>>>>>>>>>>> 83 pw_create_c1d
>>>>>>>>>>>>>> start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 14
>>>>>>>>>>>>>> 83 pw_create_c1d
>>>>>>>>>>>>>> 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 13 144
>>>>>>>>>>>>>> pw_pool_create_pw
>>>>>>>>>>>>>> 0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 13 65
>>>>>>>>>>>>>> pw_copy start
>>>>>>>>>>>>>> Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001<< 13 65
>>>>>>>>>>>>>> pw_copy 0.004
>>>>>>>>>>>>>> Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000001>> 13 17
>>>>>>>>>>>>>> pw_derive star
>>>>>>>>>>>>>> t Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>>> ```
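
One way to read a trace like the one above is to ask which routines were entered (`>>`) but never left (`<<`) before the output stops. The sketch below does that for a trace given as one record per line. It assumes that the records wrapped in this mail are single lines in the real output file, and that the columns are a process tag, the direction marker, what appears to be the call-stack depth, a call count, the routine name, and then either `start` or an elapsed time. The name `open_calls` and the regular expression are illustrative and not part of CP2K.

```python
import re

# One trace record per line (assumed layout), e.g.
# 000000:000001>> 13 17 pw_derive start Hostmem: 380 MB GPUmem: 0 MB
RECORD = re.compile(
    r"^\s*\d+:\d+(?P<dir>[<>]{2})\s+(?P<depth>\d+)\s+(?P<count>\d+)\s+(?P<routine>\S+)"
)


def open_calls(trace_lines):
    """Return (depth, routine) pairs entered with '>>' but never closed with '<<'."""
    stack = []
    for line in trace_lines:
        match = RECORD.match(line)
        if match is None:
            continue  # not a trace record
        frame = (int(match["depth"]), match["routine"])
        if match["dir"] == ">>":
            stack.append(frame)
        elif stack and stack[-1] == frame:
            stack.pop()
        # '<<' records whose '>>' lies before the excerpt are ignored
    return stack


if __name__ == "__main__":
    excerpt = [
        "000000:000001>> 12 7 pw_poisson_set start Hostmem: 380 MB GPUmem: 0 MB",
        "000000:000001>> 13 16 pw_derive start Hostmem: 380 MB GPUmem: 0 MB",
        "000000:000001<< 13 16 pw_derive 0.006 Hostmem: 380 MB GPUmem: 0 MB",
        "000000:000001>> 13 17 pw_derive start Hostmem: 380 MB GPUmem: 0 MB",
    ]
    for depth, routine in open_calls(excerpt):
        print(depth, routine)
```

Applied to the ssmp excerpt, this leaves rebuild_ks_matrix, qs_ks_build_kohn_sham_matrix, pw_poisson_solve, pw_poisson_set and the 17th call of pw_derive open. That only shows where the last trace record was written, not necessarily where the fault occurred.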
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For psmp, the last 100 lines are:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> 000000:000002<< 9 7 evaluate_core_matrix_traces 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 9 7 rebuild_ks_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 10 7 qs_ks_build_kohn_sham_matrix start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 11 164 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 93 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 12 93 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 11 164 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 11 165 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 94 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 12 94 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 11 165 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 11 73 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 11 73 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 11 41 pw_axpy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 11 41 pw_axpy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 11 52 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 11 52 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 11 7 pw_poisson_solve start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 7 pw_poisson_rebuild start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 12 7 pw_poisson_rebuild 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 166 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 95 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 95 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 12 166 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 74 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 12 74 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 7 pw_multiply_with start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 12 7 pw_multiply_with 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 75 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 12 75 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 7 pw_integral_ab start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 53 mp_sum_d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 53 mp_sum_d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 12 7 pw_integral_ab 0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 12 8 pw_poisson_set start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 167 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 14 96 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 14 96 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 167 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 76 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick
>>>>>>>>>>>>>> Stein wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>> I have no idea about the issue with LibXSMM.
>>>>>>>>>>>>>>> Regarding the trace, I do not know either, as there is not
>>>>>>>>>>>>>>> much that could break in pw_derive (it just performs multiplications) and
>>>>>>>>>>>>>>> the sequence of operations is too unspecific. It may be that the code
>>>>>>>>>>>>>>> actually breaks somewhere else. Can you do the same with the ssmp version and post
>>>>>>>>>>>>>>> the last 100 lines? This way, we remove the asynchronicity issues for
>>>>>>>>>>>>>>> backtraces with the psmp version.
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> bartosz mazur wrote on Sunday, 20 October 2024 at
>>>>>>>>>>>>>>> 16:47:15 UTC+2:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The error is:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>>>>>>> CLX/DP TRY JIT STA COL
>>>>>>>>>>>>>>>> 0..13 2 2 0 0
>>>>>>>>>>>>>>>> 14..23 0 0 0 0
>>>>>>>>>>>>>>>> 24..64 0 0 0 0
>>>>>>>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>>>>>>>> Command (PID=2607388): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>>>>>>>>> Uptime: 5.288243 s
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>>>>>> = RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>>>>>>>>>> = KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>>>>>> = RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>>>>>>>>>>> = KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and the last 20 lines:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>> 000000:000002<< 13 76 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002>> 13 19 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002<< 13 19 pw_derive 0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002>> 13 168 pw_pool_create_pw start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002>> 14 97 pw_create_c1d start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002<< 14 97 pw_create_c1d 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002<< 13 168 pw_pool_create_pw 0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002>> 13 77 pw_copy start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002<< 13 77 pw_copy 0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> 000000:000002>> 13 20 pw_derive start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick
>>>>>>>>>>>>>>>> Stein wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please pick one of the failing tests. Then add the TRACE
>>>>>>>>>>>>>>>>> keyword to the &GLOBAL section and run the test manually. This
>>>>>>>>>>>>>>>>> increases the size of the output file dramatically (to several million lines).
>>>>>>>>>>>>>>>>> Can you send me the last ~20 lines of the output?
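
For reference, the two steps suggested above (switch on TRACE in &GLOBAL, then keep only the tail of the output) could be scripted roughly as follows. This is a minimal sketch: it assumes the input file contains a `&GLOBAL` ... `&END GLOBAL` section written on separate lines, and the helper names `enable_trace` and `last_lines` as well as the sample project name are made up for illustration.

```python
from collections import deque


def enable_trace(input_text):
    """Return CP2K input text with a TRACE line inside the &GLOBAL section.

    The text is returned unchanged if a TRACE line is already present there
    (whatever its value).
    """
    out = []
    in_global = False
    has_trace = False
    for line in input_text.splitlines():
        stripped = line.strip().upper()
        if stripped.startswith("&GLOBAL"):
            in_global = True
        elif in_global and stripped.startswith("&END"):
            if not has_trace:
                out.append("  TRACE")  # insert just before the section closes
            in_global = False
        elif in_global and stripped.split()[:1] == ["TRACE"]:
            has_trace = True
        out.append(line)
    return "\n".join(out) + "\n"


def last_lines(lines, n=20):
    """Return the last n items of an iterable of lines, e.g. an open file."""
    return list(deque(lines, maxlen=n))


if __name__ == "__main__":
    sample = "&GLOBAL\n  PROJECT demo\n  RUN_TYPE ENERGY\n&END GLOBAL\n"
    print(enable_trace(sample), end="")
```

Because the traced output can reach millions of lines, `last_lines` keeps only a bounded window in memory instead of reading the whole file into a list.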
>>>>>>>>>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at
>>>>>>>>>>>>>>>>> 17:09:40 UTC+2:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but
>>>>>>>>>>>>>>>>>> I assume it makes no difference. As I mentioned in my previous message, with
>>>>>>>>>>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp
>>>>>>>>>>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the same
>>>>>>>>>>>>>>>>>> setting; I provide example output as an attachment.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick
>>>>>>>>>>>>>>>>>> Stein wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>>>>>> What happens if you set the number of OpenMP threads to
>>>>>>>>>>>>>>>>>>> 1 (add '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of
>>>>>>>>>>>>>>>>>>> the ssmp?
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at
>>>>>>>>>>>>>>>>>>> 15:37:43 UTC+2:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> thanks again for your help. I have tested different
>>>>>>>>>>>>>>>>>>>> simulation variants and now know that the problem occurs when using OMP. For
>>>>>>>>>>>>>>>>>>>> MPI calculations without OMP, all tests pass. I have also tested the effect
>>>>>>>>>>>>>>>>>>>> of the `OMP_PROC_BIND` and `OMP_PLACES` parameters;
>>>>>>>>>>>>>>>>>>>> apart from their effect on simulation time, they have no significant effect
>>>>>>>>>>>>>>>>>>>> on the presence of errors. Below are the results for ssmp:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
>>>>>>>>>>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>>>>>>>>>>> ```
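
To compare the ssmp rows above more easily, the failed counts can be turned into percentages of the 4144 tests. The sketch below copies the rows as posted and sorts them by failure rate; the function name `failure_rates` and the output layout are illustrative choices only.

```python
import csv
import io

# ssmp rows as posted: OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
SSMP_RESULTS = """\
spread, threads, 3850, 4144, 4, 290, 186min
spread, cores, 3831, 4144, 3, 310, 183min
spread, sockets, 3864, 4144, 3, 277, 104min
close, threads, 3879, 4144, 3, 262, 171min
close, cores, 3854, 4144, 0, 290, 168min
close, sockets, 3865, 4144, 3, 276, 104min
master, threads, 4121, 4144, 0, 23, 1002min
master, cores, 4121, 4144, 0, 23, 986min
master, sockets, 3942, 4144, 3, 199, 219min
false, threads, 3918, 4144, 0, 226, 178min
false, cores, 3919, 4144, 3, 222, 176min
false, sockets, 3856, 4144, 4, 284, 104min
"""


def failure_rates(table_text):
    """Return (bind, places, percent_failed, minutes) tuples, lowest rate first."""
    rows = []
    reader = csv.reader(io.StringIO(table_text), skipinitialspace=True)
    for bind, places, _correct, total, _wrong, failed, time in reader:
        percent = 100.0 * int(failed) / int(total)
        minutes = int(time.replace("min", ""))
        rows.append((bind, places, percent, minutes))
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    for bind, places, percent, minutes in failure_rates(SSMP_RESULTS):
        print(f"{bind:>7} {places:<8} {percent:5.2f} % failed  {minutes:5d} min")
```

Read this way, the `master` rows with `threads` or `cores` stand out: 23 of 4144 tests fail (about 0.6 %), against roughly 4.8 to 7.5 % for every other row, at the price of a run time near 1000 minutes.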
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> and psmp:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>>>>>>>>>>> ```
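
The twelve `OMP_PROC_BIND` / `OMP_PLACES` combinations listed for ssmp can be generated rather than typed by hand. The sketch below only builds the environment for each run. How the variables reach the test runner is not shown in the thread, so the thread count of 2 and the commented-out `test_command` are assumptions.

```python
import itertools
import os

PROC_BIND = ("spread", "close", "master", "false")
PLACES = ("threads", "cores", "sockets")


def omp_environments(num_threads=2, base_env=None):
    """Yield one environment dict per OMP_PROC_BIND / OMP_PLACES combination."""
    base = dict(os.environ if base_env is None else base_env)
    for bind, places in itertools.product(PROC_BIND, PLACES):
        env = dict(base)
        env["OMP_NUM_THREADS"] = str(num_threads)  # assumed thread count
        env["OMP_PROC_BIND"] = bind
        env["OMP_PLACES"] = places
        yield env


if __name__ == "__main__":
    for env in omp_environments():
        print(env["OMP_PROC_BIND"], env["OMP_PLACES"])
        # subprocess.run(test_command, env=env) would go here;
        # test_command is hypothetical and not taken from the thread.
```

Each dict is a copy of the base environment, so one run's settings cannot leak into the next.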
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Any ideas what I could do next to have more information
>>>>>>>>>>>>>>>>>>>> about the source of the problem or maybe you see a potential solution at
>>>>>>>>>>>>>>>>>>>> this stage? I would appreciate any further help.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick
>>>>>>>>>>>>>>>>>>>> Stein wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The
>>>>>>>>>>>>>>>>>>>>> tests do not run that efficiently with such a large number of threads; 2
>>>>>>>>>>>>>>>>>>>>> should be sufficient.
>>>>>>>>>>>>>>>>>>>>> The test result suggests that most of the
>>>>>>>>>>>>>>>>>>>>> functionality may work, but without a backtrace (or similar
>>>>>>>>>>>>>>>>>>>>> information) it is hard to tell why the remaining tests fail. You could also try to run
>>>>>>>>>>>>>>>>>>>>> some of the single-node tests to assess the stability of CP2K.
>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> bartosz mazur wrote on Friday, 11 October 2024 at
>>>>>>>>>>>>>>>>>>>>> 13:48:42 UTC+2:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run_cp2k.job
Type: application/x-shellscript
Size: 687 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/74ee809d/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plumed.dat
Type: application/octet-stream
Size: 664 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/74ee809d/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cp2k.inp
Type: chemical/x-gamess-input
Size: 10406 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/74ee809d/attachment-0001.inp>