[CP2K-user] [CP2K:20903] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types

bartosz mazur bamaz.97 at gmail.com
Wed Nov 20 14:58:36 UTC 2024


Hi Frederick, 

I am writing to follow up on our previous discussion. I keep seeing a 
recurring problem with CP2K: jobs are killed after about 10 days of running, 
with errors like those in the attached outputs. It is not a serious 
nuisance, since a restart is enough for the simulation to continue. 
Unfortunately, I doubt you will be able to reproduce the error, given how 
long the simulation has to run before it appears. If there is anything else 
I can provide to help track down the source of these problems, please let 
me know. 
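
For comparing such outputs across runs, here is a minimal sketch of how the 
LIBXSMM uptime and the UCX memory-registration failures could be pulled out 
of an output file. The regular expressions are modelled on the log excerpt 
quoted further down in this thread and are assumptions, not an official UCX 
or LIBXSMM log format; the helper name summarize_failure is hypothetical.

```python
import re

# Patterns modelled on the log excerpt quoted in this thread (assumption:
# other UCX/LIBXSMM versions may print these records differently).
UPTIME_RE = re.compile(r"Uptime:\s*([0-9.]+)\s*s")
REG_FAIL_RE = re.compile(r"ibv_reg_mr\(address=(0x[0-9a-f]+),\s*length=(\d+)")


def summarize_failure(log_text):
    """Return (uptime_in_days, [(address, length), ...]) for one output."""
    # Mail clients may wrap a record across lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    match = UPTIME_RE.search(flat)
    uptime_days = float(match.group(1)) / 86400.0 if match else None
    failures = [(addr, int(length)) for addr, length in REG_FAIL_RE.findall(flat)]
    return uptime_days, failures


# Sample taken from the error output quoted below in this thread.
sample = """
[1729843139.920274] [r23c01b04:2913 :0] ib_md.c:295 UCX ERROR
ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed:
Cannot allocate memory
Uptime: 407633.177169 s
"""

days, failures = summarize_failure(sample)
print(f"uptime: {days:.2f} days, failed registrations: {len(failures)}")
```

Running this over each killed job's output would show whether the failures 
cluster around a similar uptime, which may help narrow down the cause.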

Best
Bartosz

On Monday, 28 October 2024 at 09:34:45 UTC+1, bartosz mazur wrote:

> Many thanks Frederick for your help! 
>
> On Friday, 25 October 2024 at 14:27:36 UTC+2, Frederick Stein wrote:
>
>> Regarding the other issues:
>> I can confirm them but cannot provide fixes for all of them, because they 
>> probably trigger bugs in ifort. Since ifort is already deprecated, these 
>> bugs will probably not be fixed. Furthermore, we do not see any of these 
>> issues on our Intel CI. I will fix what I can, but some of them will 
>> remain, as we will focus our efforts on supporting the new ifx compiler.
>>
>> On Friday, 25 October 2024 at 11:46:00 UTC+2, Frederick Stein wrote:
>>
>>> Dear Bartosz, 
>>> I will check the other issues with your regtests.
>>> Regarding your latest issue, please provide more information such as an 
>>> output file or a hint on the context. If I am supposed to retry the 
>>> calculation on my local machine, I need all additional input files such as 
>>> your plumed file. I can run your input file up to the point that CP2K needs 
>>> plumed.
>>> Best,
>>> Frederick
>>> On Friday, 25 October 2024 at 10:15:19 UTC+2, bartosz mazur wrote:
>>>
>>>> I just got another error with LibXSMM, now in my regular simulation and 
>>>> without using OpenMP. This is the error:
>>>>
>>>> ```
>>>> [1729843139.920274] [r23c01b04:2913 :0]           ib_md.c:295  UCX  ERROR ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed: Cannot allocate memory
>>>> [1729843139.920290] [r23c01b04:2913 :0]          ucp_mm.c:70   UCX  ERROR failed to register address 0x14f0b46fc080 (host) length 7424 on md[4]=mlx5_0: Input/output error (md supports: host)
>>>> [1729843139.932647] [r23c01b04:2945 :0]           ib_md.c:295  UCX  ERROR ibv_reg_mr(address=0x1491f069e040, length=8128, access=0xf) failed: Cannot allocate memory
>>>> [1729843139.932660] [r23c01b04:2945 :0]          ucp_mm.c:70   UCX  ERROR failed to register address 0x1491f069e040 (host) length 8128 on md[4]=mlx5_0: Input/output error (md supports: host)
>>>>
>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>> CLX/DP      TRY    JIT    STA    COL
>>>>    0..13      4      4      0      0
>>>>   14..23      4      4      0      0
>>>>   24..64      0      0      0      0
>>>> Registry and code: 13 MB + 80 KB (gemm=8)
>>>> Command (PID=2913): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i cp2k.inp -o cp2k.out
>>>> Uptime: 407633.177169 s
>>>> ```
>>>>
>>>> and this is the simulation input I'm using:
>>>>
>>>> ```
>>>> &GLOBAL
>>>>   PROJECT uam1o_npt_rms
>>>>   RUN_TYPE MD
>>>>   PRINT_LEVEL LOW
>>>>   PREFERRED_DIAG_LIBRARY SCALAPACK
>>>> &END GLOBAL
>>>>
>>>> &FORCE_EVAL
>>>>   METHOD QUICKSTEP
>>>>   STRESS_TENSOR ANALYTICAL
>>>>   &DFT
>>>>     BASIS_SET_FILE_NAME BASIS_MOLOPT_UZH
>>>>     POTENTIAL_FILE_NAME POTENTIAL_UZH
>>>>     &MGRID
>>>>       CUTOFF 500
>>>>     &END MGRID
>>>>     &XC
>>>>       &XC_FUNCTIONAL PBE
>>>>       &END XC_FUNCTIONAL
>>>>       &VDW_POTENTIAL
>>>>         POTENTIAL_TYPE PAIR_POTENTIAL
>>>>         &PAIR_POTENTIAL
>>>>           TYPE  DFTD3(BJ)
>>>>           PARAMETER_FILE_NAME  dftd3.dat
>>>>           REFERENCE_FUNCTIONAL PBE
>>>>           R_CUTOFF  25.0
>>>>         &END PAIR_POTENTIAL
>>>>       &END VDW_POTENTIAL
>>>>     &END XC
>>>>   &END DFT
>>>>
>>>>   &SUBSYS
>>>>     &CELL
>>>>       A      12.2807999       0.0000000       0.0000000
>>>>       B       7.6258602       9.6257200       0.0000000
>>>>       C      -2.1557724      -1.0420258      18.0042801
>>>>     &END CELL
>>>>     &COORD
>>>>       Zn      11.37811      4.60286      0.24515
>>>>       Zn       8.15435      3.05288      8.74518
>>>>       Zn       6.37590      3.97311     17.74650
>>>>       Zn       9.59842      5.54014      9.24747
>>>>       S       11.79344      6.72692     17.10850
>>>>       S        4.06825      3.00573      9.90358
>>>>       S        5.95830      1.84422      0.90027
>>>>       S       13.67407      5.58944      8.10767
>>>>       O       10.72408      3.58291      1.89315
>>>>       O        8.51986      4.01962      1.53085
>>>>       O        6.60135      3.91587      7.68572
>>>>       O        7.74637      5.79259      8.21600
>>>>       O       15.32810      8.58246      5.10041
>>>>       O        9.35608      2.93551      7.09500
>>>>       O       10.38999      4.93007      7.45977
>>>>       O       11.66491      6.35111      1.31266
>>>>       O        9.48582      6.62478      0.77364
>>>>       O        2.59062      2.40094      3.91496
>>>>       O        7.03031      4.99173     16.09885
>>>>       O        9.23544      4.56122     16.46252
>>>>       O       11.14602      4.67776     10.31440
>>>>       O       10.00982      2.79915      9.77218
>>>>       O        2.41388      0.01898     12.91899
>>>>       O        8.39375      5.66143     10.89628
>>>>       O        7.36998      3.66087     10.53589
>>>>       O        6.08863      2.22161     16.68336
>>>>       O        8.26988      1.95313     17.21650
>>>>       O       15.16937      6.16381     14.09906
>>>>       N       13.25907      3.80728      0.04001
>>>>       N        2.36335     -0.74130     17.33402
>>>>       N        7.60676      1.08576      8.95623
>>>>       N       15.77729      5.75974      9.67861
>>>>       N        4.49430      4.76652     17.95756
>>>>       N       15.38873      9.31230      0.67467
>>>>       N       10.14308      7.50848      9.04236
>>>>       N        1.96529      2.83557      8.33233
>>>>       C        6.76554      5.18292      7.68414
>>>>       C       14.28210      4.11624      0.86006
>>>>       C        9.47998      3.39622      2.09658
>>>>       C        3.20112      3.42080      0.84626
>>>>       C        9.91466      1.18589      3.17244
>>>>       C        9.08210      2.29987      3.02657
>>>>       C        5.74710      6.04945      7.01821
>>>>       C        7.83265      2.30920      3.66005
>>>>       C        3.35793      2.34328     -0.04029
>>>>       C        4.51663      1.46385     -0.02755
>>>>       C       16.24194      7.75266      5.73606
>>>>       C        4.78940      5.52817      6.14198
>>>>       C        7.40810      1.21174      4.39947
>>>>       C       16.18016      6.38244      5.49010
>>>>       C        9.48869      0.06986      3.88005
>>>>       C       11.27238      1.77457     17.14330
>>>>       C        5.77166      7.43009      7.27236
>>>>       C       11.14819      8.24901     17.58588
>>>>       C        8.22170      0.08058      4.47135
>>>>       C        0.15087      1.02286     17.07544
>>>>       C       17.16180      8.28565      6.64351
>>>>       C       10.57067      7.01060      1.31282
>>>>       C        6.72654      0.47459      8.14002
>>>>       C       10.27972      3.79035      6.89470
>>>>       C       14.15006      8.72843      8.15880
>>>>       C       11.73751      2.06868      5.82537
>>>>       C       11.38838      3.41515      5.96966
>>>>       C       10.52304      8.34339      1.98566
>>>>       C       12.16584      4.39562      5.33967
>>>>       C       14.89762      7.93801      9.04648
>>>>       C       14.86698      6.48365      9.03575
>>>>       C        2.67167      1.17044      3.27681
>>>>       C       11.52468      8.76552      2.86608
>>>>       C       13.29140      4.04007      4.60622
>>>>       C        3.78230      0.36534      3.52266
>>>>       C       12.87823      1.70260      5.12344
>>>>       C        8.27761      0.34001      9.85941
>>>>       C        9.42677      9.18364      1.73295
>>>>       C        3.27553      4.45658      9.42657
>>>>       C       13.66559      2.69775      4.53650
>>>>       C       15.77023      8.59069      9.93240
>>>>       C        1.68356      0.78491      2.36643
>>>>       C       10.98451      3.41041     10.31327
>>>>       C        3.46873      4.45681     17.14097
>>>>       C        8.27403      5.18373     15.89814
>>>>       C       14.54907      5.15099     17.15930
>>>>       C        7.83119      7.39584     14.82858
>>>>       C        8.66916      6.28563     14.97331
>>>>       C       11.99928      2.54577     10.98702
>>>>       C        9.92072      6.28547     14.34388
>>>>       C       16.54982      7.26986      0.04271
>>>>       C       15.39103      8.14919      0.03189
>>>>       C        1.50023      0.84646     12.27989
>>>>       C       12.95126      3.06908     11.86817
>>>>       C       10.34198      7.38826     13.61070
>>>>       C        1.55836      2.21699     12.52561
>>>>       C        8.25354      8.51697     14.12666
>>>>       C        6.48249      6.79770      0.85630
>>>>       C       11.97760      1.16465     10.73446
>>>>       C        6.60385      0.32218      0.42301
>>>>       C        9.52282      8.51550     13.54043
>>>>       C       17.60321      7.54791      0.92891
>>>>       C        0.58530      0.31102     11.36884
>>>>       C        7.18362      1.56332     16.68291
>>>>       C       11.01926      8.11905      9.86341
>>>>       C        7.47582      4.80132     11.10039
>>>>       C        3.59282     -0.13430      9.84955
>>>>       C        6.01179      6.51430     12.17471
>>>>       C        6.36853      5.17005     12.02942
>>>>       C        7.23131      0.22715     16.01652
>>>>       C        5.59963      4.18477     12.66234
>>>>       C        2.84614      0.65728      8.96213
>>>>       C        2.87561      2.11161      8.97508
>>>>       C       15.08536      7.39548     14.73440
>>>>       C        6.23001     -0.19920     15.13769
>>>>       C        4.47482      4.53325     13.40042
>>>>       C       13.97400      8.19851     14.48576
>>>>       C        4.87173      6.87322     12.88120
>>>>       C        9.47231      8.25578      8.14046
>>>>       C        8.32790     -0.61137     16.27301
>>>>       C       14.46698      4.13864      8.58475
>>>>       C        4.09294      5.87331     13.47165
>>>>       C        1.97640      0.00563      8.07267
>>>>       C       16.07240      7.78504     15.64417
>>>>       H       14.10215      4.93465      1.55678
>>>>       H        3.98110      3.68721      1.55899
>>>>       H       10.89072      1.19647      2.69205
>>>>       H        7.19958      3.19021      3.56839
>>>>       H        4.75923      4.45384      5.96230
>>>>       H        6.45299      1.21835      4.92062
>>>>       H       15.44211      6.00062      4.78824
>>>>       H       17.75043      8.81610      3.97156
>>>>       H       10.41563      1.57993     16.49923
>>>>       H        6.49332      7.81303      7.99143
>>>>       H        0.24800      0.19739     16.37425
>>>>       H        9.53586     -0.26872      6.84508
>>>>       H        6.19685      1.12218      7.44173
>>>>       H       13.45550      8.28133      7.44815
>>>>       H       11.11633      1.31384      6.30260
>>>>       H       11.87413      5.44074      5.42962
>>>>       H       12.38442      8.12016      3.04474
>>>>       H       13.88694      4.78876      4.08791
>>>>       H        4.53915      0.70283      4.22717
>>>>       H        0.88557      0.65625      5.03328
>>>>       H        8.96418      0.89159     10.50060
>>>>       H        8.67994      8.85961      1.01083
>>>>       H       16.35704      8.00331     10.63471
>>>>       H       13.12606      1.45212      2.16563
>>>>       H        3.64702      3.63930     16.44281
>>>>       H       13.76743      4.88477     16.44833
>>>>       H        6.85355      7.37827     15.30535
>>>>       H       10.55820      5.40745     14.43410
>>>>       H       12.97886      4.14375     12.04672
>>>>       H       11.29905      7.38966     13.09313
>>>>       H        2.29216      2.60091     13.23073
>>>>       H       -0.01303     -0.23279     14.03603
>>>>       H        7.34113      6.99275      1.49776
>>>>       H       11.26049      0.78023     10.01184
>>>>       H       17.50743      8.37258      1.63130
>>>>       H        8.21398      8.86531     11.16822
>>>>       H       11.54834      7.47018     10.56097
>>>>       H        4.28503      0.31205     10.56295
>>>>       H        6.62643      7.27289     11.69479
>>>>       H        5.89748      3.14154     12.57118
>>>>       H        5.36986      0.44461     14.95599
>>>>       H        3.88656      3.78035     13.92095
>>>>       H       13.21826      7.85764     13.78163
>>>>       H       16.85773      7.91771     12.97237
>>>>       H        8.78884      7.70469      7.49554
>>>>       H        9.07452     -0.28399     16.99402
>>>>       H        1.39009      0.59398      7.37083
>>>>       H        4.63062      7.11938     15.84758
>>>>     &END COORD
>>>>     &KIND Zn
>>>>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q12
>>>>       POTENTIAL GTH-PBE-q12
>>>>     &END KIND
>>>>     &KIND S
>>>>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
>>>>       POTENTIAL GTH-PBE-q6
>>>>     &END KIND
>>>>     &KIND O
>>>>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
>>>>       POTENTIAL GTH-PBE-q6
>>>>     &END KIND
>>>>     &KIND N
>>>>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q5
>>>>       POTENTIAL GTH-PBE-q5
>>>>     &END KIND
>>>>     &KIND C
>>>>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q4
>>>>       POTENTIAL GTH-PBE-q4
>>>>     &END KIND
>>>>     &KIND H
>>>>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q1
>>>>       POTENTIAL GTH-PBE-q1
>>>>     &END KIND
>>>>   &END SUBSYS
>>>> &END FORCE_EVAL
>>>>
>>>> &MOTION
>>>>   &MD
>>>>     ENSEMBLE NPT_I
>>>>     TEMPERATURE 298
>>>>     TIMESTEP 1.0
>>>>     STEPS 50000
>>>>     &THERMOSTAT
>>>>       TYPE NOSE
>>>>       &NOSE
>>>>         LENGTH 3
>>>>         YOSHIDA 3
>>>>         TIMECON 1000
>>>>       &END NOSE
>>>>     &END THERMOSTAT
>>>>     &BAROSTAT
>>>>       PRESSURE 1.0
>>>>       TIMECON 4000
>>>>     &END BAROSTAT
>>>>   &END MD
>>>>   &FREE_ENERGY
>>>>     METHOD METADYN
>>>>     &METADYN
>>>>       USE_PLUMED .TRUE.
>>>>       PLUMED_INPUT_FILE plumed.dat
>>>>     &END METADYN
>>>>   &END FREE_ENERGY
>>>>   &PRINT
>>>>     &TRAJECTORY
>>>>       &EACH
>>>>         MD 5
>>>>       &END EACH
>>>>     &END TRAJECTORY
>>>>     &FORCES
>>>>       UNIT eV*angstrom^-1
>>>>       &EACH
>>>>         MD 5
>>>>       &END EACH
>>>>     &END FORCES
>>>>     &CELL
>>>>       &EACH
>>>>         MD 5
>>>>       &END EACH
>>>>     &END CELL
>>>>   &END PRINT
>>>> &END MOTION
>>>> ```
>>>>
>>>> This simulation was performed with the previous version of CP2K (so 
>>>> without your fix). 
>>>> On Friday, 25 October 2024 at 09:50:47 UTC+2, bartosz mazur wrote:
>>>>
>>>>> Hi Frederick, 
>>>>>
>>>>> It helped with most of the tests! Now only 13 fail. In the 
>>>>> attachments you will find the full output from the regtests, and here 
>>>>> is the output from a single job with TRACE enabled:
>>>>>
>>>>> ```
>>>>> Loading intel/2024a
>>>>>   Loading requirement: GCCcore/13.3.0 zlib/1.3.1-GCCcore-13.3.0
>>>>>     binutils/2.42-GCCcore-13.3.0 intel-compilers/2024.2.0
>>>>>     numactl/2.0.18-GCCcore-13.3.0 UCX/1.16.0-GCCcore-13.3.0
>>>>>     impi/2021.13.0-intel-compilers-2024.2.0 imkl/2024.2.0 iimpi/2024a
>>>>>     imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>
>>>>> Currently Loaded Modulefiles:
>>>>>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>>>>>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>>>>>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>>>>>  6) UCX/1.16.0-GCCcore-13.3.0
>>>>> 2 MPI processes with 2 OpenMP threads each
>>>>> started at Fri Oct 25 09:34:34 CEST 2024 in /lustre/tmp/slurm/3127182
>>>>> SIRIUS 7.6.1, git hash: https://api.github.com/repos/electronic-structure/SIRIUS/git/ref/tags/v7.6.1
>>>>> Warning! Compiled in 'debug' mode with assert statements enabled!
>>>>>
>>>>>
>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>    0..13      8      8      0      0 
>>>>>   14..23      0      0      0      0 
>>>>>   24..64      0      0      0      0 
>>>>> Registry and code: 13 MB + 64 KB (gemm=8)
>>>>> Command (PID=423503): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i dftd3src1.inp -o dftd3src1.out
>>>>> Uptime: 2.752513 s
>>>>>
>>>>>
>>>>>
>>>>> ===================================================================================
>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>> =   RANK 0 PID 423503 RUNNING AT r21c01b03
>>>>>
>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>
>>>>> ===================================================================================
>>>>>
>>>>>
>>>>> ===================================================================================
>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>> =   RANK 1 PID 423504 RUNNING AT r21c01b03
>>>>>
>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>
>>>>> ===================================================================================
>>>>> finished at Fri Oct 25 09:34:39 CEST 2024
>>>>> ```
>>>>>
>>>>> and the last lines:
>>>>>
>>>>> ```
>>>>>  000000:000002<<                                  13      3 mp_sendrecv_dm2  0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13      4 mp_sendrecv_dm2  start Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13      4 mp_sendrecv_dm2  0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12      2 pw_nn_compose_r  0.003 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11      1 xc_pw_derive  0.003 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11      5 pw_zero  start Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11      5 pw_zero  0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11      2 xc_pw_derive  start Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12      3 pw_nn_compose_r  start Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13      5 mp_sendrecv_dm2  start Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13      5 mp_sendrecv_dm2  0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13      6 mp_sendrecv_dm2  start Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13      6 mp_sendrecv_dm2  0.000 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12      3 pw_nn_compose_r  0.002 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11      2 xc_pw_derive  0.002 Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11      6 pw_zero  start Hostmem: 955 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11      6 pw_zero  0.001 Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11      3 xc_pw_derive  start Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12      4 pw_nn_compose_r  start Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13      7 mp_sendrecv_dm2  start Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13      7 mp_sendrecv_dm2  0.000 Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002>>                                  13      8 mp_sendrecv_dm2  start Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002<<                                  13      8 mp_sendrecv_dm2  0.000 Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12      4 pw_nn_compose_r  0.002 Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11      3 xc_pw_derive  0.002 Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11      1 pw_spline_scale_deriv  start Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11      1 pw_spline_scale_deriv  0.001 Hostmem: 960 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11     20 pw_pool_give_back_pw  start Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11     20 pw_pool_give_back_pw  0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11     21 pw_pool_give_back_pw  start Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11     21 pw_pool_give_back_pw  0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11     22 pw_pool_give_back_pw  start Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11     22 pw_pool_give_back_pw  0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11     23 pw_pool_give_back_pw  start Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11     23 pw_pool_give_back_pw  0.000 Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002>>                            11      1 xc_functional_eval  start Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002>>                               12      1 b97_lda_eval  start Hostmem: 965 MB GPUmem: 0 MB
>>>>>  000000:000002<<                               12      1 b97_lda_eval  0.103 Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002<<                            11      1 xc_functional_eval  0.103 Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002<<                         10      1 xc_rho_set_and_dset_create  0.120 Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002>>                         10      1 check_for_derivatives  start Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002<<                         10      1 check_for_derivatives  0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002>>                         10     14 pw_create_r3d  start Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002<<                         10     14 pw_create_r3d  0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002>>                         10     15 pw_create_r3d  start Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002<<                         10     15 pw_create_r3d  0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002>>                         10     16 pw_create_r3d  start Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002<<                         10     16 pw_create_r3d  0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002>>                         10     17 pw_create_r3d  start Hostmem: 979 MB GPUmem: 0 MB
>>>>>  000000:000002<<                         10     17 pw_create_r3d  0.000 Hostmem: 979 MB GPUmem: 0 MB
>>>>> ```
>>>>>
>>>>> Best
>>>>> Bartosz
>>>>>
>>>>> On Wednesday, 23 October 2024 at 09:15:33 UTC+2, Frederick Stein wrote:
>>>>>
>>>>>> Dear Bartosz,
>>>>>> My fix is merged. Can you switch to the CP2K master and try it again? 
>>>>>> We are still working on a few issues with the Intel compilers so that 
>>>>>> we can eventually migrate from ifort to ifx.
>>>>>> Best,
>>>>>> Frederick
>>>>>>
>>>>>> On Tuesday, 22 October 2024 at 17:45:21 UTC+2, bartosz mazur wrote:
>>>>>>
>>>>>>> Great! Thank you for your help. 
>>>>>>>
>>>>>>> Best
>>>>>>> Bartosz
>>>>>>>
>>>>>>> On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:
>>>>>>>
>>>>>>>> I have a fix for it. Contrary to what I first thought, it is a case 
>>>>>>>> of invalid type conversion from real to complex numbers (yes, Fortran is 
>>>>>>>> rather strict about it) in pw_derive. The same problem may also be present 
>>>>>>>> in a few other spots. I am currently running more tests and I will open a 
>>>>>>>> pull request within the next few days.
>>>>>>>> Best,
>>>>>>>> Frederick
>>>>>>>>
>>>>>>>> On Tuesday, 22 October 2024 at 13:12:49 UTC+2, Frederick Stein wrote:
>>>>>>>>
>>>>>>>>> I can reproduce the error locally. I am investigating it now.
>>>>>>>>>
>>>>>>>>> On Tuesday, 22 October 2024 at 11:58:57 UTC+2, bartosz mazur wrote:
>>>>>>>>>
>>>>>>>>>> I was loading it as it was needed for compilation. I have 
>>>>>>>>>> unloaded the module, but the error still occurs: 
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>>>    0..13      2      2      0      0 
>>>>>>>>>>   14..23      0      0      0      0 
>>>>>>>>>>   24..64      0      0      0      0 
>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>> Command (PID=15485): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>>> Uptime: 1.757102 s
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>> =   RANK 0 PID 15485 RUNNING AT r30c01b01
>>>>>>>>>>
>>>>>>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>> =   RANK 1 PID 15486 RUNNING AT r30c01b01
>>>>>>>>>>
>>>>>>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> and the last 100 lines:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>>  000000:000002>>                            11     37 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                            11     37 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                         10     64 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                         10     25 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                         10     25 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                         10     17 pw_axpy  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                         10     17 pw_axpy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                         10     19 mp_sum_d  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                         10     19 mp_sum_d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                         10      3 pw_poisson_solve  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                            11      3 pw_poisson_rebuild  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                            11      3 pw_poisson_rebuild  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                            11     65 pw_pool_create_pw  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12     38 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12     38 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                            11     65 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                            11     26 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                            11     26 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                            11      3 pw_multiply_with  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                            11      3 pw_multiply_with  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                            11     27 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                            11     27 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                            11      3 pw_integral_ab  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12     20 mp_sum_d  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12     20 mp_sum_d  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                            11      3 pw_integral_ab  0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                            11      4 pw_poisson_set  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12     66 pw_pool_create_pw  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                                  13     39 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                                  13     39 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12     66 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12     28 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12     28 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12      7 pw_derive  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12      7 pw_derive  0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12     67 pw_pool_create_pw  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                                  13     40 pw_create_c1d  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                                  13     40 pw_create_c1d  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12     67 pw_pool_create_pw  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12     29 pw_copy  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12     29 pw_copy  0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12      8 pw_derive  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12      8 pw_derive  0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12     68 
>>>>>>>>>> pw_pool_create_pw      
>>>>>>>>>>   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                                  13     41 
>>>>>>>>>> pw_create_c1d       
>>>>>>>>>>  start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                                  13     41 
>>>>>>>>>> pw_create_c1d       
>>>>>>>>>>  0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12     68 
>>>>>>>>>> pw_pool_create_pw      
>>>>>>>>>>   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12     30 pw_copy 
>>>>>>>>>>       start Hos
>>>>>>>>>>  tmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                               12     30 pw_copy 
>>>>>>>>>>       0.001 Hos
>>>>>>>>>>  tmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                               12      9 
>>>>>>>>>> pw_derive       start H
>>>>>>>>>>  ostmem: 697 MB GPUmem: 0 MB
>>>>>>>>>>  ```
>>>>>>>>>>
>>>>>>>>>> This is the list of currently loaded modules (all come with 
>>>>>>>>>> intel):
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> Currently Loaded Modulefiles:
>>>>>>>>>>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>>>>>>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>>>>>>>>>>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>>>>>>>>>>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>>>>>>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>>>>>>>>>>  6) UCX/1.16.0-GCCcore-13.3.0
>>>>>>>>>> ```
>>>>>>>>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>> I am currently running some tests with the latest Intel compiler 
>>>>>>>>>>> myself. What bothers me about your setup is the module GCC13/13.3.0. Why 
>>>>>>>>>>> is it loaded? Can you unload it? This would at least reduce potential 
>>>>>>>>>>> interference between the Intel and the GCC compilers.
>>>>>>>>>>> Best,
>>>>>>>>>>> Frederick
>>>>>>>>>>>
>>>>>>>>>>> On Monday, 21 October 2024 at 16:33:45 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The error for ssmp is:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>>>>>    0..13      4      4      0      0 
>>>>>>>>>>>>   14..23      0      0      0      0 
>>>>>>>>>>>>   24..64      0      0      0      0 
>>>>>>>>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>>>>>>>>> Command (PID=54845): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>>>>> Uptime: 2.861583 s
>>>>>>>>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36: 54845 Segmentation fault      (core dumped) /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>>>>> ```
>>>>>>>>>>>>
>>>>>>>>>>>> and the last 100 lines of output:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>>  000000:000001>>                               12     20 mp_sum_d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12     20 mp_sum_d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                            11     13 dbcsr_dot_sd       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                         10     12 calculate_ptrace_kp       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                       9      6 evaluate_core_matrix_traces       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                       9      6 rebuild_ks_matrix       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                         10      6 qs_ks_build_kohn_sham_matrix       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                            11    140 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12     79 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12     79 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                            11    140 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                            11    141 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12     80 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12     80 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                            11    141 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                            11     61 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                            11     61 pw_copy       0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                            11     35 pw_axpy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                            11     35 pw_axpy       0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                            11      6 pw_poisson_solve       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12      6 pw_poisson_rebuild       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12      6 pw_poisson_rebuild       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12    142 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                  13     81 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                                  13     81 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12    142 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12     62 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12     62 pw_copy       0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12      6 pw_multiply_with       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12      6 pw_multiply_with       0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12     63 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12     63 pw_copy       0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12      6 pw_integral_ab       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                               12      6 pw_integral_ab       0.005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                               12      7 pw_poisson_set       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                  13    143 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                     14     82 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                                     14     82 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                                  13    143 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                  13     64 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                                  13     64 pw_copy       0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                  13     16 pw_derive       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                                  13     16 pw_derive       0.006 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                  13    144 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                     14     83 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                                     14     83 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                                  13    144 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                  13     65 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001<<                                  13     65 pw_copy       0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000001>>                                  13     17 pw_derive       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>>>> ```
>>>>>>>>>>>>
>>>>>>>>>>>> For psmp, the last 100 lines are:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>>  000000:000002<<                       9      7 evaluate_core_matrix_traces       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                       9      7 rebuild_ks_matrix       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                         10      7 qs_ks_build_kohn_sham_matrix       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                            11    164 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12     93 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                               12     93 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                            11    164 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                            11    165 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12     94 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                               12     94 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                            11    165 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                            11     73 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                            11     73 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                            11     41 pw_axpy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                            11     41 pw_axpy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                            11     52 mp_sum_d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                            11     52 mp_sum_d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                            11      7 pw_poisson_solve       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12      7 pw_poisson_rebuild       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                               12      7 pw_poisson_rebuild       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12    166 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                  13     95 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                  13     95 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                               12    166 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12     74 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                               12     74 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12      7 pw_multiply_with       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                               12      7 pw_multiply_with       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12     75 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                               12     75 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12      7 pw_integral_ab       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                  13     53 mp_sum_d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                  13     53 mp_sum_d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                               12      7 pw_integral_ab       0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                               12      8 pw_poisson_set       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                  13    167 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                     14     96 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                     14     96 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                  13    167 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                  13     76 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                  13     76 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                  13     19 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                  13     19 pw_derive       0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                     14     97 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                     14     97 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                  13     77 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002<<                                  13     77 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>  000000:000002>>                                  13     20 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>> ```
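When only the tail of a TRACE log is available, the routines that were entered but never left can be listed mechanically instead of by eye. Below is a minimal sketch in Python, assuming one trace record per line as in the output file itself, and assuming the first six-digit field identifies the process. The name `open_calls` and the regular expression are illustrative only and not part of CP2K.

```python
import re
from collections import defaultdict

# One trace record per line, for example:
#  000000:000002>>    13     20 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
# Assumption: the first numeric field identifies the process; the second is ignored.
RECORD = re.compile(
    r"^\s*(\d+):(\d+)(>>|<<)\s*(\d+)\s+(\d+)\s+(\S+)\s+(start|[\d.]+)\s+Hostmem"
)


def open_calls(lines):
    """Return, per process, the routines entered ('>>') but not yet left ('<<').

    Routines are listed outermost first. Exit records whose entry lies
    before the start of the excerpt are ignored.
    """
    stacks = defaultdict(list)
    for line in lines:
        match = RECORD.match(line)
        if match is None:
            continue  # not a trace record
        process, _second, direction, depth, _count, name, _what = match.groups()
        stack = stacks[process]
        entry = (int(depth), name)
        if direction == ">>":
            stack.append(entry)
        elif stack and stack[-1] == entry:
            stack.pop()
    return {proc: [name for _, name in stack] for proc, stack in stacks.items()}
```

Applied to the unwrapped psmp excerpt above, this should report rebuild_ks_matrix, qs_ks_build_kohn_sham_matrix, pw_poisson_solve, pw_poisson_set and pw_derive as still open, with pw_derive innermost. Calls entered before the excerpt starts cannot be seen, so the list is only a lower bound on the real call stack.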
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>
>>>>>>>>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>> I have no idea about the issue with LibXSMM.
>>>>>>>>>>>>> Regarding the trace, I do not know either, as there is not much 
>>>>>>>>>>>>> that could break in pw_derive (it just performs multiplications) and the 
>>>>>>>>>>>>> sequence of operations is too unspecific. It may be that the code actually 
>>>>>>>>>>>>> breaks somewhere else. Can you do the same with the ssmp version and post the 
>>>>>>>>>>>>> last 100 lines? This way, we remove the asynchronicity issues for backtraces 
>>>>>>>>>>>>> with the psmp version.
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sunday, 20 October 2024 at 16:47:15 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The error is:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>>>>>>>    0..13      2      2      0      0
>>>>>>>>>>>>>>   14..23      0      0      0      0
>>>>>>>>>>>>>>   24..64      0      0      0      0
>>>>>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>>>>>> Command (PID=2607388): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>>>>>>> Uptime: 5.288243 s
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>>>> =   RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>>>>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>>>> =   RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>>>>>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and the last 20 lines:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>  000000:000002<<                                  13     76 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002>>                                  13     19 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002<<                                  13     19 pw_derive       0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002>>                                     14     97 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002<<                                     14     97 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002>>                                  13     77 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002<<                                  13     77 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>>  000000:000002>>                                  13     20 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please pick one of the failing tests, add the TRACE 
>>>>>>>>>>>>>>> keyword to the &GLOBAL section, and run the test manually. This 
>>>>>>>>>>>>>>> increases the size of the output file dramatically (to several million lines). 
>>>>>>>>>>>>>>> Can you send me the last ~20 lines of the output?
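The TRACE step described above can be scripted when several failing inputs need to be rerun. Below is a minimal sketch in Python, assuming the input contains a plain `&GLOBAL` ... `&END GLOBAL` section. The helper name `enable_trace` is hypothetical and not part of CP2K or do_regtests.py.

```python
def enable_trace(input_text):
    """Return CP2K input text with a TRACE line inside &GLOBAL.

    Assumes the section opens with a line reading exactly '&GLOBAL'
    (case-insensitive) and closes with a line starting with '&END'.
    The text is returned unchanged in content if TRACE is already set.
    """
    lines = input_text.splitlines()
    start = next(
        (i for i, ln in enumerate(lines) if ln.strip().upper() == "&GLOBAL"),
        None,
    )
    if start is None:
        raise ValueError("no &GLOBAL section found")
    # First '&END' line after the section start, or end of file if missing.
    end = next(
        (i for i in range(start + 1, len(lines))
         if lines[i].strip().upper().startswith("&END")),
        len(lines),
    )
    section = lines[start + 1:end]
    if not any(ln.upper().split()[:1] == ["TRACE"] for ln in section):
        lines.insert(start + 1, "  TRACE")
    return "\n".join(lines) + "\n"
```

After rerunning the modified input, the last lines of the output file (for example with `tail -n 20`) are the part worth posting.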
>>>>>>>>>>>>>>> On Friday, 18 October 2024 at 17:09:40 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I 
>>>>>>>>>>>>>>>> assume it makes no difference. As I mentioned in my previous message, with 
>>>>>>>>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp 
>>>>>>>>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the 
>>>>>>>>>>>>>>>> same setting; I attach an example output. 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>>>> What happens if you set the number of OpenMP threads to 1 
>>>>>>>>>>>>>>>>> (add '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of 
>>>>>>>>>>>>>>>>> the ssmp?
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Friday, 18 October 2024 at 15:37:43 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks again for your help. I have tested different 
>>>>>>>>>>>>>>>>>> simulation variants, and the problem occurs when using OpenMP: for 
>>>>>>>>>>>>>>>>>> MPI calculations without OpenMP all tests pass. I have also tested the 
>>>>>>>>>>>>>>>>>> effect of the `OMP_PROC_BIND` and `OMP_PLACES` parameters; 
>>>>>>>>>>>>>>>>>> apart from their effect on simulation time, they have no significant effect 
>>>>>>>>>>>>>>>>>> on the presence of errors. Below are the results for ssmp:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
>>>>>>>>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> and psmp:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Any ideas what I could do next to have more information 
>>>>>>>>>>>>>>>>>> about the source of the problem or maybe you see a potential solution at 
>>>>>>>>>>>>>>>>>> this stage? I would appreciate any further help. 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The 
>>>>>>>>>>>>>>>>>>> tests do not run that efficiently with such a large number of threads; 2 
>>>>>>>>>>>>>>>>>>> should be sufficient.
>>>>>>>>>>>>>>>>>>> The test results suggest that most of the functionality 
>>>>>>>>>>>>>>>>>>> may work, but without a backtrace (or similar information) it is 
>>>>>>>>>>>>>>>>>>> hard to tell why the failing tests fail. You could also try to run some of 
>>>>>>>>>>>>>>>>>>> the single-node tests to assess the stability of CP2K.
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 13:48:42 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3144902.out
Type: application/octet-stream
Size: 9226 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0005.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3127239.out
Type: application/octet-stream
Size: 8643 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0006.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3164366.out
Type: application/octet-stream
Size: 8697 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0007.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3117616.out
Type: application/octet-stream
Size: 23776 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0008.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slurm-3098731.out
Type: application/octet-stream
Size: 8453 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20241120/f3dceb3c/attachment-0009.obj>

