[CP2K-user] [CP2K:20818] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types

Frederick Stein f.stein at hzdr.de
Fri Oct 25 09:46:00 UTC 2024


Dear Bartosz, 
I will check the other issues with your regtests.
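
For the remaining regtest failures, a TRACE output can be reduced to the routines that were entered but never left, which tends to be a quick hint at where a segmentation fault occurred. Below is a minimal Python sketch, assuming the trace line layout seen in the excerpts quoted further down (two numeric prefix fields, `>>` for entry or `<<` for exit, nesting depth, call count, routine name). The names `TRACE_LINE` and `open_routines` are illustrative only, and the sample lines are re-joined by hand from the wrapped H2O-9 excerpt:

```python
import re

# One trace line, as inferred from the excerpts in this thread (an assumption,
# not an official format description):
#   <prefix1>:<prefix2><marker> <depth> <call count> <routine> <start|seconds> Hostmem: ...
TRACE_LINE = re.compile(
    r"^\s*(\d+):(\d+)(>>|<<)\s+(\d+)\s+(\d+)\s+(\S+)\s+(start|[\d.]+)"
)


def open_routines(lines):
    """Return, per first prefix field, the (depth, routine) pairs entered but not exited."""
    stacks = {}
    for line in lines:
        m = TRACE_LINE.match(line)
        if m is None:
            continue  # skip anything that is not a trace line
        prefix, marker = m.group(1), m.group(3)
        depth, name = int(m.group(4)), m.group(6)
        stack = stacks.setdefault(prefix, [])
        if marker == ">>":
            stack.append((depth, name))
        elif stack and stack[-1] == (depth, name):
            stack.pop()
        # Exits without a matching entry (excerpt starts mid-call) are ignored.
    return stacks


# Sample lines re-joined from the quoted H2O-9 trace.
sample = [
    " 000000:000002>>  12     30 pw_copy       start Hostmem: 697 MB GPUmem: 0 MB",
    " 000000:000002<<  12     30 pw_copy       0.001 Hostmem: 697 MB GPUmem: 0 MB",
    " 000000:000002>>  12      9 pw_derive       start Hostmem: 697 MB GPUmem: 0 MB",
]
print(open_routines(sample))
```

On a full output file the same function can be fed with `open(path)`; the innermost entry of each list is the routine that was active when the trace stopped.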
Regarding your latest issue, please provide more information such as an 
output file or a hint on the context. If I am supposed to retry the 
calculation on my local machine, I need all additional input files such as 
your plumed file. I can run your input file up to the point that CP2K needs 
plumed.
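
As a side note on the UCX messages: `ibv_reg_mr(...) failed: Cannot allocate memory` is often associated with a low locked-memory limit (`ulimit -l`) on the compute node, although without further output that is only a guess for this case. Here is a minimal Python sketch for reporting that limit from inside the job script, where `memlock_limit` is a hypothetical helper name:

```python
import resource


def memlock_limit():
    """Return the soft and hard RLIMIT_MEMLOCK values as readable strings."""

    def fmt(value):
        # RLIMIT_MEMLOCK is reported in bytes; RLIM_INFINITY means no limit.
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} KiB"

    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return fmt(soft), fmt(hard)


soft, hard = memlock_limit()
print(f"max locked memory: soft={soft}, hard={hard}")
```

The value should be read on the node that runs the calculation, since login nodes and batch jobs can have different limits.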
Best,
Frederick
On Friday, 25 October 2024 at 10:15:19 UTC+2, bartosz mazur wrote:

> I just got another error with LibXSMM, now in my regular simulation and 
> without using OpenMP. This is the error:
>
> ```
> [1729843139.920274] [r23c01b04:2913 :0]           ib_md.c:295  UCX  ERROR 
> ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed: Cannot 
> allocate memory
> [1729843139.920290] [r23c01b04:2913 :0]          ucp_mm.c:70   UCX  ERROR 
> failed to register address 0x14f0b46fc080 (host) length 7424 on 
> md[4]=mlx5_0: Input/output error (md supports: host)
>
> LIBXSMM_VERSION: develop-1.17-3834 (25693946)[1729843139.932647] 
> [r23c01b04:2945 :0]           ib_md.c:295  UCX  ERROR 
> ibv_reg_mr(address=0x1491f069e040, length=8128, access=0xf) failed: Cannot 
> allocate memory
> [1729843139.932660] [r23c01b04:2945 :0]          ucp_mm.c:70   UCX  ERROR 
> failed to register address 0x1491f069e040 (host) length 8128 on 
> md[4]=mlx5_0: Input/output error (md supports: host)
>
>
> CLX/DP      TRY    JIT    STA    COL
>    0..13      4      4      0      0
>   14..23      4      4      0      0
>
>   24..64      0      0      0      0
> Registry and code: 13 MB + 80 KB (gemm=8)
> Command (PID=2913): 
> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i 
> cp2k.inp -o cp2k.out
> Uptime: 407633.177169 s
> ```
>
> and this is the simulation input I'm using:
>
> ```
> &GLOBAL
>   PROJECT uam1o_npt_rms
>   RUN_TYPE MD
>   PRINT_LEVEL LOW
>   PREFERRED_DIAG_LIBRARY SCALAPACK
> &END GLOBAL
>
> &FORCE_EVAL
>   METHOD QUICKSTEP
>   STRESS_TENSOR ANALYTICAL
>   &DFT
>     BASIS_SET_FILE_NAME BASIS_MOLOPT_UZH
>     POTENTIAL_FILE_NAME POTENTIAL_UZH
>     &MGRID
>       CUTOFF 500
>     &END MGRID
>     &XC
>       &XC_FUNCTIONAL PBE
>       &END XC_FUNCTIONAL
>       &VDW_POTENTIAL
>         POTENTIAL_TYPE PAIR_POTENTIAL
>         &PAIR_POTENTIAL
>           TYPE  DFTD3(BJ)
>           PARAMETER_FILE_NAME  dftd3.dat
>           REFERENCE_FUNCTIONAL PBE
>           R_CUTOFF  25.0
>         &END PAIR_POTENTIAL
>       &END VDW_POTENTIAL
>     &END XC
>   &END DFT
>
>   &SUBSYS
>     &CELL
>       A      12.2807999       0.0000000       0.0000000
>       B       7.6258602       9.6257200       0.0000000
>       C      -2.1557724      -1.0420258      18.0042801
>     &END CELL
>     &COORD
>       Zn      11.37811      4.60286      0.24515
>       Zn       8.15435      3.05288      8.74518
>       Zn       6.37590      3.97311     17.74650
>       Zn       9.59842      5.54014      9.24747
>       S       11.79344      6.72692     17.10850
>       S        4.06825      3.00573      9.90358
>       S        5.95830      1.84422      0.90027
>       S       13.67407      5.58944      8.10767
>       O       10.72408      3.58291      1.89315
>       O        8.51986      4.01962      1.53085
>       O        6.60135      3.91587      7.68572
>       O        7.74637      5.79259      8.21600
>       O       15.32810      8.58246      5.10041
>       O        9.35608      2.93551      7.09500
>       O       10.38999      4.93007      7.45977
>       O       11.66491      6.35111      1.31266
>       O        9.48582      6.62478      0.77364
>       O        2.59062      2.40094      3.91496
>       O        7.03031      4.99173     16.09885
>       O        9.23544      4.56122     16.46252
>       O       11.14602      4.67776     10.31440
>       O       10.00982      2.79915      9.77218
>       O        2.41388      0.01898     12.91899
>       O        8.39375      5.66143     10.89628
>       O        7.36998      3.66087     10.53589
>       O        6.08863      2.22161     16.68336
>       O        8.26988      1.95313     17.21650
>       O       15.16937      6.16381     14.09906
>       N       13.25907      3.80728      0.04001
>       N        2.36335     -0.74130     17.33402
>       N        7.60676      1.08576      8.95623
>       N       15.77729      5.75974      9.67861
>       N        4.49430      4.76652     17.95756
>       N       15.38873      9.31230      0.67467
>       N       10.14308      7.50848      9.04236
>       N        1.96529      2.83557      8.33233
>       C        6.76554      5.18292      7.68414
>       C       14.28210      4.11624      0.86006
>       C        9.47998      3.39622      2.09658
>       C        3.20112      3.42080      0.84626
>       C        9.91466      1.18589      3.17244
>       C        9.08210      2.29987      3.02657
>       C        5.74710      6.04945      7.01821
>       C        7.83265      2.30920      3.66005
>       C        3.35793      2.34328     -0.04029
>       C        4.51663      1.46385     -0.02755
>       C       16.24194      7.75266      5.73606
>       C        4.78940      5.52817      6.14198
>       C        7.40810      1.21174      4.39947
>       C       16.18016      6.38244      5.49010
>       C        9.48869      0.06986      3.88005
>       C       11.27238      1.77457     17.14330
>       C        5.77166      7.43009      7.27236
>       C       11.14819      8.24901     17.58588
>       C        8.22170      0.08058      4.47135
>       C        0.15087      1.02286     17.07544
>       C       17.16180      8.28565      6.64351
>       C       10.57067      7.01060      1.31282
>       C        6.72654      0.47459      8.14002
>       C       10.27972      3.79035      6.89470
>       C       14.15006      8.72843      8.15880
>       C       11.73751      2.06868      5.82537
>       C       11.38838      3.41515      5.96966
>       C       10.52304      8.34339      1.98566
>       C       12.16584      4.39562      5.33967
>       C       14.89762      7.93801      9.04648
>       C       14.86698      6.48365      9.03575
>       C        2.67167      1.17044      3.27681
>       C       11.52468      8.76552      2.86608
>       C       13.29140      4.04007      4.60622
>       C        3.78230      0.36534      3.52266
>       C       12.87823      1.70260      5.12344
>       C        8.27761      0.34001      9.85941
>       C        9.42677      9.18364      1.73295
>       C        3.27553      4.45658      9.42657
>       C       13.66559      2.69775      4.53650
>       C       15.77023      8.59069      9.93240
>       C        1.68356      0.78491      2.36643
>       C       10.98451      3.41041     10.31327
>       C        3.46873      4.45681     17.14097
>       C        8.27403      5.18373     15.89814
>       C       14.54907      5.15099     17.15930
>       C        7.83119      7.39584     14.82858
>       C        8.66916      6.28563     14.97331
>       C       11.99928      2.54577     10.98702
>       C        9.92072      6.28547     14.34388
>       C       16.54982      7.26986      0.04271
>       C       15.39103      8.14919      0.03189
>       C        1.50023      0.84646     12.27989
>       C       12.95126      3.06908     11.86817
>       C       10.34198      7.38826     13.61070
>       C        1.55836      2.21699     12.52561
>       C        8.25354      8.51697     14.12666
>       C        6.48249      6.79770      0.85630
>       C       11.97760      1.16465     10.73446
>       C        6.60385      0.32218      0.42301
>       C        9.52282      8.51550     13.54043
>       C       17.60321      7.54791      0.92891
>       C        0.58530      0.31102     11.36884
>       C        7.18362      1.56332     16.68291
>       C       11.01926      8.11905      9.86341
>       C        7.47582      4.80132     11.10039
>       C        3.59282     -0.13430      9.84955
>       C        6.01179      6.51430     12.17471
>       C        6.36853      5.17005     12.02942
>       C        7.23131      0.22715     16.01652
>       C        5.59963      4.18477     12.66234
>       C        2.84614      0.65728      8.96213
>       C        2.87561      2.11161      8.97508
>       C       15.08536      7.39548     14.73440
>       C        6.23001     -0.19920     15.13769
>       C        4.47482      4.53325     13.40042
>       C       13.97400      8.19851     14.48576
>       C        4.87173      6.87322     12.88120
>       C        9.47231      8.25578      8.14046
>       C        8.32790     -0.61137     16.27301
>       C       14.46698      4.13864      8.58475
>       C        4.09294      5.87331     13.47165
>       C        1.97640      0.00563      8.07267
>       C       16.07240      7.78504     15.64417
>       H       14.10215      4.93465      1.55678
>       H        3.98110      3.68721      1.55899
>       H       10.89072      1.19647      2.69205
>       H        7.19958      3.19021      3.56839
>       H        4.75923      4.45384      5.96230
>       H        6.45299      1.21835      4.92062
>       H       15.44211      6.00062      4.78824
>       H       17.75043      8.81610      3.97156
>       H       10.41563      1.57993     16.49923
>       H        6.49332      7.81303      7.99143
>       H        0.24800      0.19739     16.37425
>       H        9.53586     -0.26872      6.84508
>       H        6.19685      1.12218      7.44173
>       H       13.45550      8.28133      7.44815
>       H       11.11633      1.31384      6.30260
>       H       11.87413      5.44074      5.42962
>       H       12.38442      8.12016      3.04474
>       H       13.88694      4.78876      4.08791
>       H        4.53915      0.70283      4.22717
>       H        0.88557      0.65625      5.03328
>       H        8.96418      0.89159     10.50060
>       H        8.67994      8.85961      1.01083
>       H       16.35704      8.00331     10.63471
>       H       13.12606      1.45212      2.16563
>       H        3.64702      3.63930     16.44281
>       H       13.76743      4.88477     16.44833
>       H        6.85355      7.37827     15.30535
>       H       10.55820      5.40745     14.43410
>       H       12.97886      4.14375     12.04672
>       H       11.29905      7.38966     13.09313
>       H        2.29216      2.60091     13.23073
>       H       -0.01303     -0.23279     14.03603
>       H        7.34113      6.99275      1.49776
>       H       11.26049      0.78023     10.01184
>       H       17.50743      8.37258      1.63130
>       H        8.21398      8.86531     11.16822
>       H       11.54834      7.47018     10.56097
>       H        4.28503      0.31205     10.56295
>       H        6.62643      7.27289     11.69479
>       H        5.89748      3.14154     12.57118
>       H        5.36986      0.44461     14.95599
>       H        3.88656      3.78035     13.92095
>       H       13.21826      7.85764     13.78163
>       H       16.85773      7.91771     12.97237
>       H        8.78884      7.70469      7.49554
>       H        9.07452     -0.28399     16.99402
>       H        1.39009      0.59398      7.37083
>       H        4.63062      7.11938     15.84758
>     &END COORD
>     &KIND Zn
>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q12
>       POTENTIAL GTH-PBE-q12
>     &END KIND
>     &KIND S
>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
>       POTENTIAL GTH-PBE-q6
>     &END KIND
>     &KIND O
>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
>       POTENTIAL GTH-PBE-q6
>     &END KIND
>     &KIND N
>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q5
>       POTENTIAL GTH-PBE-q5
>     &END KIND
>     &KIND C
>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q4
>       POTENTIAL GTH-PBE-q4
>     &END KIND
>     &KIND H
>       BASIS_SET TZVP-MOLOPT-PBE-GTH-q1
>       POTENTIAL GTH-PBE-q1
>     &END KIND
>   &END SUBSYS
> &END FORCE_EVAL
>
> &MOTION
>   &MD
>     ENSEMBLE NPT_I
>     TEMPERATURE 298
>     TIMESTEP 1.0
>     STEPS 50000
>     &THERMOSTAT
>       TYPE NOSE
>       &NOSE
>         LENGTH 3
>         YOSHIDA 3
>         TIMECON 1000
>       &END NOSE
>     &END THERMOSTAT
>     &BAROSTAT
>       PRESSURE 1.0
>       TIMECON 4000
>     &END BAROSTAT
>   &END MD
>   &FREE_ENERGY
>     METHOD METADYN
>     &METADYN
>       USE_PLUMED .TRUE.
>       PLUMED_INPUT_FILE plumed.dat
>     &END METADYN
>   &END FREE_ENERGY
>   &PRINT
>     &TRAJECTORY
>       &EACH
>         MD 5
>       &END EACH
>     &END TRAJECTORY
>     &FORCES
>       UNIT eV*angstrom^-1
>       &EACH
>         MD 5
>       &END EACH
>     &END FORCES
>     &CELL
>       &EACH
>         MD 5
>       &END EACH
>     &END CELL
>   &END PRINT
> &END MOTION
> ```
>
> This simulation was performed with the previous version of CP2K (so without 
> your fix). 
> On Friday, 25 October 2024 at 09:50:47 UTC+2, bartosz mazur wrote:
>
>> Hi Frederick, 
>>
>> it helped with most of the tests! Now only 13 have failed. In the 
>> attachments you will find the full output from the regtests, and here is the 
>> output from a single job with TRACE enabled:
>>
>> ```
>> Loading intel/2024a
>>   Loading requirement: GCCcore/13.3.0 zlib/1.3.1-GCCcore-13.3.0
>>     binutils/2.42-GCCcore-13.3.0 intel-compilers/2024.2.0
>>     numactl/2.0.18-GCCcore-13.3.0 UCX/1.16.0-GCCcore-13.3.0
>>     impi/2021.13.0-intel-compilers-2024.2.0 imkl/2024.2.0 iimpi/2024a
>>     imkl-FFTW/2024.2.0-iimpi-2024a
>>
>> Currently Loaded Modulefiles:
>>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>>  6) UCX/1.16.0-GCCcore-13.3.0
>> 2 MPI processes with 2 OpenMP threads each
>> started at Fri Oct 25 09:34:34 CEST 2024 in /lustre/tmp/slurm/3127182
>> SIRIUS 7.6.1, git hash: 
>> https://api.github.com/repos/electronic-structure/SIRIUS/git/ref/tags/v7.6.1
>> Warning! Compiled in 'debug' mode with assert statements enabled!
>>
>>
>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>> CLX/DP      TRY    JIT    STA    COL
>>    0..13      8      8      0      0 
>>   14..23      0      0      0      0 
>>   24..64      0      0      0      0 
>> Registry and code: 13 MB + 64 KB (gemm=8)
>> Command (PID=423503): 
>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i 
>> dftd3src1.inp -o dftd3src1.out
>> Uptime: 2.752513 s
>>
>>
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   RANK 0 PID 423503 RUNNING AT r21c01b03
>>
>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>
>> ===================================================================================
>>
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   RANK 1 PID 423504 RUNNING AT r21c01b03
>>
>> =   KILLED BY SIGNAL: 9 (Killed)
>>
>> ===================================================================================
>> finished at Fri Oct 25 09:34:39 CEST 2024
>> ```
>>
>> and the last lines:
>>
>> ```
>>  000000:000002<<                                  13      3 mp_sendrecv_dm2       0.000 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002>>                                  13      4 mp_sendrecv_dm2       start Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                                  13      4 mp_sendrecv_dm2       0.000 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                               12      2 pw_nn_compose_r       0.003 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                            11      1 xc_pw_derive       0.003 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002>>                            11      5 pw_zero       start Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                            11      5 pw_zero       0.000 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002>>                            11      2 xc_pw_derive       start Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002>>                               12      3 pw_nn_compose_r       start Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002>>                                  13      5 mp_sendrecv_dm2       start Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                                  13      5 mp_sendrecv_dm2       0.000 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002>>                                  13      6 mp_sendrecv_dm2       start Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                                  13      6 mp_sendrecv_dm2       0.000 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                               12      3 pw_nn_compose_r       0.002 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                            11      2 xc_pw_derive       0.002 Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002>>                            11      6 pw_zero       start Hostmem: 955 MB GPUmem: 0 MB
>>  000000:000002<<                            11      6 pw_zero       0.001 Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002>>                            11      3 xc_pw_derive       start Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002>>                               12      4 pw_nn_compose_r       start Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002>>                                  13      7 mp_sendrecv_dm2       start Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002<<                                  13      7 mp_sendrecv_dm2       0.000 Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002>>                                  13      8 mp_sendrecv_dm2       start Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002<<                                  13      8 mp_sendrecv_dm2       0.000 Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002<<                               12      4 pw_nn_compose_r       0.002 Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002<<                            11      3 xc_pw_derive       0.002 Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002>>                            11      1 pw_spline_scale_deriv       start Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002<<                            11      1 pw_spline_scale_deriv       0.001 Hostmem: 960 MB GPUmem: 0 MB
>>  000000:000002>>                            11     20 pw_pool_give_back_pw       start Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002<<                            11     20 pw_pool_give_back_pw       0.000 Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002>>                            11     21 pw_pool_give_back_pw       start Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002<<                            11     21 pw_pool_give_back_pw       0.000 Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002>>                            11     22 pw_pool_give_back_pw       start Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002<<                            11     22 pw_pool_give_back_pw       0.000 Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002>>                            11     23 pw_pool_give_back_pw       start Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002<<                            11     23 pw_pool_give_back_pw       0.000 Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002>>                            11      1 xc_functional_eval       start Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002>>                               12      1 b97_lda_eval       start Hostmem: 965 MB GPUmem: 0 MB
>>  000000:000002<<                               12      1 b97_lda_eval       0.103 Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002<<                            11      1 xc_functional_eval       0.103 Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002<<                         10      1 xc_rho_set_and_dset_create       0.120 Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002>>                         10      1 check_for_derivatives       start Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002<<                         10      1 check_for_derivatives       0.000 Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002>>                         10     14 pw_create_r3d       start Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002<<                         10     14 pw_create_r3d       0.000 Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002>>                         10     15 pw_create_r3d       start Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002<<                         10     15 pw_create_r3d       0.000 Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002>>                         10     16 pw_create_r3d       start Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002<<                         10     16 pw_create_r3d       0.000 Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002>>                         10     17 pw_create_r3d       start Hostmem: 979 MB GPUmem: 0 MB
>>  000000:000002<<                         10     17 pw_create_r3d       0.000 Hostmem: 979 MB GPUmem: 0 MB
>> ```
>>
>> Best
>> Bartosz
>>
>> On Wednesday, 23 October 2024 at 09:15:33 UTC+2, Frederick Stein wrote:
>>
>>> Dear Bartosz,
>>> My fix is merged. Can you switch to the CP2K master and try it again? We 
>>> are still working on a few issues with the Intel compilers so that we can 
>>> eventually migrate from ifort to ifx.
>>> Best,
>>> Frederick
>>>
>>> On Tuesday, 22 October 2024 at 17:45:21 UTC+2, bartosz mazur wrote:
>>>
>>>> Great! Thank you for your help. 
>>>>
>>>> Best
>>>> Bartosz
>>>>
>>>> On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:
>>>>
>>>>> I have a fix for it. In contrast to my first thought, it is a case of 
>>>>> invalid type conversion from real to complex numbers (yes, Fortran is 
>>>>> rather strict about it) in pw_derive. This may also be present in a few 
>>>>> other spots. I am currently running more tests and I will open a pull 
>>>>> request within the next few days.
>>>>> Best,
>>>>> Frederick
>>>>>
>>>>> On Tuesday, 22 October 2024 at 13:12:49 UTC+2, Frederick Stein wrote:
>>>>>
>>>>>> I can reproduce the error locally. I am investigating it now.
>>>>>>
>>>>>> On Tuesday, 22 October 2024 at 11:58:57 UTC+2, bartosz mazur wrote:
>>>>>>
>>>>>>> I was loading it as it was needed for compilation. I have unloaded 
>>>>>>> the module, but the error still occurs: 
>>>>>>>
>>>>>>> ```
>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>    0..13      2      2      0      0 
>>>>>>>   14..23      0      0      0      0 
>>>>>>>   24..64      0      0      0      0 
>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>> Command (PID=15485): 
>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i 
>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>> Uptime: 1.757102 s
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>> =   RANK 0 PID 15485 RUNNING AT r30c01b01
>>>>>>>
>>>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>>
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>> =   RANK 1 PID 15486 RUNNING AT r30c01b01
>>>>>>>
>>>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>>>
>>>>>>> ===================================================================================
>>>>>>> ```
>>>>>>>
>>>>>>>
>>>>>>> and the last 100 lines:
>>>>>>>
>>>>>>> ```
>>>>>>>  000000:000002>>                            11     37 pw_create_c1d       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                            11     37 pw_create_c1d       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                         10     64 pw_pool_create_pw       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                         10     25 pw_copy       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                         10     25 pw_copy       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                         10     17 pw_axpy       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                         10     17 pw_axpy       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                         10     19 mp_sum_d       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                         10     19 mp_sum_d       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                         10      3 pw_poisson_solve       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                            11      3 pw_poisson_rebuild       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                            11      3 pw_poisson_rebuild       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                            11     65 pw_pool_create_pw       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12     38 pw_create_c1d       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12     38 pw_create_c1d       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                            11     65 pw_pool_create_pw       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                            11     26 pw_copy       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                            11     26 pw_copy       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                            11      3 pw_multiply_with       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                            11      3 pw_multiply_with       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                            11     27 pw_copy       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                            11     27 pw_copy       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                            11      3 pw_integral_ab       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12     20 mp_sum_d       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12     20 mp_sum_d       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                            11      3 pw_integral_ab       0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                            11      4 pw_poisson_set       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12     66 pw_pool_create_pw       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                                  13     39 pw_create_c1d       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                                  13     39 pw_create_c1d       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12     66 pw_pool_create_pw       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12     28 pw_copy       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12     28 pw_copy       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12      7 pw_derive       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12      7 pw_derive       0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12     67 pw_pool_create_pw       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                                  13     40 pw_create_c1d       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                                  13     40 pw_create_c1d       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12     67 pw_pool_create_pw       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12     29 pw_copy       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12     29 pw_copy       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12      8 pw_derive       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12      8 pw_derive       0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12     68 pw_pool_create_pw       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                                  13     41 pw_create_c1d       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                                  13     41 pw_create_c1d       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12     68 pw_pool_create_pw       0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12     30 pw_copy       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002<<                               12     30 pw_copy       0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>>  000000:000002>>                               12      9 pw_derive       start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>> ```
>>>>>>>
>>>>>>> This is the list of currently loaded modules (all come with intel):
>>>>>>>
>>>>>>> ```
>>>>>>> Currently Loaded Modulefiles:
>>>>>>>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>>>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>>>>>>>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>>>>>>>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>>>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>>>>>>>  6) UCX/1.16.0-GCCcore-13.3.0
>>>>>>> ```
>>>>>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Dear Bartosz,
>>>>>>>> I am currently running some tests with the latest Intel compiler 
>>>>>>>> myself. What bothers me about your setup is the module GCC13/13.3.0. Why 
>>>>>>>> is it loaded? Can you unload it? This would at least reduce potential 
>>>>>>>> interference between the Intel and the GCC compilers.
>>>>>>>> Best,
>>>>>>>> Frederick
>>>>>>>>
>>>>>>>> bartosz mazur wrote on Monday, 21 October 2024 at 16:33:45 UTC+2:
>>>>>>>>
>>>>>>>>> The error for ssmp is:
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>>    0..13      4      4      0      0 
>>>>>>>>>   14..23      0      0      0      0 
>>>>>>>>>   24..64      0      0      0      0 
>>>>>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>>>>>> Command (PID=54845): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>> Uptime: 2.861583 s
>>>>>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36: 54845 Segmentation fault      (core dumped) /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> and the last 100 lines of output:
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>>  000000:000001>>                               12     20 mp_sum_d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12     20 mp_sum_d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                            11     13 dbcsr_dot_sd       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                         10     12 calculate_ptrace_kp       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                       9      6 evaluate_core_matrix_traces       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                       9      6 rebuild_ks_matrix       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                         10      6 qs_ks_build_kohn_sham_matrix       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                            11    140 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12     79 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12     79 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                            11    140 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                            11    141 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12     80 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12     80 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                            11    141 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                            11     61 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                            11     61 pw_copy       0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                            11     35 pw_axpy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                            11     35 pw_axpy       0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                            11      6 pw_poisson_solve       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12      6 pw_poisson_rebuild       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12      6 pw_poisson_rebuild       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12    142 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                  13     81 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                                  13     81 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12    142 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12     62 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12     62 pw_copy       0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12      6 pw_multiply_with       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12      6 pw_multiply_with       0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12     63 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12     63 pw_copy       0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12      6 pw_integral_ab       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                               12      6 pw_integral_ab       0.005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                               12      7 pw_poisson_set       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                  13    143 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                     14     82 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                                     14     82 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                                  13    143 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                  13     64 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                                  13     64 pw_copy       0.003 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                  13     16 pw_derive       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                                  13     16 pw_derive       0.006 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                  13    144 pw_pool_create_pw       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                     14     83 pw_create_c1d       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                                     14     83 pw_create_c1d       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                                  13    144 pw_pool_create_pw       0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                  13     65 pw_copy       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001<<                                  13     65 pw_copy       0.004 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>>  000000:000001>>                                  13     17 pw_derive       start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> For psmp, the last 100 lines are:
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>>  000000:000002<<                       9      7 evaluate_core_matrix_traces       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                       9      7 rebuild_ks_matrix       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                         10      7 qs_ks_build_kohn_sham_matrix       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                            11    164 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12     93 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                               12     93 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                            11    164 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                            11    165 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12     94 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                               12     94 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                            11    165 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                            11     73 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                            11     73 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                            11     41 pw_axpy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                            11     41 pw_axpy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                            11     52 mp_sum_d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                            11     52 mp_sum_d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                            11      7 pw_poisson_solve       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12      7 pw_poisson_rebuild       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                               12      7 pw_poisson_rebuild       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12    166 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                  13     95 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                  13     95 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                               12    166 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12     74 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                               12     74 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12      7 pw_multiply_with       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                               12      7 pw_multiply_with       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12     75 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                               12     75 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12      7 pw_integral_ab       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                  13     53 mp_sum_d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                  13     53 mp_sum_d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                               12      7 pw_integral_ab       0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                               12      8 pw_poisson_set       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                  13    167 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                     14     96 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                     14     96 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                  13    167 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                  13     76 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                  13     76 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                  13     19 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                  13     19 pw_derive       0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                     14     97 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                     14     97 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                  13     77 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002<<                                  13     77 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>  000000:000002>>                                  13     20 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Bartosz
>>>>>>>>>
>>>>>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick 
>>>>>>>>> Stein wrote:
>>>>>>>>>
>>>>>>>>>> Dear Bartosz,
>>>>>>>>>> I have no idea about the issue with LibXSMM.
>>>>>>>>>> Regarding the trace, I do not know either, as there is not much 
>>>>>>>>>> that could break in pw_derive (it just performs multiplications) and the 
>>>>>>>>>> sequence of operations is too unspecific. It may be that the code actually 
>>>>>>>>>> breaks somewhere else. Can you do the same with the ssmp and post the last 
>>>>>>>>>> 100 lines? This way, we remove the asynchronicity issues for backtraces 
>>>>>>>>>> with the psmp version.
>>>>>>>>>> Best,
>>>>>>>>>> Frederick
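
Finding the routines that were entered but never left in a TRACE listing can be done mechanically. The following is only a sketch under stated assumptions: each trace entry sits on a single line with any mail quote prefix already removed, `>>` marks entry and `<<` marks exit, and the two integers after the marker are read as call-stack depth and a call counter (an interpretation of the output quoted in this thread, not something confirmed in it). The name `open_routines` is hypothetical and not part of CP2K.

```python
import re

# One trace entry: "<rank>:<nranks><marker> <depth> <calls> <routine> <start|seconds> ..."
TRACE_RE = re.compile(
    r"^\s*\d+:\d+(?P<mark>[<>]{2})\s+(?P<depth>\d+)\s+(?P<calls>\d+)\s+"
    r"(?P<name>\S+)\s+(?P<status>start|[\d.]+)"
)


def open_routines(lines):
    """Return (depth, routine) pairs that were entered but not exited."""
    stack = []
    for line in lines:
        match = TRACE_RE.match(line)
        if match is None:
            continue  # not a trace entry (e.g. regular CP2K output)
        entry = (int(match["depth"]), match["name"])
        if match["mark"] == ">>":
            stack.append(entry)
        elif stack and stack[-1] == entry:
            stack.pop()
    return stack
```

Applied to a complete trace, the last element of the returned list is the innermost routine that was still running when the output stopped. On a 20- or 100-line excerpt, only routines entered inside the excerpt can be reported.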
>>>>>>>>>>
>>>>>>>>>> bartosz mazur wrote on Sunday, 20 October 2024 at 16:47:15 
>>>>>>>>>> UTC+2:
>>>>>>>>>>
>>>>>>>>>>> The error is:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>>>>    0..13      2      2      0      0
>>>>>>>>>>>   14..23      0      0      0      0
>>>>>>>>>>>   24..64      0      0      0      0
>>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>>> Command (PID=2607388): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>>>>>>> Uptime: 5.288243 s
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================================
>>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>> =   RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>>>>>
>>>>>>>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================================
>>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>> =   RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================================
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> and the last 20 lines:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>>  000000:000002<<                                  13     76 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002>>                                  13     19 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002<<                                  13     19 pw_derive       0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002>>                                     14     97 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002<<                                     14     97 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002>>                                  13     77 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002<<                                  13     77 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>>  000000:000002>>                                  13     20 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick Stein 
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Please pick one of the failing tests, add the TRACE keyword to 
>>>>>>>>>>>> the &GLOBAL section, and run the test manually. This increases the size 
>>>>>>>>>>>> of the output file dramatically (to several million lines). 
>>>>>>>>>>>> Can you send me the last ~20 lines of the output?
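
A minimal sketch of that procedure, assuming the test input contains a `&GLOBAL` line and that a bare `TRACE` line inside that section is enough to switch tracing on. The helper names `add_trace_keyword` and `last_lines` are hypothetical and not part of CP2K or its regtest scripts.

```python
def add_trace_keyword(input_text: str) -> str:
    """Return the CP2K input text with a TRACE line added right after &GLOBAL."""
    lines = input_text.splitlines()
    # Leave the input untouched if some line already starts with TRACE.
    already_set = any(line.split()[:1] == ["TRACE"] for line in lines)
    out = []
    for line in lines:
        out.append(line)
        if not already_set and line.strip().upper().startswith("&GLOBAL"):
            out.append("  TRACE")
    return "\n".join(out) + "\n"


def last_lines(text: str, n: int = 20) -> list[str]:
    """Return the last n lines of an output file's contents."""
    return text.splitlines()[-n:]
```

The modified input would then be run by hand with the same binary that failed in the regtests, and `last_lines` applied to the contents of the resulting output file.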
>>>>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at 17:09:40 
>>>>>>>>>>>> UTC+2:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I 
>>>>>>>>>>>>> assume it makes no difference. As I mentioned in my previous message, with 
>>>>>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp 
>>>>>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the 
>>>>>>>>>>>>> same setting; I provide example output as an attachment. 
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick Stein 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>> What happens if you set the number of OpenMP threads to 1 
>>>>>>>>>>>>>> (add '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of 
>>>>>>>>>>>>>> the ssmp?
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> bartosz mazur wrote on Friday, 18 October 2024 at 
>>>>>>>>>>>>>> 15:37:43 UTC+2:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks again for your help. I have tested different simulation 
>>>>>>>>>>>>>>> variants and found that the problem occurs when using OMP. For MPI 
>>>>>>>>>>>>>>> calculations without OMP all tests pass. I have also tested the effect of 
>>>>>>>>>>>>>>> the `OMP_PROC_BIND` and `OMP_PLACES` parameters; apart 
>>>>>>>>>>>>>>> from the effect on simulation time, they have no significant effect on the 
>>>>>>>>>>>>>>> presence of errors. Below are the results for ssmp:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
>>>>>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and psmp:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>>>>>> ```
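
To compare the ssmp runs on a common scale, the counts from the ssmp table above can be converted to a fraction of tests that did not pass. This is only an illustrative sketch: the numbers are copied from that table, while the column name `time_min`, the function name `failure_rates`, and counting "wrong" together with "failed" are choices made here, not something taken from the regtest script.

```python
import csv
import io

# ssmp results as reported in the thread (time in minutes).
SSMP_RESULTS = """\
OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time_min
spread, threads, 3850, 4144, 4, 290, 186
spread, cores, 3831, 4144, 3, 310, 183
spread, sockets, 3864, 4144, 3, 277, 104
close, threads, 3879, 4144, 3, 262, 171
close, cores, 3854, 4144, 0, 290, 168
close, sockets, 3865, 4144, 3, 276, 104
master, threads, 4121, 4144, 0, 23, 1002
master, cores, 4121, 4144, 0, 23, 986
master, sockets, 3942, 4144, 3, 199, 219
false, threads, 3918, 4144, 0, 226, 178
false, cores, 3919, 4144, 3, 222, 176
false, sockets, 3856, 4144, 4, 284, 104
"""


def failure_rates(table: str = SSMP_RESULTS):
    """Return (bind, places, fraction not correct), sorted from best to worst."""
    rows = csv.DictReader(io.StringIO(table), skipinitialspace=True)
    rates = []
    for row in rows:
        not_correct = int(row["wrong"]) + int(row["failed"])
        rates.append(
            (row["OMP_PROC_BIND"], row["OMP_PLACES"], not_correct / int(row["total"]))
        )
    return sorted(rates, key=lambda item: item[2])


if __name__ == "__main__":
    for bind, places, rate in failure_rates():
        print(f"{bind:>7} {places:<8} {100 * rate:5.2f} % not correct")
```

Sorting by rate makes it easy to see which binding settings stand out, alongside the run times reported in the table.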
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any ideas what I could do next to get more information 
>>>>>>>>>>>>>>> about the source of the problem? Or do you perhaps already see a potential 
>>>>>>>>>>>>>>> solution at this stage? I would appreciate any further help. 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick 
>>>>>>>>>>>>>>> Stein wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The tests 
>>>>>>>>>>>>>>>> do not run that efficiently with such a large number of threads; 2 should 
>>>>>>>>>>>>>>>> be sufficient.
>>>>>>>>>>>>>>>> The test results suggest that most of the functionality may 
>>>>>>>>>>>>>>>> work, but due to a missing backtrace (or similar information), it is hard to 
>>>>>>>>>>>>>>>> tell why they fail. You could also try to run some of the single-node tests 
>>>>>>>>>>>>>>>> to assess the stability of CP2K.
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> bartosz mazur wrote on Friday, 11 October 2024 at 
>>>>>>>>>>>>>>>> 13:48:42 UTC+2:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

To view this discussion visit https://groups.google.com/d/msgid/cp2k/5473442a-c035-4d51-833f-4c340767ee66n%40googlegroups.com.