[CP2K-user] [CP2K:20813] Re: compilation problems - LHS and RHS of an assignment statement have incompatible types

bartosz mazur bamaz.97 at gmail.com
Fri Oct 25 08:07:03 UTC 2024


I just got another error with LibXSMM, now in my regular simulation and 
without using OpenMP. This is the error:

```
[1729843139.920274] [r23c01b04:2913 :0]           ib_md.c:295  UCX  ERROR ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed: Cannot allocate memory
[1729843139.920290] [r23c01b04:2913 :0]          ucp_mm.c:70   UCX  ERROR failed to register address 0x14f0b46fc080 (host) length 7424 on md[4]=mlx5_0: Input/output error (md supports: host)
[1729843139.932647] [r23c01b04:2945 :0]           ib_md.c:295  UCX  ERROR ibv_reg_mr(address=0x1491f069e040, length=8128, access=0xf) failed: Cannot allocate memory
[1729843139.932660] [r23c01b04:2945 :0]          ucp_mm.c:70   UCX  ERROR failed to register address 0x1491f069e040 (host) length 8128 on md[4]=mlx5_0: Input/output error (md supports: host)

LIBXSMM_VERSION: develop-1.17-3834 (25693946)
CLX/DP      TRY    JIT    STA    COL
   0..13      4      4      0      0
  14..23      4      4      0      0
  24..64      0      0      0      0
Registry and code: 13 MB + 80 KB (gemm=8)
Command (PID=2913): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i cp2k.inp -o cp2k.out
Uptime: 407633.177169 s
```
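
For anyone triaging similar output, here is a minimal Python sketch (not part of CP2K or UCX) that groups the UCX error lines by process ID. The regular expression is an assumption inferred from the lines in this log, not a documented UCX format, and the sample lines are copied from the log above with the mail wrapping undone.

```python
import re
from collections import Counter

# Sample lines taken from the log above, one log record per line.
LOG = """\
[1729843139.920274] [r23c01b04:2913 :0]           ib_md.c:295  UCX  ERROR ibv_reg_mr(address=0x14f0b46fc080, length=7424, access=0xf) failed: Cannot allocate memory
[1729843139.920290] [r23c01b04:2913 :0]          ucp_mm.c:70   UCX  ERROR failed to register address 0x14f0b46fc080 (host) length 7424 on md[4]=mlx5_0: Input/output error (md supports: host)
[1729843139.932647] [r23c01b04:2945 :0]           ib_md.c:295  UCX  ERROR ibv_reg_mr(address=0x1491f069e040, length=8128, access=0xf) failed: Cannot allocate memory
"""

# Assumed layout: [timestamp] [host:pid :thread] source:line UCX LEVEL message
UCX_LINE = re.compile(
    r"\[(?P<time>\d+\.\d+)\]\s+\[(?P<host>[^:\]]+):(?P<pid>\d+)\s*:\d+\]\s+"
    r"(?P<src>\S+):(?P<line>\d+)\s+UCX\s+(?P<level>\w+)\s+(?P<msg>.*)"
)


def parse_ucx(text):
    """Return one dict per UCX record found in text."""
    records = []
    for line in text.splitlines():
        # search() rather than match(): output from another process may
        # precede the bracketed timestamp on the same line.
        m = UCX_LINE.search(line)
        if m:
            records.append(m.groupdict())
    return records


records = parse_ucx(LOG)
# Count failed memory registrations (ibv_reg_mr) per process ID.
failed_regs = Counter(r["pid"] for r in records if r["msg"].startswith("ibv_reg_mr"))
for pid, n in sorted(failed_regs.items()):
    print(f"PID {pid}: {n} failed ibv_reg_mr call(s)")
```

This only summarizes which ranks reported the failure; it says nothing about the cause.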

and this is the simulation input I'm using:

```
&GLOBAL
  PROJECT uam1o_npt_rms
  RUN_TYPE MD
  PRINT_LEVEL LOW
  PREFERRED_DIAG_LIBRARY SCALAPACK
&END GLOBAL

&FORCE_EVAL
  METHOD QUICKSTEP
  STRESS_TENSOR ANALYTICAL
  &DFT
    BASIS_SET_FILE_NAME BASIS_MOLOPT_UZH
    POTENTIAL_FILE_NAME POTENTIAL_UZH
    &MGRID
      CUTOFF 500
    &END MGRID
    &XC
      &XC_FUNCTIONAL PBE
      &END XC_FUNCTIONAL
      &VDW_POTENTIAL
        POTENTIAL_TYPE PAIR_POTENTIAL
        &PAIR_POTENTIAL
          TYPE  DFTD3(BJ)
          PARAMETER_FILE_NAME  dftd3.dat
          REFERENCE_FUNCTIONAL PBE
          R_CUTOFF  25.0
        &END PAIR_POTENTIAL
      &END VDW_POTENTIAL
    &END XC
  &END DFT

  &SUBSYS
    &CELL
      A      12.2807999       0.0000000       0.0000000
      B       7.6258602       9.6257200       0.0000000
      C      -2.1557724      -1.0420258      18.0042801
    &END CELL
    &COORD
      Zn      11.37811      4.60286      0.24515
      Zn       8.15435      3.05288      8.74518
      Zn       6.37590      3.97311     17.74650
      Zn       9.59842      5.54014      9.24747
      S       11.79344      6.72692     17.10850
      S        4.06825      3.00573      9.90358
      S        5.95830      1.84422      0.90027
      S       13.67407      5.58944      8.10767
      O       10.72408      3.58291      1.89315
      O        8.51986      4.01962      1.53085
      O        6.60135      3.91587      7.68572
      O        7.74637      5.79259      8.21600
      O       15.32810      8.58246      5.10041
      O        9.35608      2.93551      7.09500
      O       10.38999      4.93007      7.45977
      O       11.66491      6.35111      1.31266
      O        9.48582      6.62478      0.77364
      O        2.59062      2.40094      3.91496
      O        7.03031      4.99173     16.09885
      O        9.23544      4.56122     16.46252
      O       11.14602      4.67776     10.31440
      O       10.00982      2.79915      9.77218
      O        2.41388      0.01898     12.91899
      O        8.39375      5.66143     10.89628
      O        7.36998      3.66087     10.53589
      O        6.08863      2.22161     16.68336
      O        8.26988      1.95313     17.21650
      O       15.16937      6.16381     14.09906
      N       13.25907      3.80728      0.04001
      N        2.36335     -0.74130     17.33402
      N        7.60676      1.08576      8.95623
      N       15.77729      5.75974      9.67861
      N        4.49430      4.76652     17.95756
      N       15.38873      9.31230      0.67467
      N       10.14308      7.50848      9.04236
      N        1.96529      2.83557      8.33233
      C        6.76554      5.18292      7.68414
      C       14.28210      4.11624      0.86006
      C        9.47998      3.39622      2.09658
      C        3.20112      3.42080      0.84626
      C        9.91466      1.18589      3.17244
      C        9.08210      2.29987      3.02657
      C        5.74710      6.04945      7.01821
      C        7.83265      2.30920      3.66005
      C        3.35793      2.34328     -0.04029
      C        4.51663      1.46385     -0.02755
      C       16.24194      7.75266      5.73606
      C        4.78940      5.52817      6.14198
      C        7.40810      1.21174      4.39947
      C       16.18016      6.38244      5.49010
      C        9.48869      0.06986      3.88005
      C       11.27238      1.77457     17.14330
      C        5.77166      7.43009      7.27236
      C       11.14819      8.24901     17.58588
      C        8.22170      0.08058      4.47135
      C        0.15087      1.02286     17.07544
      C       17.16180      8.28565      6.64351
      C       10.57067      7.01060      1.31282
      C        6.72654      0.47459      8.14002
      C       10.27972      3.79035      6.89470
      C       14.15006      8.72843      8.15880
      C       11.73751      2.06868      5.82537
      C       11.38838      3.41515      5.96966
      C       10.52304      8.34339      1.98566
      C       12.16584      4.39562      5.33967
      C       14.89762      7.93801      9.04648
      C       14.86698      6.48365      9.03575
      C        2.67167      1.17044      3.27681
      C       11.52468      8.76552      2.86608
      C       13.29140      4.04007      4.60622
      C        3.78230      0.36534      3.52266
      C       12.87823      1.70260      5.12344
      C        8.27761      0.34001      9.85941
      C        9.42677      9.18364      1.73295
      C        3.27553      4.45658      9.42657
      C       13.66559      2.69775      4.53650
      C       15.77023      8.59069      9.93240
      C        1.68356      0.78491      2.36643
      C       10.98451      3.41041     10.31327
      C        3.46873      4.45681     17.14097
      C        8.27403      5.18373     15.89814
      C       14.54907      5.15099     17.15930
      C        7.83119      7.39584     14.82858
      C        8.66916      6.28563     14.97331
      C       11.99928      2.54577     10.98702
      C        9.92072      6.28547     14.34388
      C       16.54982      7.26986      0.04271
      C       15.39103      8.14919      0.03189
      C        1.50023      0.84646     12.27989
      C       12.95126      3.06908     11.86817
      C       10.34198      7.38826     13.61070
      C        1.55836      2.21699     12.52561
      C        8.25354      8.51697     14.12666
      C        6.48249      6.79770      0.85630
      C       11.97760      1.16465     10.73446
      C        6.60385      0.32218      0.42301
      C        9.52282      8.51550     13.54043
      C       17.60321      7.54791      0.92891
      C        0.58530      0.31102     11.36884
      C        7.18362      1.56332     16.68291
      C       11.01926      8.11905      9.86341
      C        7.47582      4.80132     11.10039
      C        3.59282     -0.13430      9.84955
      C        6.01179      6.51430     12.17471
      C        6.36853      5.17005     12.02942
      C        7.23131      0.22715     16.01652
      C        5.59963      4.18477     12.66234
      C        2.84614      0.65728      8.96213
      C        2.87561      2.11161      8.97508
      C       15.08536      7.39548     14.73440
      C        6.23001     -0.19920     15.13769
      C        4.47482      4.53325     13.40042
      C       13.97400      8.19851     14.48576
      C        4.87173      6.87322     12.88120
      C        9.47231      8.25578      8.14046
      C        8.32790     -0.61137     16.27301
      C       14.46698      4.13864      8.58475
      C        4.09294      5.87331     13.47165
      C        1.97640      0.00563      8.07267
      C       16.07240      7.78504     15.64417
      H       14.10215      4.93465      1.55678
      H        3.98110      3.68721      1.55899
      H       10.89072      1.19647      2.69205
      H        7.19958      3.19021      3.56839
      H        4.75923      4.45384      5.96230
      H        6.45299      1.21835      4.92062
      H       15.44211      6.00062      4.78824
      H       17.75043      8.81610      3.97156
      H       10.41563      1.57993     16.49923
      H        6.49332      7.81303      7.99143
      H        0.24800      0.19739     16.37425
      H        9.53586     -0.26872      6.84508
      H        6.19685      1.12218      7.44173
      H       13.45550      8.28133      7.44815
      H       11.11633      1.31384      6.30260
      H       11.87413      5.44074      5.42962
      H       12.38442      8.12016      3.04474
      H       13.88694      4.78876      4.08791
      H        4.53915      0.70283      4.22717
      H        0.88557      0.65625      5.03328
      H        8.96418      0.89159     10.50060
      H        8.67994      8.85961      1.01083
      H       16.35704      8.00331     10.63471
      H       13.12606      1.45212      2.16563
      H        3.64702      3.63930     16.44281
      H       13.76743      4.88477     16.44833
      H        6.85355      7.37827     15.30535
      H       10.55820      5.40745     14.43410
      H       12.97886      4.14375     12.04672
      H       11.29905      7.38966     13.09313
      H        2.29216      2.60091     13.23073
      H       -0.01303     -0.23279     14.03603
      H        7.34113      6.99275      1.49776
      H       11.26049      0.78023     10.01184
      H       17.50743      8.37258      1.63130
      H        8.21398      8.86531     11.16822
      H       11.54834      7.47018     10.56097
      H        4.28503      0.31205     10.56295
      H        6.62643      7.27289     11.69479
      H        5.89748      3.14154     12.57118
      H        5.36986      0.44461     14.95599
      H        3.88656      3.78035     13.92095
      H       13.21826      7.85764     13.78163
      H       16.85773      7.91771     12.97237
      H        8.78884      7.70469      7.49554
      H        9.07452     -0.28399     16.99402
      H        1.39009      0.59398      7.37083
      H        4.63062      7.11938     15.84758
    &END COORD
    &KIND Zn
      BASIS_SET TZVP-MOLOPT-PBE-GTH-q12
      POTENTIAL GTH-PBE-q12
    &END KIND
    &KIND S
      BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
      POTENTIAL GTH-PBE-q6
    &END KIND
    &KIND O
      BASIS_SET TZVP-MOLOPT-PBE-GTH-q6
      POTENTIAL GTH-PBE-q6
    &END KIND
    &KIND N
      BASIS_SET TZVP-MOLOPT-PBE-GTH-q5
      POTENTIAL GTH-PBE-q5
    &END KIND
    &KIND C
      BASIS_SET TZVP-MOLOPT-PBE-GTH-q4
      POTENTIAL GTH-PBE-q4
    &END KIND
    &KIND H
      BASIS_SET TZVP-MOLOPT-PBE-GTH-q1
      POTENTIAL GTH-PBE-q1
    &END KIND
  &END SUBSYS
&END FORCE_EVAL

&MOTION
  &MD
    ENSEMBLE NPT_I
    TEMPERATURE 298
    TIMESTEP 1.0
    STEPS 50000
    &THERMOSTAT
      TYPE NOSE
      &NOSE
        LENGTH 3
        YOSHIDA 3
        TIMECON 1000
      &END NOSE
    &END THERMOSTAT
    &BAROSTAT
      PRESSURE 1.0
      TIMECON 4000
    &END BAROSTAT
  &END MD
  &FREE_ENERGY
    METHOD METADYN
    &METADYN
      USE_PLUMED .TRUE.
      PLUMED_INPUT_FILE plumed.dat
    &END METADYN
  &END FREE_ENERGY
  &PRINT
    &TRAJECTORY
      &EACH
        MD 5
      &END EACH
    &END TRAJECTORY
    &FORCES
      UNIT eV*angstrom^-1
      &EACH
        MD 5
      &END EACH
    &END FORCES
    &CELL
      &EACH
        MD 5
      &END EACH
    &END CELL
  &END PRINT
&END MOTION
```
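
As a quick sanity check on the cell in the input above, the following Python sketch reads the three &CELL vectors and computes the cell volume as the triple product A . (B x C). The helper names (`read_cell`, `cell_volume`) are made up for this illustration and are not CP2K tools; the vectors are assumed to be in angstrom, the CP2K default when no unit is given.

```python
# The &CELL block copied from the input above.
CELL_BLOCK = """\
&CELL
  A      12.2807999       0.0000000       0.0000000
  B       7.6258602       9.6257200       0.0000000
  C      -2.1557724      -1.0420258      18.0042801
&END CELL
"""


def read_cell(text):
    """Extract the A, B and C lattice vectors from a &CELL block."""
    vectors = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] in ("A", "B", "C"):
            vectors[parts[0]] = tuple(float(x) for x in parts[1:])
    return vectors["A"], vectors["B"], vectors["C"]


def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cell_volume(a, b, c):
    """Volume of the parallelepiped spanned by a, b, c."""
    return abs(dot(a, cross(b, c)))


a, b, c = read_cell(CELL_BLOCK)
volume = cell_volume(a, b, c)
print(f"cell volume: {volume:.2f} A^3")
```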

This simulation was run with the previous version of CP2K (so without 
your fix).

On Friday, 25 October 2024 at 09:50:47 UTC+2, bartosz mazur wrote:

> Hi Frederick, 
>
> it helped with most of the tests! Now only 13 fail. In the 
> attachments you will find the full output from the regtests, and here is 
> the output from a single job with TRACE enabled:
>
> ```
> Loading intel/2024a
>   Loading requirement: GCCcore/13.3.0 zlib/1.3.1-GCCcore-13.3.0
>     binutils/2.42-GCCcore-13.3.0 intel-compilers/2024.2.0
>     numactl/2.0.18-GCCcore-13.3.0 UCX/1.16.0-GCCcore-13.3.0
>     impi/2021.13.0-intel-compilers-2024.2.0 imkl/2024.2.0 iimpi/2024a
>     imkl-FFTW/2024.2.0-iimpi-2024a
>
> Currently Loaded Modulefiles:
>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>  6) UCX/1.16.0-GCCcore-13.3.0
> 2 MPI processes with 2 OpenMP threads each
> started at Fri Oct 25 09:34:34 CEST 2024 in /lustre/tmp/slurm/3127182
> SIRIUS 7.6.1, git hash: https://api.github.com/repos/electronic-structure/SIRIUS/git/ref/tags/v7.6.1
> Warning! Compiled in 'debug' mode with assert statements enabled!
>
>
> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
> CLX/DP      TRY    JIT    STA    COL
>    0..13      8      8      0      0 
>   14..23      0      0      0      0 
>   24..64      0      0      0      0 
> Registry and code: 13 MB + 64 KB (gemm=8)
> Command (PID=423503): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i dftd3src1.inp -o dftd3src1.out
> Uptime: 2.752513 s
>
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   RANK 0 PID 423503 RUNNING AT r21c01b03
>
> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>
> ===================================================================================
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   RANK 1 PID 423504 RUNNING AT r21c01b03
>
> =   KILLED BY SIGNAL: 9 (Killed)
>
> ===================================================================================
> finished at Fri Oct 25 09:34:39 CEST 2024
> ```
>
> and the last lines:
>
> ```
>  000000:000002<<                                  13      3 mp_sendrecv_dm2   0.000 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002>>                                  13      4 mp_sendrecv_dm2   start Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                                  13      4 mp_sendrecv_dm2   0.000 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                               12      2 pw_nn_compose_r   0.003 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                            11      1 xc_pw_derive   0.003 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002>>                            11      5 pw_zero   start Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                            11      5 pw_zero   0.000 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002>>                            11      2 xc_pw_derive   start Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002>>                               12      3 pw_nn_compose_r   start Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002>>                                  13      5 mp_sendrecv_dm2   start Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                                  13      5 mp_sendrecv_dm2   0.000 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002>>                                  13      6 mp_sendrecv_dm2   start Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                                  13      6 mp_sendrecv_dm2   0.000 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                               12      3 pw_nn_compose_r   0.002 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                            11      2 xc_pw_derive   0.002 Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002>>                            11      6 pw_zero   start Hostmem: 955 MB GPUmem: 0 MB
>  000000:000002<<                            11      6 pw_zero   0.001 Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002>>                            11      3 xc_pw_derive   start Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002>>                               12      4 pw_nn_compose_r   start Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002>>                                  13      7 mp_sendrecv_dm2   start Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002<<                                  13      7 mp_sendrecv_dm2   0.000 Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002>>                                  13      8 mp_sendrecv_dm2   start Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002<<                                  13      8 mp_sendrecv_dm2   0.000 Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002<<                               12      4 pw_nn_compose_r   0.002 Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002<<                            11      3 xc_pw_derive   0.002 Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002>>                            11      1 pw_spline_scale_deriv   start Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002<<                            11      1 pw_spline_scale_deriv   0.001 Hostmem: 960 MB GPUmem: 0 MB
>  000000:000002>>                            11     20 pw_pool_give_back_pw   start Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002<<                            11     20 pw_pool_give_back_pw   0.000 Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002>>                            11     21 pw_pool_give_back_pw   start Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002<<                            11     21 pw_pool_give_back_pw   0.000 Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002>>                            11     22 pw_pool_give_back_pw   start Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002<<                            11     22 pw_pool_give_back_pw   0.000 Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002>>                            11     23 pw_pool_give_back_pw   start Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002<<                            11     23 pw_pool_give_back_pw   0.000 Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002>>                            11      1 xc_functional_eval   start Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002>>                               12      1 b97_lda_eval   start Hostmem: 965 MB GPUmem: 0 MB
>  000000:000002<<                               12      1 b97_lda_eval   0.103 Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002<<                            11      1 xc_functional_eval   0.103 Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002<<                         10      1 xc_rho_set_and_dset_create   0.120 Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002>>                         10      1 check_for_derivatives   start Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002<<                         10      1 check_for_derivatives   0.000 Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002>>                         10     14 pw_create_r3d   start Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002<<                         10     14 pw_create_r3d   0.000 Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002>>                         10     15 pw_create_r3d   start Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002<<                         10     15 pw_create_r3d   0.000 Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002>>                         10     16 pw_create_r3d   start Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002<<                         10     16 pw_create_r3d   0.000 Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002>>                         10     17 pw_create_r3d   start Hostmem: 979 MB GPUmem: 0 MB
>  000000:000002<<                         10     17 pw_create_r3d   0.000 Hostmem: 979 MB GPUmem: 0 MB
> ```
>
> Best
> Bartosz
>
> On Wednesday, 23 October 2024 at 09:15:33 UTC+2, Frederick Stein wrote:
>
>> Dear Bartosz,
>> My fix is merged. Can you switch to the CP2K master and try again? We 
>> are still working on a few issues with the Intel compilers so that we can 
>> eventually migrate from ifort to ifx.
>> Best,
>> Frederick
>>
>> On Tuesday, 22 October 2024 at 17:45:21 UTC+2, bartosz mazur wrote:
>>
>>> Great! Thank you for your help. 
>>>
>>> Best
>>> Bartosz
>>>
>>> On Tuesday, 22 October 2024 at 15:24:04 UTC+2, Frederick Stein wrote:
>>>
>>>> I have a fix for it. In contrast to my first thought, it is a case of 
>>>> invalid type conversion from real to complex numbers (yes, Fortran is 
>>>> rather strict about it) in pw_derive. This may also be present in a few 
>>>> other spots. I am currently running more tests and I will open a pull 
>>>> request within the next few days.
>>>> Best,
>>>> Frederick
>>>>
>>>> On Tuesday, 22 October 2024 at 13:12:49 UTC+2, Frederick Stein wrote:
>>>>
>>>>> I can reproduce the error locally. I am investigating it now.
>>>>>
>>>>> On Tuesday, 22 October 2024 at 11:58:57 UTC+2, bartosz mazur wrote:
>>>>>
>>>>>> I was loading it as it was needed for compilation. I have unloaded 
>>>>>> the module, but the error still occurs: 
>>>>>>
>>>>>> ```
>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>    0..13      2      2      0      0 
>>>>>>   14..23      0      0      0      0 
>>>>>>   24..64      0      0      0      0 
>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>> Command (PID=15485): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i H2O-9.inp -o H2O-9.out
>>>>>> Uptime: 1.757102 s
>>>>>>
>>>>>>
>>>>>>
>>>>>> ===================================================================================
>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>> =   RANK 0 PID 15485 RUNNING AT r30c01b01
>>>>>>
>>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>
>>>>>> ===================================================================================
>>>>>>
>>>>>>
>>>>>> ===================================================================================
>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>> =   RANK 1 PID 15486 RUNNING AT r30c01b01
>>>>>>
>>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>>
>>>>>> ===================================================================================
>>>>>> ```
>>>>>>
>>>>>>
>>>>>> and the last 100 lines:
>>>>>>
>>>>>> ```
>>>>>>  000000:000002>>                            11     37 pw_create_c1d   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11     37 pw_create_c1d   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                         10     64 pw_pool_create_pw   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                         10     25 pw_copy   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                         10     25 pw_copy   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                         10     17 pw_axpy   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                         10     17 pw_axpy   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                         10     19 mp_sum_d   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                         10     19 mp_sum_d   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                         10      3 pw_poisson_solve   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11      3 pw_poisson_rebuild   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11      3 pw_poisson_rebuild   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11     65 pw_pool_create_pw   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     38 pw_create_c1d   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     38 pw_create_c1d   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11     65 pw_pool_create_pw   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11     26 pw_copy   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11     26 pw_copy   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11      3 pw_multiply_with   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11      3 pw_multiply_with   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11     27 pw_copy   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11     27 pw_copy   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11      3 pw_integral_ab   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     20 mp_sum_d   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     20 mp_sum_d   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                            11      3 pw_integral_ab   0.004 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                            11      4 pw_poisson_set   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     66 pw_pool_create_pw   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     39 pw_create_c1d   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13     39 pw_create_c1d   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     66 pw_pool_create_pw   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     28 pw_copy   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     28 pw_copy   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12      7 pw_derive   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12      7 pw_derive   0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     67 pw_pool_create_pw   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     40 pw_create_c1d   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13     40 pw_create_c1d   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     67 pw_pool_create_pw   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     29 pw_copy   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     29 pw_copy   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12      8 pw_derive   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12      8 pw_derive   0.002 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     68 pw_pool_create_pw   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                                  13     41 pw_create_c1d   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                                  13     41 pw_create_c1d   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     68 pw_pool_create_pw   0.000 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12     30 pw_copy   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002<<                               12     30 pw_copy   0.001 Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  000000:000002>>                               12      9 pw_derive   start Hostmem: 697 MB GPUmem: 0 MB
>>>>>>  ```
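
In the trace excerpt above, the last entry is a `>>` (enter) line for pw_derive with no matching `<<` (exit) line, which is consistent with the crash happening inside that routine. A minimal Python sketch of how the still-open calls can be read off a trace tail is shown here. It is not a CP2K tool: the field layout (rank:count, direction, depth, call number, routine name, then `start` or a time) is an assumption inferred from the lines in this thread, each entry is assumed to sit on a single line, and the sample is an abridged copy of the last entries.

```python
import re

# Abridged copy of the end of the trace above, one entry per line.
TRACE = """\
 000000:000002>>                            11      4 pw_poisson_set   start Hostmem: 697 MB GPUmem: 0 MB
 000000:000002>>                               12     68 pw_pool_create_pw   start Hostmem: 697 MB GPUmem: 0 MB
 000000:000002>>                                  13     41 pw_create_c1d   start Hostmem: 697 MB GPUmem: 0 MB
 000000:000002<<                                  13     41 pw_create_c1d   0.000 Hostmem: 697 MB GPUmem: 0 MB
 000000:000002<<                               12     68 pw_pool_create_pw   0.000 Hostmem: 697 MB GPUmem: 0 MB
 000000:000002>>                               12     30 pw_copy   start Hostmem: 697 MB GPUmem: 0 MB
 000000:000002<<                               12     30 pw_copy   0.001 Hostmem: 697 MB GPUmem: 0 MB
 000000:000002>>                               12      9 pw_derive   start Hostmem: 697 MB GPUmem: 0 MB
"""

# Assumed layout: rank:count, '>>' or '<<', depth, call number, routine,
# then 'start' (on entry) or an elapsed time (on exit).
ENTRY = re.compile(
    r"^\s*(\d+):(\d+)(>>|<<)\s+(\d+)\s+(\d+)\s+(\S+)\s+(start|[\d.]+)"
)


def open_calls(text):
    """Return (depth, routine, call number) for calls entered but not exited."""
    stack = []
    for line in text.splitlines():
        m = ENTRY.match(line)
        if not m:
            continue
        direction = m.group(3)
        depth, count, name = int(m.group(4)), int(m.group(5)), m.group(6)
        if direction == ">>":
            stack.append((depth, name, count))
        elif stack and stack[-1][:2] == (depth, name):
            stack.pop()
        # Exits whose entry lies before the excerpt are ignored.
    return stack


for depth, name, count in open_calls(TRACE):
    print(f"depth {depth}: {name} (call {count}) entered but not exited")
```

Because only the tail of the trace is available, callers that were entered before the excerpt starts do not appear in the result.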
>>>>>>
>>>>>> This is the list of currently loaded modules (all come with intel):
>>>>>>
>>>>>> ```
>>>>>> Currently Loaded Modulefiles:
>>>>>>  1) GCCcore/13.3.0                  7) impi/2021.13.0-intel-compilers-2024.2.0
>>>>>>  2) zlib/1.3.1-GCCcore-13.3.0       8) imkl/2024.2.0
>>>>>>  3) binutils/2.42-GCCcore-13.3.0    9) iimpi/2024a
>>>>>>  4) intel-compilers/2024.2.0       10) imkl-FFTW/2024.2.0-iimpi-2024a
>>>>>>  5) numactl/2.0.18-GCCcore-13.3.0  11) intel/2024a
>>>>>>  6) UCX/1.16.0-GCCcore-13.3.0
>>>>>> ```
>>>>>> On Tuesday, 22 October 2024 at 11:12:57 UTC+2, Frederick Stein wrote:
>>>>>>
>>>>>>> Dear Bartosz,
>>>>>>> I am currently running some tests with the latest Intel compiler 
>>>>>>> myself. What bothers me about your setup is the module GCC13/13.3.0 . Why 
>>>>>>> is it loaded? Can you unload it? This would at least reduce potential 
>>>>>>> interference between the Intel and the GCC compilers.
>>>>>>> Best,
>>>>>>> Frederick
>>>>>>>
>>>>>>> On Monday, 21 October 2024 at 16:33:45 UTC+2, bartosz mazur wrote:
>>>>>>>
>>>>>>>> The error for ssmp is:
>>>>>>>>
>>>>>>>> ```
>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>    0..13      4      4      0      0 
>>>>>>>>   14..23      0      0      0      0 
>>>>>>>>   24..64      0      0      0      0 
>>>>>>>> Registry and code: 13 MB + 32 KB (gemm=4)
>>>>>>>> Command (PID=54845): /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>>>>> Uptime: 2.861583 s
>>>>>>>> /var/spool/slurmd/r30c01b15/job3120330/slurm_script: line 36: 54845 Segmentation fault      (core dumped) /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.ssmp -i H2O-9.inp -o H2O-9.out
>>>>>>>> ```
>>>>>>>>
>>>>>>>> and the last 100 lines of output:
>>>>>>>>
>>>>>>>> ```
>>>>>>>>  000000:000001>>                               12     20 mp_sum_d   start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12     20 mp_sum_d   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                            11     13 dbcsr_dot_sd   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                         10     12 calculate_ptrace_kp   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                       9      6 evaluate_core_matrix_traces   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                       9      6 rebuild_ks_matrix   start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                         10      6 qs_ks_build_kohn_sham_matrix   start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                            11    140 pw_pool_create_pw   start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12     79 pw_create_c1d   start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12     79 pw_create_c1d   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                            11    140 pw_pool_create_pw   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                            11    141 pw_pool_create_pw   start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12     80 pw_create_c1d   start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12     80 pw_create_c1d   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                            11    141 pw_pool_create_pw   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                            11     61 pw_copy       
>>>>>>>> start Hostme
>>>>>>>>  m: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                            11     61 pw_copy       
>>>>>>>> 0.004 Hostme
>>>>>>>>  m: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                            11     35 pw_axpy       
>>>>>>>> start Hostme
>>>>>>>>  m: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                            11     35 pw_axpy       
>>>>>>>> 0.002 Hostme
>>>>>>>>  m: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                            11      6 
>>>>>>>> pw_poisson_solve       sta
>>>>>>>>  rt Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12      6 
>>>>>>>> pw_poisson_rebuild     
>>>>>>>>    start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12      6 
>>>>>>>> pw_poisson_rebuild     
>>>>>>>>    0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12    142 
>>>>>>>> pw_pool_create_pw      
>>>>>>>>   start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                  13     81 
>>>>>>>> pw_create_c1d       
>>>>>>>>  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                                  13     81 
>>>>>>>> pw_create_c1d       
>>>>>>>>  0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12    142 
>>>>>>>> pw_pool_create_pw      
>>>>>>>>   0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12     62 pw_copy   
>>>>>>>>     start Hos
>>>>>>>>  tmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12     62 pw_copy   
>>>>>>>>     0.003 Hos
>>>>>>>>  tmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12      6 
>>>>>>>> pw_multiply_with       
>>>>>>>>  start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12      6 
>>>>>>>> pw_multiply_with       
>>>>>>>>  0.002 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12     63 pw_copy   
>>>>>>>>     start Hos
>>>>>>>>  tmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12     63 pw_copy   
>>>>>>>>     0.003 Hos
>>>>>>>>  tmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12      6 
>>>>>>>> pw_integral_ab       st
>>>>>>>>  art Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                               12      6 
>>>>>>>> pw_integral_ab       0.
>>>>>>>>  005 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                               12      7 
>>>>>>>> pw_poisson_set       st
>>>>>>>>  art Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                  13    143 
>>>>>>>> pw_pool_create_pw   
>>>>>>>>      start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                     14     82 
>>>>>>>> pw_create_c1d    
>>>>>>>>     start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                                     14     82 
>>>>>>>> pw_create_c1d    
>>>>>>>>     0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                                  13    143 
>>>>>>>> pw_pool_create_pw   
>>>>>>>>      0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                  13     64 pw_copy 
>>>>>>>>       start 
>>>>>>>>  Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                                  13     64 pw_copy 
>>>>>>>>       0.003 
>>>>>>>>  Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                  13     16 
>>>>>>>> pw_derive       star
>>>>>>>>  t Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                                  13     16 
>>>>>>>> pw_derive       0.00
>>>>>>>>  6 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                  13    144 
>>>>>>>> pw_pool_create_pw   
>>>>>>>>      start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                     14     83 
>>>>>>>> pw_create_c1d    
>>>>>>>>     start Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                                     14     83 
>>>>>>>> pw_create_c1d    
>>>>>>>>     0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                                  13    144 
>>>>>>>> pw_pool_create_pw   
>>>>>>>>      0.000 Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                  13     65 pw_copy 
>>>>>>>>       start 
>>>>>>>>  Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001<<                                  13     65 pw_copy 
>>>>>>>>       0.004 
>>>>>>>>  Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>>  000000:000001>>                                  13     17 
>>>>>>>> pw_derive       star
>>>>>>>>  t Hostmem: 380 MB GPUmem: 0 MB
>>>>>>>> ```
>>>>>>>>
>>>>>>>> For psmp, the last 100 lines are:
>>>>>>>>
>>>>>>>> ```
>>>>>>>>  000000:000002<<                       9      7 evaluate_core_matrix_traces       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                       9      7 rebuild_ks_matrix       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                         10      7 qs_ks_build_kohn_sham_matrix       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                            11    164 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12     93 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                               12     93 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                            11    164 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                            11    165 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12     94 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                               12     94 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                            11    165 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                            11     73 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                            11     73 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                            11     41 pw_axpy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                            11     41 pw_axpy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                            11     52 mp_sum_d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                            11     52 mp_sum_d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                            11      7 pw_poisson_solve       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12      7 pw_poisson_rebuild       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                               12      7 pw_poisson_rebuild       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12    166 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     95 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13     95 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                               12    166 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12     74 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                               12     74 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12      7 pw_multiply_with       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                               12      7 pw_multiply_with       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12     75 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                               12     75 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12      7 pw_integral_ab       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     53 mp_sum_d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13     53 mp_sum_d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                               12      7 pw_integral_ab       0.003 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                               12      8 pw_poisson_set       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13    167 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                     14     96 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                     14     96 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13    167 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     76 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13     76 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     19 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13     19 pw_derive       0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                     14     97 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                     14     97 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     77 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002<<                                  13     77 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>  000000:000002>>                                  13     20 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>> ```
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Bartosz
>>>>>>>>
>>>>>>>> On Monday, 21 October 2024 at 08:58:34 UTC+2, Frederick Stein wrote:
>>>>>>>>
>>>>>>>>> Dear Bartosz,
>>>>>>>>> I have no idea about the issue with LibXSMM.
>>>>>>>>> Regarding the trace, I do not know either, as there is not much 
>>>>>>>>> that could break in pw_derive (it just performs multiplications) and the 
>>>>>>>>> sequence of operations is too unspecific. It may be that the code actually 
>>>>>>>>> breaks somewhere else. Can you do the same with the ssmp and post the last 
>>>>>>>>> 100 lines? This way, we remove the asynchronicity issues for backtraces 
>>>>>>>>> with the psmp version.
>>>>>>>>> Best,
>>>>>>>>> Frederick
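
One way to read such a trace tail is to track which routines were entered (`>>`) but never left (`<<`) before the crash. The following is only a minimal sketch: it assumes each trace entry sits on a single unwrapped line of the form `rank:nranks>> depth count routine start ...`, and the helper name `open_routines` is hypothetical, not part of CP2K.

```python
import re

# Assumed layout of one unwrapped trace entry:
#   000000:000001>>   13   17 pw_derive   start Hostmem: 380 MB GPUmem: 0 MB
TRACE_LINE = re.compile(r"^\s*\d+:\d+(>>|<<)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\S+)")


def open_routines(trace_text):
    """Return (depth, routine) pairs that were entered but not left."""
    stack = []
    for line in trace_text.splitlines():
        match = TRACE_LINE.match(line)
        if not match:
            continue  # not a trace entry (or a wrapped fragment)
        direction, depth, _count, name, _status = match.groups()
        entry = (int(depth), name)
        if direction == ">>":
            stack.append(entry)
        elif stack and stack[-1] == entry:
            stack.pop()
        # Exits of routines entered before the excerpt starts are ignored.
    return stack


# Last entries of the ssmp trace, re-joined onto single lines.
SAMPLE = """\
 000000:000001>>  12   7 pw_poisson_set  start Hostmem: 380 MB GPUmem: 0 MB
 000000:000001>>  13  64 pw_copy  start Hostmem: 380 MB GPUmem: 0 MB
 000000:000001<<  13  64 pw_copy  0.003 Hostmem: 380 MB GPUmem: 0 MB
 000000:000001>>  13  16 pw_derive  start Hostmem: 380 MB GPUmem: 0 MB
 000000:000001<<  13  16 pw_derive  0.006 Hostmem: 380 MB GPUmem: 0 MB
 000000:000001>>  13  17 pw_derive  start Hostmem: 380 MB GPUmem: 0 MB
"""

still_open = open_routines(SAMPLE)
for depth, name in still_open:
    print(depth, name)
```

The routines left on the stack only show where the trace stopped; as noted above, the actual fault may still lie elsewhere.
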
>>>>>>>>>
>>>>>>>>> On Sunday, 20 October 2024 at 16:47:15 UTC+2, bartosz mazur wrote:
>>>>>>>>>
>>>>>>>>>> The error is:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> LIBXSMM_VERSION: develop-1.17-3834 (25693946)
>>>>>>>>>> CLX/DP      TRY    JIT    STA    COL
>>>>>>>>>>    0..13      2      2      0      0
>>>>>>>>>>   14..23      0      0      0      0
>>>>>>>>>>   24..64      0      0      0      0
>>>>>>>>>> Registry and code: 13 MB + 16 KB (gemm=2)
>>>>>>>>>> Command (PID=2607388): 
>>>>>>>>>> /lustre/pd01/hpc-kuchta-1716987452/software/cp2k/exe/local/cp2k.psmp -i 
>>>>>>>>>> H2O-9.inp -o H2O-9.out
>>>>>>>>>> Uptime: 5.288243 s
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>> =   RANK 0 PID 2607388 RUNNING AT r21c01b10
>>>>>>>>>> =   KILLED BY SIGNAL: 11 (Segmentation fault)
>>>>>>>>>> ===================================================================================
>>>>>>>>>>
>>>>>>>>>> ===================================================================================
>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>> =   RANK 1 PID 2607389 RUNNING AT r21c01b10
>>>>>>>>>> =   KILLED BY SIGNAL: 9 (Killed)
>>>>>>>>>> ===================================================================================
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> and the last 20 lines:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>>  000000:000002<<                                  13     76 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                                  13     19 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                                  13     19 pw_derive       0.002 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                                  13    168 pw_pool_create_pw       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                                     14     97 pw_create_c1d       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                                     14     97 pw_create_c1d       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                                  13    168 pw_pool_create_pw       0.000 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                                  13     77 pw_copy       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002<<                                  13     77 pw_copy       0.001 Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>>  000000:000002>>                                  13     20 pw_derive       start Hostmem: 693 MB GPUmem: 0 MB
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>> On Friday, 18 October 2024 at 17:18:39 UTC+2, Frederick Stein wrote:
>>>>>>>>>>
>>>>>>>>>>> Please pick one of the failing tests. Then, add the TRACE 
>>>>>>>>>>> keyword to the &GLOBAL section and then run the test manually. This 
>>>>>>>>>>> increases the size of the output file dramatically (to some million lines). 
>>>>>>>>>>> Can you send me the last ~20 lines of the output?
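
The two steps described here (adding `TRACE` to `&GLOBAL`, then keeping only the tail of the output) could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names `add_trace_keyword` and `last_lines` are hypothetical, the sample input is a placeholder, and the code only edits text; it does not run CP2K.

```python
def add_trace_keyword(input_text):
    """Insert a TRACE line after the opening &GLOBAL line if it is missing."""
    result = []
    in_global = False
    global_start = None
    has_trace = False
    for line in input_text.splitlines():
        tokens = line.split()
        first = tokens[0].upper() if tokens else ""
        if first == "&GLOBAL":
            in_global = True
            global_start = len(result)
        elif first == "&END" and in_global:
            in_global = False
        elif in_global and first == "TRACE":
            has_trace = True  # keyword already present, leave input alone
        result.append(line)
    if global_start is not None and not has_trace:
        result.insert(global_start + 1, "  TRACE")
    return "\n".join(result) + "\n"


def last_lines(text, n=20):
    """Return the last n lines of a (possibly very long) output text."""
    return text.splitlines()[-n:]


# Placeholder input; a real test input has many more sections.
SAMPLE_INPUT = """\
&GLOBAL
  PROJECT H2O-9
  RUN_TYPE MD
&END GLOBAL
"""

patched = add_trace_keyword(SAMPLE_INPUT)
print(patched)
```

Since the traced output can reach millions of lines, reading only its tail (for example with `last_lines`) keeps the excerpt small enough to post.
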
>>>>>>>>>>> On Friday, 18 October 2024 at 17:09:40 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm using the do_regtests.py script, not make regtesting, but I 
>>>>>>>>>>>> assume it makes no difference. As I mentioned in my previous message, with 
>>>>>>>>>>>> `--ompthreads 1` all tests passed for both ssmp and psmp. For ssmp 
>>>>>>>>>>>> with `--ompthreads 2` I observe errors similar to those for psmp with the 
>>>>>>>>>>>> same setting; I provide example output as an attachment.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>
>>>>>>>>>>>> On Friday, 18 October 2024 at 16:24:16 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>> What happens if you set the number of OpenMP threads to 1 (add 
>>>>>>>>>>>>> '--ompthreads 1' to TESTOPTS)? What errors do you observe in case of the 
>>>>>>>>>>>>> ssmp?
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Friday, 18 October 2024 at 15:37:43 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Frederick,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks again for the help. I have tested different simulation 
>>>>>>>>>>>>>> variants and I know that the problem occurs when using OMP. For MPI 
>>>>>>>>>>>>>> calculations without OMP all tests pass. I have also tested the effect of 
>>>>>>>>>>>>>> the `OMP_PROC_BIND` and `OMP_PLACES` parameters and, apart 
>>>>>>>>>>>>>> from the effect on simulation time, they have no significant effect on the 
>>>>>>>>>>>>>> presence of errors. Below are the results for ssmp:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
>>>>>>>>>>>>>> spread, threads, 3850, 4144, 4, 290, 186min
>>>>>>>>>>>>>> spread, cores, 3831, 4144, 3, 310, 183min
>>>>>>>>>>>>>> spread, sockets, 3864, 4144, 3, 277, 104min
>>>>>>>>>>>>>> close, threads, 3879, 4144, 3, 262, 171min
>>>>>>>>>>>>>> close, cores, 3854, 4144, 0, 290, 168min
>>>>>>>>>>>>>> close, sockets, 3865, 4144, 3, 276, 104min
>>>>>>>>>>>>>> master, threads, 4121, 4144, 0, 23, 1002min
>>>>>>>>>>>>>> master, cores, 4121, 4144, 0, 23, 986min
>>>>>>>>>>>>>> master, sockets, 3942, 4144, 3, 199, 219min
>>>>>>>>>>>>>> false, threads, 3918, 4144, 0, 226, 178min
>>>>>>>>>>>>>> false, cores, 3919, 4144, 3, 222, 176min
>>>>>>>>>>>>>> false, sockets, 3856, 4144, 4, 284, 104min
>>>>>>>>>>>>>> ```
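
To compare the binding settings more directly, the ssmp table above can be turned into failure rates. The snippet below is a minimal sketch that simply re-reads the numbers posted above; the function name `failure_rates` is hypothetical.

```python
import csv
import io

# Rows copied from the ssmp table above (header joined onto one line).
SSMP_RESULTS = """\
OMP_PROC_BIND, OMP_PLACES, correct, total, wrong, failed, time
spread, threads, 3850, 4144, 4, 290, 186min
spread, cores, 3831, 4144, 3, 310, 183min
spread, sockets, 3864, 4144, 3, 277, 104min
close, threads, 3879, 4144, 3, 262, 171min
close, cores, 3854, 4144, 0, 290, 168min
close, sockets, 3865, 4144, 3, 276, 104min
master, threads, 4121, 4144, 0, 23, 1002min
master, cores, 4121, 4144, 0, 23, 986min
master, sockets, 3942, 4144, 3, 199, 219min
false, threads, 3918, 4144, 0, 226, 178min
false, cores, 3919, 4144, 3, 222, 176min
false, sockets, 3856, 4144, 4, 284, 104min
"""


def failure_rates(table_text):
    """Map (OMP_PROC_BIND, OMP_PLACES) to the fraction of failed tests."""
    reader = csv.DictReader(io.StringIO(table_text), skipinitialspace=True)
    rates = {}
    for row in reader:
        key = (row["OMP_PROC_BIND"], row["OMP_PLACES"])
        rates[key] = int(row["failed"]) / int(row["total"])
    return rates


rates = failure_rates(SSMP_RESULTS)
# Print settings from the lowest to the highest failure rate.
for (bind, places), rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{bind:8s} {places:8s} {rate:6.1%}")
```

Sorting by rate makes it easier to see which combinations stand out; it says nothing about why the remaining tests fail.
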
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and psmp:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> OMP_PROC_BIND, OMP_PLACES, results
>>>>>>>>>>>>>> spread, threads, Summary: correct: 4097 / 4227; failed: 130; 495min
>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>> spread, cores, 26 / 362
>>>>>>>>>>>>>> close, threads, Summary: correct: 4133 / 4227; failed: 94; 484min
>>>>>>>>>>>>>> close, cores, 60 / 362
>>>>>>>>>>>>>> close, sockets, 13 / 362
>>>>>>>>>>>>>> master, threads, 13 / 362
>>>>>>>>>>>>>> master, cores, 79 / 362
>>>>>>>>>>>>>> master, sockets, Summary: correct: 4153 / 4227; failed: 74; 563min
>>>>>>>>>>>>>> false, threads, Summary: correct: 4153 / 4227; failed: 74; 556min
>>>>>>>>>>>>>> false, cores, Summary: correct: 4106 / 4227; failed: 121; 511min
>>>>>>>>>>>>>> false, sockets, 96 / 362
>>>>>>>>>>>>>> not specified, not specified, Summary: correct: 4129 / 4227; failed: 98; 263min
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any ideas what I could do next to get more information about 
>>>>>>>>>>>>>> the source of the problem? Or maybe you already see a potential solution 
>>>>>>>>>>>>>> at this stage? I would appreciate any further help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>> Bartosz
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Friday, 11 October 2024 at 14:30:25 UTC+2, Frederick Stein wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear Bartosz,
>>>>>>>>>>>>>>> If I am not mistaken, you used 8 OpenMP threads. The tests do 
>>>>>>>>>>>>>>> not run that efficiently with such a large number of threads; 2 should be 
>>>>>>>>>>>>>>> sufficient.
>>>>>>>>>>>>>>> The test results suggest that most of the functionality may 
>>>>>>>>>>>>>>> work, but due to a missing backtrace (or similar information), it is hard to 
>>>>>>>>>>>>>>> tell why they fail. You could also try to run some of the single-node tests 
>>>>>>>>>>>>>>> to assess the stability of CP2K.
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Frederick
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Friday, 11 October 2024 at 13:48:42 UTC+2, bartosz mazur wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sorry, forgot attachments.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

-- 
To view this discussion visit https://groups.google.com/d/msgid/cp2k/7042b62f-62de-43ad-ad94-b940977c9e2an%40googlegroups.com.