<div dir="ltr">Hello everyone,<div><br></div><div>I seem to have stumbled on something. I noticed odd scaling (or rather, a lack of scaling) when running AIMD simulations with DFT-D3, and I've tracked the problem to the calculation of the C9 term. With CALCULATE_C9_TERM set to T, the simulation doesn't scale at all and takes far longer; with it set to F, I do get linear scaling. Below are the timing reports for the exact same INPUT file, run on the exact same number of processors; the only difference is that CALCULATE_C9_TERM is set to T in the first case and F in the second. The two reports differ strikingly: with the C9 term, calculate_dispersion_pairpot and mp_sum_d each account for about 685 s of the roughly 773 s total, whereas without it each takes under 3 s of roughly 98 s.</div><div><br></div><div>Timing with C9 calculation:</div><div><br></div><div><div><font face="courier new, monospace"> -------------------------------------------------------------------------------</font></div><div><font face="courier new, monospace"> -                                       -</font></div><div><font face="courier new, monospace"> -                 T I M I N G                  -</font></div><div><font face="courier new, monospace"> -                                       -</font></div><div><font face="courier new, monospace"> -------------------------------------------------------------------------------</font></div><div><font face="courier new, monospace"> SUBROUTINE            CALLS  ASD     SELF TIME     TOTAL TIME</font></div><div><font face="courier new, monospace">                MAXIMUM    AVERAGE  MAXIMUM  AVERAGE  MAXIMUM</font></div><div><font face="courier new, monospace"> CP2K                 1  1.0   0.008   0.019  772.815  772.821</font></div><div><font face="courier new, monospace"> qs_mol_dyn_low            1  2.0   0.002   0.003  772.503  772.517</font></div><div><font face="courier new, monospace"> qs_forces    
          11  3.9   0.002   0.002  771.252  771.258</font></div><div><font face="courier new, monospace"> qs_energies             11  4.9   0.001   0.001  765.300  765.307</font></div><div><font face="courier new, monospace"> velocity_verlet           10  3.0   0.002   0.002  686.306  686.323</font></div><div><font face="courier new, monospace"> qs_energies_init_hamiltonians    11  5.9   0.000   0.000  686.298  686.303</font></div><div><font face="courier new, monospace"> mp_sum_d              4362 11.6  619.789  685.783  619.789  685.783</font></div><div><font face="courier new, monospace"> calculate_dispersion_pairpot     11  6.9  65.911  685.372  685.486  685.491</font></div><div><font face="courier new, monospace"> scf_env_do_scf            11  5.9   0.000   0.001  76.654  76.655</font></div><div><font face="courier new, monospace"> scf_env_do_scf_inner_loop      139  6.7   0.004   0.030  71.107  71.468</font></div><div><font face="courier new, monospace"> rebuild_ks_matrix          150  8.5   0.000   0.001  52.308  52.341</font></div><div><font face="courier new, monospace"> qs_ks_build_kohn_sham_matrix    150  9.5   0.017   0.021  52.307  52.340</font></div><div><font face="courier new, monospace"> qs_ks_update_qs_env         151  7.7   0.001   0.001  47.661  47.693</font></div><div><font face="courier new, monospace"> pw_transfer            4061 12.2   0.239   0.267  35.026  35.154</font></div><div><font face="courier new, monospace"> fft_wrap_pw1pw2          3761 13.2   0.028   0.031  34.447  34.575</font></div><div><font face="courier new, monospace"> fft_wrap_pw1pw2_350        1661 14.3   2.499   2.594  32.640  32.789</font></div><div><font face="courier new, monospace"> qs_vxc_create            150 10.5   0.003   0.004  30.531  30.553</font></div><div><font face="courier new, monospace"> xc_vxc_pw_create          150 11.5   1.027   1.065  30.528  30.551</font></div><div><font face="courier new, monospace"> fft3d_ps              3761 15.2  12.507  12.766  
28.236  28.478</font></div><div><font face="courier new, monospace"> qs_rho_update_rho          150  7.8   0.001   0.001  18.349  18.356</font></div><div><font face="courier new, monospace"> calculate_rho_elec         150  8.8   3.941   4.247  18.348  18.355</font></div><div><font face="courier new, monospace"> xc_rho_set_and_dset_create     150 12.5   0.188   0.193  16.380  16.402</font></div><div><font face="courier new, monospace"> sum_up_and_integrate        150 10.5   0.079   0.082  15.635  15.650</font></div><div><font face="courier new, monospace"> integrate_v_rspace         150 11.5   4.076   4.275  15.556  15.572</font></div><div><font face="courier new, monospace"> -------------------------------------------------------------------------------</font></div></div><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">Timing without C9 calculation:</font></div><div><font face="courier new, monospace"><br></font></div><div><font face="courier new, monospace"><div> -------------------------------------------------------------------------------</div><div> -                                       -</div><div> -                 T I M I N G                  -</div><div> -                                       -</div><div> -------------------------------------------------------------------------------</div><div> SUBROUTINE            CALLS  ASD     SELF TIME     TOTAL TIME</div><div>                MAXIMUM    AVERAGE  MAXIMUM  AVERAGE  MAXIMUM</div><div> CP2K                 1  1.0   0.010   0.023  98.369  98.371</div><div> qs_mol_dyn_low            1  2.0   0.002   0.003  98.009  98.019</div><div> qs_forces              11  3.9   0.002   0.002  97.695  97.697</div><div> qs_energies             11  4.9   0.001   0.001  91.392  91.395</div><div> scf_env_do_scf            11  5.9   0.000   0.001  84.901  84.903</div><div> scf_env_do_scf_inner_loop      139  6.7   0.004   0.028  78.845  79.075</div><div> velocity_verlet           10  3.0   
0.002   0.002  67.867  67.871</div><div> rebuild_ks_matrix          150  8.5   0.000   0.001  55.485  55.544</div><div> qs_ks_build_kohn_sham_matrix    150  9.5   0.017   0.020  55.484  55.543</div><div> qs_ks_update_qs_env         151  7.7   0.001   0.001  50.557  50.612</div><div> pw_transfer            3461 12.2   0.206   0.259  37.722  37.838</div><div> fft_wrap_pw1pw2          3161 13.2   0.027   0.033  37.180  37.297</div><div> fft_wrap_pw1pw2_350        1661 14.3   2.439   2.559  34.568  34.695</div><div> qs_vxc_create            150 10.5   0.003   0.004  31.774  31.801</div><div> xc_vxc_pw_create          150 11.5   1.015   1.048  31.771  31.798</div><div> fft3d_ps              3161 15.2  12.404  12.688  31.074  31.311</div><div> qs_rho_update_rho          150  7.8   0.001   0.001  20.139  20.150</div><div> calculate_rho_elec         150  8.8   3.909   4.230  20.138  20.149</div><div> sum_up_and_integrate        150 10.5   0.077   0.081  17.172  17.191</div><div> integrate_v_rspace         150 11.5   4.070   4.271  17.094  17.111</div><div> xc_rho_set_and_dset_create     150 12.5   0.185   0.190  16.948  16.973</div><div> rs_pw_transfer           1822 12.1   0.020   0.024  15.129  15.437</div><div> mp_alltoall_z22v          3161 17.2  15.000  15.294  15.000  15.294</div><div> density_rs2pw            150  9.8   0.007   0.009  14.718  14.977</div><div> qs_scf_new_mos           139  7.7   0.001   0.001  14.120  14.173</div><div> qs_scf_loop_do_ot          139  8.7   0.001   0.001  14.120  14.173</div><div> dbcsr_multiply_generic       2770 12.6   0.073   0.082  13.810  13.940</div><div> ot_scf_mini             139  9.7   0.005   0.006  13.225  13.279</div><div> potential_pw2rs           150 12.5   0.010   0.013  12.324  12.362</div><div> x_to_yz              1050 16.5   1.546   1.618   9.447   9.550</div><div> multiply_cannon          2770 13.6   0.293   0.307   8.197   8.790</div><div> yz_to_x               911 16.0   1.073   1.104   8.140   8.355</div><div> 
mp_waitall_1           427928 15.4   7.704   8.344   7.704   8.344</div><div> ot_mini               139 10.7   0.001   0.001   8.062   8.132</div><div> qs_ot_get_derivative        139 11.7   0.002   0.002   6.714   6.768</div><div> xc_functional_eval         150 13.5   0.002   0.004   6.023   6.043</div><div> pbe_lda_eval            150 14.5   6.021   6.041   6.021   6.041</div><div> init_scf_loop            12  6.8   0.000   0.000   5.780   5.780</div><div> qs_ks_update_qs_env_forces      11  4.9   0.000   0.000   4.981   4.985</div><div> rs_pw_transfer_PW2RS_350      161 14.2   2.271   2.444   4.256   4.435</div><div> mp_waitany            18660 14.0   3.701   4.433   3.701   4.433</div><div> make_m2s              5540 13.6   0.087   0.097   4.247   4.358</div><div> rs_pw_transfer_RS2PW_350      161 11.6   2.137   2.399   3.706   3.947</div><div> make_images            5540 14.6   0.322   0.341   3.720   3.811</div><div> qs_ot_get_derivative_taylor     110 12.9   0.003   0.004   3.646   3.691</div><div> qs_energies_init_hamiltonians    11  5.9   0.000   0.000   3.259   3.259</div><div> multiply_cannon_multrec      33240 14.6   2.695   3.049   2.705   3.060</div><div> qs_ot_get_p             151 10.5   0.001   0.001   2.871   2.945</div><div> init_scf_run             11  5.9   0.000   0.001   2.925   2.925</div><div> scf_env_initial_rho_setup      11  6.9   0.000   0.000   2.925   2.925</div><div><div> multiply_cannon_metrocomm1    33240 14.6   0.049   0.053   2.285   2.914</div><div> multiply_cannon_metrocomm3    33240 14.6   0.040   0.045   0.735   2.682</div><div> mp_sum_d              4362 11.6   2.357   2.664   2.357   2.664</div><div> wfi_extrapolate           11  7.9   0.000   0.001   2.650   2.650</div><div> calculate_dispersion_pairpot     11  6.9   0.219   2.076   2.327   2.327</div><div> make_images_sizes         5540 15.6   0.006   0.007   1.394   1.992</div><div> mp_alltoall_i44          5540 16.6   1.388   1.986   1.388   1.986</div><div> 
-------------------------------------------------------------------------------</div></div><div><br></div></font></div><div><font face="courier new, monospace"><br></font></div><div><br></div></div>
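<div><br></div><div>As a quick sanity check on the two reports, here is a minimal Python sketch that computes how much of the total wall time goes to the dispersion call and to mp_sum_d in each run. The numbers are copied by hand from the MAXIMUM TOTAL TIME column above; the dictionary and function names are only for illustration and are not part of CP2K.</div>

```python
# Seconds from the MAXIMUM TOTAL TIME column, copied by hand from the two
# timing reports in this post. The "CP2K" row is the whole run.
with_c9 = {
    "CP2K": 772.821,
    "calculate_dispersion_pairpot": 685.491,
    "mp_sum_d": 685.783,
}
without_c9 = {
    "CP2K": 98.371,
    "calculate_dispersion_pairpot": 2.327,
    "mp_sum_d": 2.664,
}


def share(report, routine):
    """Fraction of the run's total wall time attributed to `routine`."""
    return report[routine] / report["CP2K"]


def slowdown(routine):
    """How many times longer `routine` takes with the C9 term than without."""
    return with_c9[routine] / without_c9[routine]


if __name__ == "__main__":
    for label, report in (("with C9", with_c9), ("without C9", without_c9)):
        for routine in ("calculate_dispersion_pairpot", "mp_sum_d"):
            print(f"{label:11s} {routine:30s} {share(report, routine):6.1%} of total")
    for routine in ("CP2K", "calculate_dispersion_pairpot", "mp_sum_d"):
        print(f"slowdown {routine:30s} x{slowdown(routine):.1f}")
```

<div>Note that the timers are nested, so the shares of calculate_dispersion_pairpot and mp_sum_d overlap rather than add up; the sketch only illustrates that nearly all of the extra time sits in those two entries.</div>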