DFT-D3 Timing

Mike Ruggiero miker... at gmail.com
Fri Nov 18 21:22:36 CET 2016


Hello everyone,

I seem to have stumbled on something. I noticed some odd scaling (or
rather, a lack of scaling) when performing AIMD simulations with DFT-D3.
I've tracked the problem to the calculation of the C9 term: when it is
enabled, the simulation doesn't scale at all and takes a long time, whereas
with it disabled I do get linear scaling. Below are timing reports for the
exact same INPUT file, run on the exact same number of processors; the only
difference is that CALCULATE_C9_TERM is set to T in the first case and F in
the second. It is striking how different they are: roughly 773 s of total
run time with the C9 term versus roughly 98 s without it.

Timing with C9 calculation:

 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.008    0.019  772.815  772.821
 qs_mol_dyn_low                       1  2.0    0.002    0.003  772.503  772.517
 qs_forces                           11  3.9    0.002    0.002  771.252  771.258
 qs_energies                         11  4.9    0.001    0.001  765.300  765.307
 velocity_verlet                     10  3.0    0.002    0.002  686.306  686.323
 qs_energies_init_hamiltonians       11  5.9    0.000    0.000  686.298  686.303
 mp_sum_d                          4362 11.6  619.789  685.783  619.789  685.783
 calculate_dispersion_pairpot        11  6.9   65.911  685.372  685.486  685.491
 scf_env_do_scf                      11  5.9    0.000    0.001   76.654   76.655
 scf_env_do_scf_inner_loop          139  6.7    0.004    0.030   71.107   71.468
 rebuild_ks_matrix                  150  8.5    0.000    0.001   52.308   52.341
 qs_ks_build_kohn_sham_matrix       150  9.5    0.017    0.021   52.307   52.340
 qs_ks_update_qs_env                151  7.7    0.001    0.001   47.661   47.693
 pw_transfer                       4061 12.2    0.239    0.267   35.026   35.154
 fft_wrap_pw1pw2                   3761 13.2    0.028    0.031   34.447   34.575
 fft_wrap_pw1pw2_350               1661 14.3    2.499    2.594   32.640   32.789
 qs_vxc_create                      150 10.5    0.003    0.004   30.531   30.553
 xc_vxc_pw_create                   150 11.5    1.027    1.065   30.528   30.551
 fft3d_ps                          3761 15.2   12.507   12.766   28.236   28.478
 qs_rho_update_rho                  150  7.8    0.001    0.001   18.349   18.356
 calculate_rho_elec                 150  8.8    3.941    4.247   18.348   18.355
 xc_rho_set_and_dset_create         150 12.5    0.188    0.193   16.380   16.402
 sum_up_and_integrate               150 10.5    0.079    0.082   15.635   15.650
 integrate_v_rspace                 150 11.5    4.076    4.275   15.556   15.572
 -------------------------------------------------------------------------------

Timing without C9 calculation:

 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.010    0.023   98.369   98.371
 qs_mol_dyn_low                       1  2.0    0.002    0.003   98.009   98.019
 qs_forces                           11  3.9    0.002    0.002   97.695   97.697
 qs_energies                         11  4.9    0.001    0.001   91.392   91.395
 scf_env_do_scf                      11  5.9    0.000    0.001   84.901   84.903
 scf_env_do_scf_inner_loop          139  6.7    0.004    0.028   78.845   79.075
 velocity_verlet                     10  3.0    0.002    0.002   67.867   67.871
 rebuild_ks_matrix                  150  8.5    0.000    0.001   55.485   55.544
 qs_ks_build_kohn_sham_matrix       150  9.5    0.017    0.020   55.484   55.543
 qs_ks_update_qs_env                151  7.7    0.001    0.001   50.557   50.612
 pw_transfer                       3461 12.2    0.206    0.259   37.722   37.838
 fft_wrap_pw1pw2                   3161 13.2    0.027    0.033   37.180   37.297
 fft_wrap_pw1pw2_350               1661 14.3    2.439    2.559   34.568   34.695
 qs_vxc_create                      150 10.5    0.003    0.004   31.774   31.801
 xc_vxc_pw_create                   150 11.5    1.015    1.048   31.771   31.798
 fft3d_ps                          3161 15.2   12.404   12.688   31.074   31.311
 qs_rho_update_rho                  150  7.8    0.001    0.001   20.139   20.150
 calculate_rho_elec                 150  8.8    3.909    4.230   20.138   20.149
 sum_up_and_integrate               150 10.5    0.077    0.081   17.172   17.191
 integrate_v_rspace                 150 11.5    4.070    4.271   17.094   17.111
 xc_rho_set_and_dset_create         150 12.5    0.185    0.190   16.948   16.973
 rs_pw_transfer                    1822 12.1    0.020    0.024   15.129   15.437
 mp_alltoall_z22v                  3161 17.2   15.000   15.294   15.000   15.294
 density_rs2pw                      150  9.8    0.007    0.009   14.718   14.977
 qs_scf_new_mos                     139  7.7    0.001    0.001   14.120   14.173
 qs_scf_loop_do_ot                  139  8.7    0.001    0.001   14.120   14.173
 dbcsr_multiply_generic            2770 12.6    0.073    0.082   13.810   13.940
 ot_scf_mini                        139  9.7    0.005    0.006   13.225   13.279
 potential_pw2rs                    150 12.5    0.010    0.013   12.324   12.362
 x_to_yz                           1050 16.5    1.546    1.618    9.447    9.550
 multiply_cannon                   2770 13.6    0.293    0.307    8.197    8.790
 yz_to_x                            911 16.0    1.073    1.104    8.140    8.355
 mp_waitall_1                    427928 15.4    7.704    8.344    7.704    8.344
 ot_mini                            139 10.7    0.001    0.001    8.062    8.132
 qs_ot_get_derivative               139 11.7    0.002    0.002    6.714    6.768
 xc_functional_eval                 150 13.5    0.002    0.004    6.023    6.043
 pbe_lda_eval                       150 14.5    6.021    6.041    6.021    6.041
 init_scf_loop                       12  6.8    0.000    0.000    5.780    5.780
 qs_ks_update_qs_env_forces          11  4.9    0.000    0.000    4.981    4.985
 rs_pw_transfer_PW2RS_350           161 14.2    2.271    2.444    4.256    4.435
 mp_waitany                       18660 14.0    3.701    4.433    3.701    4.433
 make_m2s                          5540 13.6    0.087    0.097    4.247    4.358
 rs_pw_transfer_RS2PW_350           161 11.6    2.137    2.399    3.706    3.947
 make_images                       5540 14.6    0.322    0.341    3.720    3.811
 qs_ot_get_derivative_taylor        110 12.9    0.003    0.004    3.646    3.691
 qs_energies_init_hamiltonians       11  5.9    0.000    0.000    3.259    3.259
 multiply_cannon_multrec          33240 14.6    2.695    3.049    2.705    3.060
 qs_ot_get_p                        151 10.5    0.001    0.001    2.871    2.945
 init_scf_run                        11  5.9    0.000    0.001    2.925    2.925
 scf_env_initial_rho_setup           11  6.9    0.000    0.000    2.925    2.925
 multiply_cannon_metrocomm1       33240 14.6    0.049    0.053    2.285    2.914
 multiply_cannon_metrocomm3       33240 14.6    0.040    0.045    0.735    2.682
 mp_sum_d                          4362 11.6    2.357    2.664    2.357    2.664
 wfi_extrapolate                     11  7.9    0.000    0.001    2.650    2.650
 calculate_dispersion_pairpot        11  6.9    0.219    2.076    2.327    2.327
 make_images_sizes                 5540 15.6    0.006    0.007    1.394    1.992
 mp_alltoall_i44                   5540 16.6    1.388    1.986    1.388    1.986
 -------------------------------------------------------------------------------
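
To put numbers on the difference between the two reports, here is a small
Python sketch that works out how much of each run is spent in
calculate_dispersion_pairpot and what is left for everything else. The
figures are copied by hand from the timing reports above (seconds); the
dictionary keys and function names are made up for this illustration and
nothing is parsed from actual CP2K output.

```python
# Figures copied by hand from the two timing reports (seconds).
# Key names are illustrative only; they are not CP2K identifiers.
REPORTS = {
    "C9 on": {
        "cp2k_total_max": 772.821,        # CP2K, TOTAL TIME MAXIMUM
        "dispersion_total_max": 685.491,  # calculate_dispersion_pairpot, TOTAL TIME MAXIMUM
        "dispersion_self_avg": 65.911,    # calculate_dispersion_pairpot, SELF TIME AVERAGE
        "dispersion_self_max": 685.372,   # calculate_dispersion_pairpot, SELF TIME MAXIMUM
    },
    "C9 off": {
        "cp2k_total_max": 98.371,
        "dispersion_total_max": 2.327,
        "dispersion_self_avg": 0.219,
        "dispersion_self_max": 2.076,
    },
}


def dispersion_share(report):
    """Fraction of the total run time spent in the dispersion routine."""
    return report["dispersion_total_max"] / report["cp2k_total_max"]


def remaining_time(report):
    """Run time left once the dispersion routine is subtracted."""
    return report["cp2k_total_max"] - report["dispersion_total_max"]


def self_time_max_over_avg(report):
    """Ratio of maximum to average self time across ranks."""
    return report["dispersion_self_max"] / report["dispersion_self_avg"]


for label, report in REPORTS.items():
    print(
        f"{label}: dispersion share = {dispersion_share(report):.1%}, "
        f"everything else = {remaining_time(report):.1f} s, "
        f"self time max/avg = {self_time_max_over_avg(report):.1f}"
    )

# Overall slowdown of the run when the C9 term is switched on.
slowdown = REPORTS["C9 on"]["cp2k_total_max"] / REPORTS["C9 off"]["cp2k_total_max"]
print(f"total run time ratio (C9 on / C9 off) = {slowdown:.2f}")
```

With these figures the dispersion routine accounts for close to 89% of the
run with the C9 term and a little over 2% without it, while the remaining
time is similar in both runs (about 87 s versus about 96 s). The maximum
self time of the routine is roughly ten times the average in both reports;
the reports alone do not say why.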


