DFT-D3 Timing
Mike Ruggiero
miker... at gmail.com
Fri Nov 18 20:22:36 UTC 2016
Hello everyone,
I seem to have stumbled on something. I noticed some odd scaling (or
rather, a lack of scaling) when performing AIMD simulations with DFT-D3,
and I have tracked the problem to the calculation of the C9 term. With
CALCULATE_C9_TERM set to T, the simulation doesn't scale at all and takes
a long time; with it set to F, I do get linear scaling. Below are timing
reports for the exact same INPUT file on the exact same number of
processors; the only difference is that CALCULATE_C9_TERM is set to T in
the first case and F in the second. The two reports differ strikingly:
with the C9 term the run takes about 773 s in total, of which roughly
685 s is spent in calculate_dispersion_pairpot (most of it apparently in
mp_sum_d), whereas without it the run takes about 98 s and
calculate_dispersion_pairpot accounts for only about 2 s.
Timing with the C9 calculation (CALCULATE_C9_TERM T):
--------------------------------------------------------------------------------
-                                                                              -
-                                 T I M I N G                                  -
-                                                                              -
--------------------------------------------------------------------------------
SUBROUTINE                       CALLS   ASD         SELF TIME        TOTAL TIME
                               MAXIMUM        AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1   1.0    0.008    0.019  772.815  772.821
qs_mol_dyn_low                       1   2.0    0.002    0.003  772.503  772.517
qs_forces                           11   3.9    0.002    0.002  771.252  771.258
qs_energies                         11   4.9    0.001    0.001  765.300  765.307
velocity_verlet                     10   3.0    0.002    0.002  686.306  686.323
qs_energies_init_hamiltonians       11   5.9    0.000    0.000  686.298  686.303
mp_sum_d                          4362  11.6  619.789  685.783  619.789  685.783
calculate_dispersion_pairpot        11   6.9   65.911  685.372  685.486  685.491
scf_env_do_scf                      11   5.9    0.000    0.001   76.654   76.655
scf_env_do_scf_inner_loop          139   6.7    0.004    0.030   71.107   71.468
rebuild_ks_matrix                  150   8.5    0.000    0.001   52.308   52.341
qs_ks_build_kohn_sham_matrix       150   9.5    0.017    0.021   52.307   52.340
qs_ks_update_qs_env                151   7.7    0.001    0.001   47.661   47.693
pw_transfer                       4061  12.2    0.239    0.267   35.026   35.154
fft_wrap_pw1pw2                   3761  13.2    0.028    0.031   34.447   34.575
fft_wrap_pw1pw2_350               1661  14.3    2.499    2.594   32.640   32.789
qs_vxc_create                      150  10.5    0.003    0.004   30.531   30.553
xc_vxc_pw_create                   150  11.5    1.027    1.065   30.528   30.551
fft3d_ps                          3761  15.2   12.507   12.766   28.236   28.478
qs_rho_update_rho                  150   7.8    0.001    0.001   18.349   18.356
calculate_rho_elec                 150   8.8    3.941    4.247   18.348   18.355
xc_rho_set_and_dset_create         150  12.5    0.188    0.193   16.380   16.402
sum_up_and_integrate               150  10.5    0.079    0.082   15.635   15.650
integrate_v_rspace                 150  11.5    4.076    4.275   15.556   15.572
--------------------------------------------------------------------------------
Timing without the C9 calculation (CALCULATE_C9_TERM F):
--------------------------------------------------------------------------------
-                                                                              -
-                                 T I M I N G                                  -
-                                                                              -
--------------------------------------------------------------------------------
SUBROUTINE                       CALLS   ASD         SELF TIME        TOTAL TIME
                               MAXIMUM        AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1   1.0    0.010    0.023   98.369   98.371
qs_mol_dyn_low                       1   2.0    0.002    0.003   98.009   98.019
qs_forces                           11   3.9    0.002    0.002   97.695   97.697
qs_energies                         11   4.9    0.001    0.001   91.392   91.395
scf_env_do_scf                      11   5.9    0.000    0.001   84.901   84.903
scf_env_do_scf_inner_loop          139   6.7    0.004    0.028   78.845   79.075
velocity_verlet                     10   3.0    0.002    0.002   67.867   67.871
rebuild_ks_matrix                  150   8.5    0.000    0.001   55.485   55.544
qs_ks_build_kohn_sham_matrix       150   9.5    0.017    0.020   55.484   55.543
qs_ks_update_qs_env                151   7.7    0.001    0.001   50.557   50.612
pw_transfer                       3461  12.2    0.206    0.259   37.722   37.838
fft_wrap_pw1pw2                   3161  13.2    0.027    0.033   37.180   37.297
fft_wrap_pw1pw2_350               1661  14.3    2.439    2.559   34.568   34.695
qs_vxc_create                      150  10.5    0.003    0.004   31.774   31.801
xc_vxc_pw_create                   150  11.5    1.015    1.048   31.771   31.798
fft3d_ps                          3161  15.2   12.404   12.688   31.074   31.311
qs_rho_update_rho                  150   7.8    0.001    0.001   20.139   20.150
calculate_rho_elec                 150   8.8    3.909    4.230   20.138   20.149
sum_up_and_integrate               150  10.5    0.077    0.081   17.172   17.191
integrate_v_rspace                 150  11.5    4.070    4.271   17.094   17.111
xc_rho_set_and_dset_create         150  12.5    0.185    0.190   16.948   16.973
rs_pw_transfer                    1822  12.1    0.020    0.024   15.129   15.437
mp_alltoall_z22v                  3161  17.2   15.000   15.294   15.000   15.294
density_rs2pw                      150   9.8    0.007    0.009   14.718   14.977
qs_scf_new_mos                     139   7.7    0.001    0.001   14.120   14.173
qs_scf_loop_do_ot                  139   8.7    0.001    0.001   14.120   14.173
dbcsr_multiply_generic            2770  12.6    0.073    0.082   13.810   13.940
ot_scf_mini                        139   9.7    0.005    0.006   13.225   13.279
potential_pw2rs                    150  12.5    0.010    0.013   12.324   12.362
x_to_yz                           1050  16.5    1.546    1.618    9.447    9.550
multiply_cannon                   2770  13.6    0.293    0.307    8.197    8.790
yz_to_x                            911  16.0    1.073    1.104    8.140    8.355
mp_waitall_1                    427928  15.4    7.704    8.344    7.704    8.344
ot_mini                            139  10.7    0.001    0.001    8.062    8.132
qs_ot_get_derivative               139  11.7    0.002    0.002    6.714    6.768
xc_functional_eval                 150  13.5    0.002    0.004    6.023    6.043
pbe_lda_eval                       150  14.5    6.021    6.041    6.021    6.041
init_scf_loop                       12   6.8    0.000    0.000    5.780    5.780
qs_ks_update_qs_env_forces          11   4.9    0.000    0.000    4.981    4.985
rs_pw_transfer_PW2RS_350           161  14.2    2.271    2.444    4.256    4.435
mp_waitany                       18660  14.0    3.701    4.433    3.701    4.433
make_m2s                          5540  13.6    0.087    0.097    4.247    4.358
rs_pw_transfer_RS2PW_350           161  11.6    2.137    2.399    3.706    3.947
make_images                       5540  14.6    0.322    0.341    3.720    3.811
qs_ot_get_derivative_taylor        110  12.9    0.003    0.004    3.646    3.691
qs_energies_init_hamiltonians       11   5.9    0.000    0.000    3.259    3.259
multiply_cannon_multrec          33240  14.6    2.695    3.049    2.705    3.060
qs_ot_get_p                        151  10.5    0.001    0.001    2.871    2.945
init_scf_run                        11   5.9    0.000    0.001    2.925    2.925
scf_env_initial_rho_setup           11   6.9    0.000    0.000    2.925    2.925
multiply_cannon_metrocomm1       33240  14.6    0.049    0.053    2.285    2.914
multiply_cannon_metrocomm3       33240  14.6    0.040    0.045    0.735    2.682
mp_sum_d                          4362  11.6    2.357    2.664    2.357    2.664
wfi_extrapolate                     11   7.9    0.000    0.001    2.650    2.650
calculate_dispersion_pairpot        11   6.9    0.219    2.076    2.327    2.327
make_images_sizes                 5540  15.6    0.006    0.007    1.394    1.992
mp_alltoall_i44                   5540  16.6    1.388    1.986    1.388    1.986
--------------------------------------------------------------------------------
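
To make the comparison between the two reports easier to read, here is a
small Python sketch that puts a few rows side by side. It is only an
illustration: the numbers are copied by hand from the TOTAL TIME (MAXIMUM)
column of the two reports in this post, and the names WITH_C9, WITHOUT_C9,
compare and dispersion_share are made up for this example (they are not
part of CP2K).

```python
# Wall times in seconds, copied by hand from the TOTAL TIME (MAXIMUM)
# column of the two timing reports in this post.
WITH_C9 = {
    "CP2K": 772.821,
    "calculate_dispersion_pairpot": 685.491,
    "mp_sum_d": 685.783,
    "scf_env_do_scf": 76.655,
}
WITHOUT_C9 = {
    "CP2K": 98.371,
    "calculate_dispersion_pairpot": 2.327,
    "mp_sum_d": 2.664,
    "scf_env_do_scf": 84.903,
}


def compare(with_c9, without_c9):
    """Return (routine, t_with, t_without, difference, ratio) per routine."""
    rows = []
    for name, t_with in with_c9.items():
        t_without = without_c9[name]
        rows.append(
            (name, t_with, t_without, t_with - t_without, t_with / t_without)
        )
    return rows


def dispersion_share(times):
    """Fraction of the total run time spent in the dispersion routine."""
    return times["calculate_dispersion_pairpot"] / times["CP2K"]


if __name__ == "__main__":
    header = f"{'routine':<30}{'with C9':>10}{'without':>10}{'diff':>10}{'ratio':>9}"
    print(header)
    for name, t_with, t_without, diff, ratio in compare(WITH_C9, WITHOUT_C9):
        print(f"{name:<30}{t_with:>10.3f}{t_without:>10.3f}{diff:>10.3f}{ratio:>9.1f}")
    print(f"dispersion share with C9:    {dispersion_share(WITH_C9):.1%}")
    print(f"dispersion share without C9: {dispersion_share(WITHOUT_C9):.1%}")
```

With these numbers the dispersion routine accounts for roughly 89% of the
run with the C9 term and about 2% of the run without it, while the SCF
timings are of similar size in both runs.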