[CP2K:8418] DFT-D3 Timing
hut... at chem.uzh.ch
Sat Nov 19 14:21:35 UTC 2016
Hi
Although the implementation of the C9 term is linear scaling,
it has a very large prefactor, especially for the force term.
That is why we implemented the option to keep the
coordination numbers for the C9 term fixed:
REFERENCE_C9_TERM = .TRUE.
This alone reduces the computational time by one or more orders of magnitude.
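
For readers unfamiliar with where this keyword lives, the fragment below is a minimal sketch of a DFT-D3 pair-potential section with the C9 term enabled and its coordination numbers held at their reference values. The functional (PBE) and the parameter file name (dftd3.dat) are assumptions for illustration and are not taken from the input file discussed in this thread; the section is expected to sit inside &FORCE_EVAL / &DFT / &XC.

```
&VDW_POTENTIAL
  POTENTIAL_TYPE PAIR_POTENTIAL
  &PAIR_POTENTIAL
    TYPE DFTD3
    PARAMETER_FILE_NAME dftd3.dat    ! assumed file name
    REFERENCE_FUNCTIONAL PBE         ! assumed functional
    CALCULATE_C9_TERM .TRUE.         ! include the three-body C9 term
    REFERENCE_C9_TERM .TRUE.         ! keep the C9 coordination numbers fixed
  &END PAIR_POTENTIAL
&END VDW_POTENTIAL
```

Setting CALCULATE_C9_TERM to .FALSE. instead corresponds to the second timing report quoted below.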
I don't know why your calculation has such poor load balancing
(that is the reason for your poor scaling).
I haven't looked into this for a long time, and without more information
on your system it is impossible to guess.
regards
Juerg
--------------------------------------------------------------
Juerg Hutter                         Phone : ++41 44 635 4491
Institut für Chemie C                FAX   : ++41 44 635 6838
Universität Zürich                   E-mail: hut... at chem.uzh.ch
Winterthurerstrasse 190
CH-8057 Zürich, Switzerland
---------------------------------------------------------------
-----cp... at googlegroups.com wrote: -----
To: cp2k <cp... at googlegroups.com>
From: Mike Ruggiero
Sent by: cp... at googlegroups.com
Date: 11/18/2016 09:22PM
Subject: [CP2K:8418] DFT-D3 Timing
Hello everyone,
I seem to have stumbled on something. I noticed odd scaling (or rather, a lack of scaling) when running AIMD simulations with DFT-D3, and I have tracked the problem to the calculation of the C9 term. With the C9 term enabled, the simulation does not scale at all and takes a long time; with it disabled, I get linear scaling. Below are timing reports for the exact same input file on the exact same number of processors. The only difference is that CALCULATE_C9_TERM is set to T in the first case and F in the second. It is striking how different they are.
Timing with C9 calculation:
 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.008    0.019  772.815  772.821
 qs_mol_dyn_low                       1  2.0    0.002    0.003  772.503  772.517
 qs_forces                           11  3.9    0.002    0.002  771.252  771.258
 qs_energies                         11  4.9    0.001    0.001  765.300  765.307
 velocity_verlet                     10  3.0    0.002    0.002  686.306  686.323
 qs_energies_init_hamiltonians       11  5.9    0.000    0.000  686.298  686.303
 mp_sum_d                          4362 11.6  619.789  685.783  619.789  685.783
 calculate_dispersion_pairpot        11  6.9   65.911  685.372  685.486  685.491
 scf_env_do_scf                      11  5.9    0.000    0.001   76.654   76.655
 scf_env_do_scf_inner_loop          139  6.7    0.004    0.030   71.107   71.468
 rebuild_ks_matrix                  150  8.5    0.000    0.001   52.308   52.341
 qs_ks_build_kohn_sham_matrix       150  9.5    0.017    0.021   52.307   52.340
 qs_ks_update_qs_env                151  7.7    0.001    0.001   47.661   47.693
 pw_transfer                       4061 12.2    0.239    0.267   35.026   35.154
 fft_wrap_pw1pw2                   3761 13.2    0.028    0.031   34.447   34.575
 fft_wrap_pw1pw2_350               1661 14.3    2.499    2.594   32.640   32.789
 qs_vxc_create                      150 10.5    0.003    0.004   30.531   30.553
 xc_vxc_pw_create                   150 11.5    1.027    1.065   30.528   30.551
 fft3d_ps                          3761 15.2   12.507   12.766   28.236   28.478
 qs_rho_update_rho                  150  7.8    0.001    0.001   18.349   18.356
 calculate_rho_elec                 150  8.8    3.941    4.247   18.348   18.355
 xc_rho_set_and_dset_create         150 12.5    0.188    0.193   16.380   16.402
 sum_up_and_integrate               150 10.5    0.079    0.082   15.635   15.650
 integrate_v_rspace                 150 11.5    4.076    4.275   15.556   15.572
 -------------------------------------------------------------------------------
Timing without C9 calculation:
 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.010    0.023   98.369   98.371
 qs_mol_dyn_low                       1  2.0    0.002    0.003   98.009   98.019
 qs_forces                           11  3.9    0.002    0.002   97.695   97.697
 qs_energies                         11  4.9    0.001    0.001   91.392   91.395
 scf_env_do_scf                      11  5.9    0.000    0.001   84.901   84.903
 scf_env_do_scf_inner_loop          139  6.7    0.004    0.028   78.845   79.075
 velocity_verlet                     10  3.0    0.002    0.002   67.867   67.871
 rebuild_ks_matrix                  150  8.5    0.000    0.001   55.485   55.544
 qs_ks_build_kohn_sham_matrix       150  9.5    0.017    0.020   55.484   55.543
 qs_ks_update_qs_env                151  7.7    0.001    0.001   50.557   50.612
 pw_transfer                       3461 12.2    0.206    0.259   37.722   37.838
 fft_wrap_pw1pw2                   3161 13.2    0.027    0.033   37.180   37.297
 fft_wrap_pw1pw2_350               1661 14.3    2.439    2.559   34.568   34.695
 qs_vxc_create                      150 10.5    0.003    0.004   31.774   31.801
 xc_vxc_pw_create                   150 11.5    1.015    1.048   31.771   31.798
 fft3d_ps                          3161 15.2   12.404   12.688   31.074   31.311
 qs_rho_update_rho                  150  7.8    0.001    0.001   20.139   20.150
 calculate_rho_elec                 150  8.8    3.909    4.230   20.138   20.149
 sum_up_and_integrate               150 10.5    0.077    0.081   17.172   17.191
 integrate_v_rspace                 150 11.5    4.070    4.271   17.094   17.111
 xc_rho_set_and_dset_create         150 12.5    0.185    0.190   16.948   16.973
 rs_pw_transfer                    1822 12.1    0.020    0.024   15.129   15.437
 mp_alltoall_z22v                  3161 17.2   15.000   15.294   15.000   15.294
 density_rs2pw                      150  9.8    0.007    0.009   14.718   14.977
 qs_scf_new_mos                     139  7.7    0.001    0.001   14.120   14.173
 qs_scf_loop_do_ot                  139  8.7    0.001    0.001   14.120   14.173
 dbcsr_multiply_generic            2770 12.6    0.073    0.082   13.810   13.940
 ot_scf_mini                        139  9.7    0.005    0.006   13.225   13.279
 potential_pw2rs                    150 12.5    0.010    0.013   12.324   12.362
 x_to_yz                           1050 16.5    1.546    1.618    9.447    9.550
 multiply_cannon                   2770 13.6    0.293    0.307    8.197    8.790
 yz_to_x                            911 16.0    1.073    1.104    8.140    8.355
 mp_waitall_1                    427928 15.4    7.704    8.344    7.704    8.344
 ot_mini                            139 10.7    0.001    0.001    8.062    8.132
 qs_ot_get_derivative               139 11.7    0.002    0.002    6.714    6.768
 xc_functional_eval                 150 13.5    0.002    0.004    6.023    6.043
 pbe_lda_eval                       150 14.5    6.021    6.041    6.021    6.041
 init_scf_loop                       12  6.8    0.000    0.000    5.780    5.780
 qs_ks_update_qs_env_forces          11  4.9    0.000    0.000    4.981    4.985
 rs_pw_transfer_PW2RS_350           161 14.2    2.271    2.444    4.256    4.435
 mp_waitany                       18660 14.0    3.701    4.433    3.701    4.433
 make_m2s                          5540 13.6    0.087    0.097    4.247    4.358
 rs_pw_transfer_RS2PW_350           161 11.6    2.137    2.399    3.706    3.947
 make_images                       5540 14.6    0.322    0.341    3.720    3.811
 qs_ot_get_derivative_taylor        110 12.9    0.003    0.004    3.646    3.691
 qs_energies_init_hamiltonians       11  5.9    0.000    0.000    3.259    3.259
 multiply_cannon_multrec          33240 14.6    2.695    3.049    2.705    3.060
 qs_ot_get_p                        151 10.5    0.001    0.001    2.871    2.945
 init_scf_run                        11  5.9    0.000    0.001    2.925    2.925
 scf_env_initial_rho_setup           11  6.9    0.000    0.000    2.925    2.925
 multiply_cannon_metrocomm1       33240 14.6    0.049    0.053    2.285    2.914
 multiply_cannon_metrocomm3       33240 14.6    0.040    0.045    0.735    2.682
 mp_sum_d                          4362 11.6    2.357    2.664    2.357    2.664
 wfi_extrapolate                     11  7.9    0.000    0.001    2.650    2.650
 calculate_dispersion_pairpot        11  6.9    0.219    2.076    2.327    2.327
 make_images_sizes                 5540 15.6    0.006    0.007    1.394    1.992
 mp_alltoall_i44                   5540 16.6    1.388    1.986    1.388    1.986
 -------------------------------------------------------------------------------
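
The load imbalance mentioned in the reply can be read off the SELF TIME columns of the two reports: the further the MAXIMUM over ranks exceeds the AVERAGE, the more unevenly the work is spread. The short Python sketch below does that arithmetic for the two routines that dominate the C9 run. The numbers are copied from the reports above; the helper names (`imbalance`, `report`) are hypothetical and only for illustration, and the max/avg ratio is one rough indicator, not a CP2K-defined metric.

```python
# Self-time figures in seconds, copied from the two timing reports in this thread.
# Each entry is (AVERAGE over ranks, MAXIMUM over ranks).
SELF_TIME = {
    "with C9": {
        "calculate_dispersion_pairpot": (65.911, 685.372),
        "mp_sum_d": (619.789, 685.783),
    },
    "without C9": {
        "calculate_dispersion_pairpot": (0.219, 2.076),
        "mp_sum_d": (2.357, 2.664),
    },
}


def imbalance(average, maximum):
    """Ratio of the slowest rank's time to the mean; 1.0 means perfectly balanced."""
    return maximum / average


def report(label):
    """Return one formatted line per routine for the given run label."""
    lines = []
    for name, (avg, mx) in SELF_TIME[label].items():
        ratio = imbalance(avg, mx)
        lines.append(f"{label:>10}  {name:<30} max/avg = {ratio:6.2f}")
    return lines


if __name__ == "__main__":
    for label in SELF_TIME:
        print("\n".join(report(label)))
```

With the C9 term enabled, the dispersion routine's maximum self time is roughly ten times its average, while the average time in mp_sum_d is close to the maximum. One plausible reading, consistent with the reply, is that a few ranks do most of the C9 work and the rest wait in the reduction.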