[CP2K-user] [CP2K:11534] Approaching performance issues within CP2K
ar26... at gmail.com
Mon Apr 8 08:46:36 UTC 2019
Dear all,
thank you so much for your help! The problem was related to hard drive
access (we don't yet know exactly what is wrong), but when I used another,
independent storage system the calculations ran fast again. This is
consistent with the fact that the writing routines (such as
write_mo_set_to_restart, write_mo_set_low, md_output, md_write_output,
write_trajectory, cp_fm_write_unformatted) moved up among the most
time-consuming routines on the buggy storage system. Here I'm posting
the timing tables for both storage systems for future reference:
*Good Storage System:*
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                        CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                  1  1.0    0.008    0.010  599.391  599.392
qs_mol_dyn_low                        1  2.0    0.180    0.197  599.034  599.034
velocity_verlet                    9000  3.0    0.662    1.528  595.269  596.953
qs_forces                          9001  4.0    0.717    0.742  567.994  567.997
qs_energies                        9001  5.0    0.246    0.261  551.228  551.289
scf_env_do_scf                     9001  6.0    0.239    1.819  424.181  424.284
dbcsr_multiply_generic           505522 12.2   14.745   15.233  307.841  311.138
scf_env_do_scf_inner_loop         27090  7.0    0.619    2.523  288.842  291.719
qs_scf_new_mos                    27090  8.0    0.085    0.088  250.800  251.407
qs_scf_loop_do_ot                 27090  9.0    0.082    0.086  250.715  251.323
ot_scf_mini                       27090 10.0    0.508    0.548  231.762  232.060
ot_mini                           27090 11.0    0.086    0.089  157.113  157.814
qs_ot_get_derivative              27090 12.0    0.182    0.188  134.510  134.850
make_m2s                        1011044 13.2    9.277    9.474  124.272  127.361
init_scf_loop                      9001  7.0    0.106    0.114  126.428  126.742
multiply_cannon                  505522 13.2   33.604   35.248  106.197  114.395
prepare_preconditioner             9001  8.0    0.014    0.018   95.925   96.168
make_preconditioner                9001  9.0    0.035    0.040   95.911   96.154
make_images                     1011044 14.2   16.678   17.593   91.118   93.597
init_scf_run                       9001  6.0    0.197    1.708   92.415   92.532
scf_env_initial_rho_setup          9001  7.0    0.100    0.693   92.201   92.422
wfi_extrapolate                    9001  8.0    0.519    3.147   90.370   90.544
make_full_single_inverse           9001 10.0    0.306    0.327   86.393   86.653
copy_dbcsr_to_fm                 108031 10.3    1.141    1.179   67.756   69.861
dbcsr_complete_redistribute      234064 11.5    7.835    8.949   60.482   61.557
cp_dbcsr_sm_fm_multiply           44993  9.4    0.551    0.579   60.769   61.297
qs_ot_get_derivative_taylor       27061 13.0    0.243    0.251   56.202   56.424
dbcsr_sym_m_v_mult               526492 13.0    7.007    7.334   46.943   48.687
arnoldi_generalized_ev             9001 11.0    0.102    0.106   41.654   41.718
dbcsr_create_new                7243687 14.2   23.194   24.523   38.770   41.294
gev_build_subspace                 9004 12.0    0.939    1.071   36.235   36.296
mp_sum_dv                       2375242 14.2   32.523   35.477   32.523   35.477
qs_ot_get_orbitals                27090 11.0    0.079    0.084   33.879   34.692
qs_ot_get_p                       36091 10.5    0.121    0.131   32.562   33.987
make_images_sizes               1011044 15.2    0.953    1.005   22.001   28.224
qs_energies_compute_matrix_w       9001  6.0    0.027    0.032   26.787   27.449
calculate_w_matrix_ot              9001  7.0    0.168    0.186   26.760   27.422
mp_alltoall_i44                 1011044 16.2   21.048   27.300   21.048   27.300
calculate_dm_sparse               36091  9.7    0.083    0.085   26.464   26.954
mp_sum_l                        2239277 13.4   19.550   26.234   19.550   26.234
lnhc_particles                    18000  4.0    0.065    0.070   24.401   26.060
arnoldi_normal_ev                 45092 12.2    0.520    0.560   24.467   25.905
md_output                          9000  3.0    0.017    0.018    2.420   25.679
md_write_output                    9001  4.0    1.457   20.026    2.306   25.562
cp_dbcsr_sm_fm_multiply_core      44993 10.4    0.071    0.076   24.407   24.951
write_mo_set_to_restart           27090  8.0    1.214   14.124   23.238   24.266
apply_preconditioner_dbcsr        36091 13.0    0.035    0.036   23.291   24.072
apply_single                      36091 14.0    0.055    0.057   23.256   24.037
make_images_data                1011044 15.2    7.650    9.213   20.594   23.868
rebuild_ks_matrix                 36091  8.3    0.053    0.057   22.631   23.781
build_dftb_ks_matrix              36091  9.3    0.398    0.422   22.578   23.724
mp_alltoall_i22                  378244 12.8   18.876   22.998   18.876   22.998
cp_gemm                           98983  9.0    0.187    0.204   22.533   22.794
cp_gemm_fm_gemm                   98983 10.0    0.140    0.151   22.346   22.605
cp_fm_gemm                        98983 11.0   22.205   22.467   22.205   22.467
ot_diis_step                      27090 12.0    0.342    0.361   21.376   21.392
dbcsr_make_images_dense          686684 14.8    3.822    3.920   17.689   21.195
qs_ks_update_qs_env               36091  8.0    0.101    0.110   20.180   21.189
copy_fm_to_dbcsr                 126033 10.6    0.480    0.549   19.511   20.863
mp_cart_sub                      216064 12.3   20.309   20.673   20.309   20.673
dbcsr_destroy                   7044966 13.4    9.291    9.530   18.801   19.943
dbcsr_finalize                  1560240 11.7    3.739    4.104   14.191   17.207
cp_dbcsr_plus_fm_fm_t_native      18002  9.0    0.155    0.160   16.363   16.463
dbcsr_make_dense_low             966932 15.6    5.003    5.079   11.784   16.345
build_subspace                    45094 13.2    0.708    0.736   15.222   15.988
dbcsr_desymmetrize_deep          108031 11.3    1.674    1.849   14.476   15.858
write_mo_set_low                   9002  9.0    0.112    1.543   15.093   15.822
cp_fm_write_unformatted            9002 10.0   14.974   15.786   14.981   15.806
ot_scf_init                        9001  8.0    0.685    0.725   15.140   15.324
mp_waitall_1                    ******* 15.9   12.860   15.123   12.860   15.123
reorthogonalize_vectors            9000  9.0    0.032    0.034   14.343   14.431
make_basis_sm                      9001 10.0    0.096    0.102   14.314   14.405
mp_sum_d                         791629 12.9   10.590   14.017   10.590   14.017
dbcsr_iterator_start            9891703 15.1    9.641   10.815   11.803   13.413
hybrid_alltoall_any             1119075 15.8    7.316    7.963   10.994   12.837
dbcsr_data_new                  ******* 14.9    9.573   10.182   11.426   12.351
dbcsr_copy                      1801293 13.1    3.838    4.091   11.071   12.048
-------------------------------------------------------------------------------
*Buggy Storage System:*
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                        CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                  1  1.0    0.171    0.278 3945.748 3945.750
qs_mol_dyn_low                        1  2.0    0.176    0.193 3944.466 3944.660
velocity_verlet                    9000  3.0    0.643    1.489 3847.823 3915.148
qs_forces                          9001  4.0    0.708    0.746 2837.016 2837.022
qs_energies                        9001  5.0    0.252    0.293 2820.446 2820.485
scf_env_do_scf                     9001  6.0    0.245    1.859 2694.366 2694.470
scf_env_do_scf_inner_loop         27090  7.0    0.606    2.484 2369.243 2563.300
*write_mo_set_to_restart*         27090  8.0  130.929 2089.698 2097.267 2289.501
*write_mo_set_low*                 9002  9.0   11.970  191.268 1959.546 2077.260
*cp_fm_write_unformatted*          9002 10.0 1947.569 2077.236 1947.576 2077.244
mp_sum_dv                       2375242 14.2 1016.794 1085.434 1016.794 1085.434
lnhc_particles                    18000  4.0    0.065    0.071 1008.591 1075.896
*md_output*                        9000  3.0    0.016    0.018   68.228 1075.670
*md_write_output*                  9001  4.0   31.858  506.433   67.925 1075.564
*write_trajectory*                36004  5.0   24.029  378.166   35.957  569.013
dbcsr_multiply_generic           505522 12.2   14.565   14.965  303.194  305.824
copy_dbcsr_to_fm                 108031 10.3    1.127    1.162  257.968  272.739
qs_scf_new_mos                    27090  8.0    0.083    0.088  246.784  247.397
qs_scf_loop_do_ot                 27090  9.0    0.080    0.081  246.701  247.313
ot_scf_mini                       27090 10.0    0.499    0.541  228.128  228.478
mp_alltoall_i22                  378244 12.8  209.931  226.475  209.931  226.475
dbcsr_desymmetrize_deep          108031 11.3    1.656    1.869  205.455  219.499
write_particle_coordinates         9001  6.0   11.928  190.847   11.928  190.847
ot_mini                           27090 11.0    0.084    0.089  154.636  155.248
qs_ot_get_derivative              27090 12.0    0.179    0.183  132.444  132.776
make_m2s                        1011044 13.2    9.192    9.398  122.520  125.393
init_scf_loop                      9001  7.0    0.102    0.110  124.960  125.277
multiply_cannon                  505522 13.2   33.331   34.305  104.941  111.596
prepare_preconditioner             9001  8.0    0.014    0.016   94.714   94.953
make_preconditioner                9001  9.0    0.040    0.048   94.700   94.939
make_images                     1011044 14.2   16.528   17.560   89.723   92.051
init_scf_run                       9001  6.0    0.197    1.733   91.409   91.521
scf_env_initial_rho_setup          9001  7.0    0.098    0.688   91.195   91.412
wfi_extrapolate                    9001  8.0    0.519    3.144   89.359   89.509
make_full_single_inverse           9001 10.0    0.312    0.352   85.202   85.482
-------------------------------------------------------------------------------
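
For anyone comparing two runs in the same way, here is a minimal Python
sketch of the comparison. It takes the TOTAL TIME (MAXIMUM) values of a few
routines, copied by hand from the two tables in this message, and lists the
routines that became much slower. The function name slowdown_report and
the threshold of 2.0 are illustrative choices, not part of CP2K.

```python
# TOTAL TIME (MAXIMUM) in seconds, copied from the two timing tables above.
# Only a handful of routines are included for illustration.
GOOD = {
    "CP2K": 599.392,
    "dbcsr_multiply_generic": 311.138,
    "multiply_cannon": 114.395,
    "md_output": 25.679,
    "md_write_output": 25.562,
    "write_mo_set_to_restart": 24.266,
    "write_mo_set_low": 15.822,
    "cp_fm_write_unformatted": 15.806,
}

BUGGY = {
    "CP2K": 3945.750,
    "dbcsr_multiply_generic": 305.824,
    "multiply_cannon": 111.596,
    "md_output": 1075.670,
    "md_write_output": 1075.564,
    "write_mo_set_to_restart": 2289.501,
    "write_mo_set_low": 2077.260,
    "cp_fm_write_unformatted": 2077.244,
}


def slowdown_report(good, buggy, threshold=2.0):
    """Return (routine, ratio) pairs whose time grew by at least `threshold`.

    Routines missing from either table are skipped. The threshold is an
    arbitrary illustrative value, not a CP2K convention.
    """
    ratios = [
        (name, buggy[name] / good[name])
        for name in good
        if name in buggy and good[name] > 0.0
    ]
    flagged = [(name, ratio) for name, ratio in ratios if ratio >= threshold]
    # Largest slowdown first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


for name, ratio in slowdown_report(GOOD, BUGGY):
    print(f"{name:<26s} {ratio:8.1f}x slower")
```

With these numbers the writing routines come out roughly 40 to 130 times
slower on the buggy storage system, while dbcsr_multiply_generic and
multiply_cannon are essentially unchanged and are therefore not listed.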
Best regards,
Alejandro.