[CP2K-user] [CP2K:11534] Approaching performance issues within CP2K

ar26... at gmail.com ar26... at gmail.com
Mon Apr 8 08:46:36 UTC 2019


Dear all,

Thank you so much for your help! The problem was related to hard drive
access (we don't yet know exactly what): when I used another, independent
storage system, the calculations ran fast again. This is consistent with
the fact that the writing routines (such as write_mo_set_to_restart,
write_mo_set_low, md_output, md_write_output, write_trajectory,
cp_fm_write_unformatted) moved up among the most time-consuming routines
on the buggy storage system. Here I'm posting the timing tables for both
storage systems for future reference:

*Good Storage System:*
 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.008    0.010  599.391  599.392
 qs_mol_dyn_low                       1  2.0    0.180    0.197  599.034  599.034
 velocity_verlet                   9000  3.0    0.662    1.528  595.269  596.953
 qs_forces                         9001  4.0    0.717    0.742  567.994  567.997
 qs_energies                       9001  5.0    0.246    0.261  551.228  551.289
 scf_env_do_scf                    9001  6.0    0.239    1.819  424.181  424.284
 dbcsr_multiply_generic          505522 12.2   14.745   15.233  307.841  311.138
 scf_env_do_scf_inner_loop        27090  7.0    0.619    2.523  288.842  291.719
 qs_scf_new_mos                   27090  8.0    0.085    0.088  250.800  251.407
 qs_scf_loop_do_ot                27090  9.0    0.082    0.086  250.715  251.323
 ot_scf_mini                      27090 10.0    0.508    0.548  231.762  232.060
 ot_mini                          27090 11.0    0.086    0.089  157.113  157.814
 qs_ot_get_derivative             27090 12.0    0.182    0.188  134.510  134.850
 make_m2s                       1011044 13.2    9.277    9.474  124.272  127.361
 init_scf_loop                     9001  7.0    0.106    0.114  126.428  126.742
 multiply_cannon                 505522 13.2   33.604   35.248  106.197  114.395
 prepare_preconditioner            9001  8.0    0.014    0.018   95.925   96.168
 make_preconditioner               9001  9.0    0.035    0.040   95.911   96.154
 make_images                    1011044 14.2   16.678   17.593   91.118   93.597
 init_scf_run                      9001  6.0    0.197    1.708   92.415   92.532
 scf_env_initial_rho_setup         9001  7.0    0.100    0.693   92.201   92.422
 wfi_extrapolate                   9001  8.0    0.519    3.147   90.370   90.544
 make_full_single_inverse          9001 10.0    0.306    0.327   86.393   86.653
 copy_dbcsr_to_fm                108031 10.3    1.141    1.179   67.756   69.861
 dbcsr_complete_redistribute     234064 11.5    7.835    8.949   60.482   61.557
 cp_dbcsr_sm_fm_multiply          44993  9.4    0.551    0.579   60.769   61.297
 qs_ot_get_derivative_taylor      27061 13.0    0.243    0.251   56.202   56.424
 dbcsr_sym_m_v_mult              526492 13.0    7.007    7.334   46.943   48.687
 arnoldi_generalized_ev            9001 11.0    0.102    0.106   41.654   41.718
 dbcsr_create_new               7243687 14.2   23.194   24.523   38.770   41.294
 gev_build_subspace                9004 12.0    0.939    1.071   36.235   36.296
 mp_sum_dv                      2375242 14.2   32.523   35.477   32.523   35.477
 qs_ot_get_orbitals               27090 11.0    0.079    0.084   33.879   34.692
 qs_ot_get_p                      36091 10.5    0.121    0.131   32.562   33.987
 make_images_sizes              1011044 15.2    0.953    1.005   22.001   28.224
 qs_energies_compute_matrix_w      9001  6.0    0.027    0.032   26.787   27.449
 calculate_w_matrix_ot             9001  7.0    0.168    0.186   26.760   27.422
 mp_alltoall_i44                1011044 16.2   21.048   27.300   21.048   27.300
 calculate_dm_sparse              36091  9.7    0.083    0.085   26.464   26.954
 mp_sum_l                       2239277 13.4   19.550   26.234   19.550   26.234
 lnhc_particles                   18000  4.0    0.065    0.070   24.401   26.060
 arnoldi_normal_ev                45092 12.2    0.520    0.560   24.467   25.905
 md_output                         9000  3.0    0.017    0.018    2.420   25.679
 md_write_output                   9001  4.0    1.457   20.026    2.306   25.562
 cp_dbcsr_sm_fm_multiply_core     44993 10.4    0.071    0.076   24.407   24.951
 write_mo_set_to_restart          27090  8.0    1.214   14.124   23.238   24.266
 apply_preconditioner_dbcsr       36091 13.0    0.035    0.036   23.291   24.072
 apply_single                     36091 14.0    0.055    0.057   23.256   24.037
 make_images_data               1011044 15.2    7.650    9.213   20.594   23.868
 rebuild_ks_matrix                36091  8.3    0.053    0.057   22.631   23.781
 build_dftb_ks_matrix             36091  9.3    0.398    0.422   22.578   23.724
 mp_alltoall_i22                 378244 12.8   18.876   22.998   18.876   22.998
 cp_gemm                          98983  9.0    0.187    0.204   22.533   22.794
 cp_gemm_fm_gemm                  98983 10.0    0.140    0.151   22.346   22.605
 cp_fm_gemm                       98983 11.0   22.205   22.467   22.205   22.467
 ot_diis_step                     27090 12.0    0.342    0.361   21.376   21.392
 dbcsr_make_images_dense         686684 14.8    3.822    3.920   17.689   21.195
 qs_ks_update_qs_env              36091  8.0    0.101    0.110   20.180   21.189
 copy_fm_to_dbcsr                126033 10.6    0.480    0.549   19.511   20.863
 mp_cart_sub                     216064 12.3   20.309   20.673   20.309   20.673
 dbcsr_destroy                  7044966 13.4    9.291    9.530   18.801   19.943
 dbcsr_finalize                 1560240 11.7    3.739    4.104   14.191   17.207
 cp_dbcsr_plus_fm_fm_t_native     18002  9.0    0.155    0.160   16.363   16.463
 dbcsr_make_dense_low            966932 15.6    5.003    5.079   11.784   16.345
 build_subspace                   45094 13.2    0.708    0.736   15.222   15.988
 dbcsr_desymmetrize_deep         108031 11.3    1.674    1.849   14.476   15.858
 write_mo_set_low                  9002  9.0    0.112    1.543   15.093   15.822
 cp_fm_write_unformatted           9002 10.0   14.974   15.786   14.981   15.806
 ot_scf_init                       9001  8.0    0.685    0.725   15.140   15.324
 mp_waitall_1                   ******* 15.9   12.860   15.123   12.860   15.123
 reorthogonalize_vectors           9000  9.0    0.032    0.034   14.343   14.431
 make_basis_sm                     9001 10.0    0.096    0.102   14.314   14.405
 mp_sum_d                        791629 12.9   10.590   14.017   10.590   14.017
 dbcsr_iterator_start           9891703 15.1    9.641   10.815   11.803   13.413
 hybrid_alltoall_any            1119075 15.8    7.316    7.963   10.994   12.837
 dbcsr_data_new                 ******* 14.9    9.573   10.182   11.426   12.351
 dbcsr_copy                     1801293 13.1    3.838    4.091   11.071   12.048
 -------------------------------------------------------------------------------




*Buggy Storage System:*


 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.171    0.278 3945.748 3945.750
 qs_mol_dyn_low                       1  2.0    0.176    0.193 3944.466 3944.660
 velocity_verlet                   9000  3.0    0.643    1.489 3847.823 3915.148
 qs_forces                         9001  4.0    0.708    0.746 2837.016 2837.022
 qs_energies                       9001  5.0    0.252    0.293 2820.446 2820.485
 scf_env_do_scf                    9001  6.0    0.245    1.859 2694.366 2694.470
 scf_env_do_scf_inner_loop        27090  7.0    0.606    2.484 2369.243 2563.300
 *write_mo_set_to_restart*        27090  8.0  130.929 2089.698 2097.267 2289.501
 *write_mo_set_low*                9002  9.0   11.970  191.268 1959.546 2077.260
 *cp_fm_write_unformatted*         9002 10.0 1947.569 2077.236 1947.576 2077.244
 mp_sum_dv                      2375242 14.2 1016.794 1085.434 1016.794 1085.434
 lnhc_particles                   18000  4.0    0.065    0.071 1008.591 1075.896
 *md_output*                       9000  3.0    0.016    0.018   68.228 1075.670
 *md_write_output*                 9001  4.0   31.858  506.433   67.925 1075.564
 *write_trajectory*               36004  5.0   24.029  378.166   35.957  569.013
 dbcsr_multiply_generic          505522 12.2   14.565   14.965  303.194  305.824
 copy_dbcsr_to_fm                108031 10.3    1.127    1.162  257.968  272.739
 qs_scf_new_mos                   27090  8.0    0.083    0.088  246.784  247.397
 qs_scf_loop_do_ot                27090  9.0    0.080    0.081  246.701  247.313
 ot_scf_mini                      27090 10.0    0.499    0.541  228.128  228.478
 mp_alltoall_i22                 378244 12.8  209.931  226.475  209.931  226.475
 dbcsr_desymmetrize_deep         108031 11.3    1.656    1.869  205.455  219.499
 write_particle_coordinates        9001  6.0   11.928  190.847   11.928  190.847
 ot_mini                          27090 11.0    0.084    0.089  154.636  155.248
 qs_ot_get_derivative             27090 12.0    0.179    0.183  132.444  132.776
 make_m2s                       1011044 13.2    9.192    9.398  122.520  125.393
 init_scf_loop                     9001  7.0    0.102    0.110  124.960  125.277
 multiply_cannon                 505522 13.2   33.331   34.305  104.941  111.596
 prepare_preconditioner            9001  8.0    0.014    0.016   94.714   94.953
 make_preconditioner               9001  9.0    0.040    0.048   94.700   94.939
 make_images                    1011044 14.2   16.528   17.560   89.723   92.051
 init_scf_run                      9001  6.0    0.197    1.733   91.409   91.521
 scf_env_initial_rho_setup         9001  7.0    0.098    0.688   91.195   91.412
 wfi_extrapolate                   9001  8.0    0.519    3.144   89.359   89.509
 make_full_single_inverse          9001 10.0    0.312    0.352   85.202   85.482
 -------------------------------------------------------------------------------
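
For anyone who wants to make the same comparison on their own runs, here
is a minimal Python sketch (not part of CP2K) that reads two timing
reports and ranks routines by how much their maximum total time grew. The
function names parse_timing and slowdown, the regular expression, and the
assumption that each table row sits on a single line are mine; the sample
rows are copied from the two tables above, with wrapped rows joined.

```python
import re

# One row of a CP2K timing report: routine name, calls, ASD, then four times
# in seconds (self average, self maximum, total average, total maximum).
# Optional '*' around the name allows for the emphasis used in the post;
# a calls field of '*******' (overflowed column) is accepted as well.
ROW = re.compile(
    r"^\s*\*?\s*(?P<name>[A-Za-z_]\w*)\*?\s+"
    r"(?P<calls>\d+|\*+)\s+(?P<asd>\d+\.\d+)\s+"
    r"(?P<self_avg>\d+\.\d+)\s+(?P<self_max>\d+\.\d+)\s+"
    r"(?P<tot_avg>\d+\.\d+)\s+(?P<tot_max>\d+\.\d+)\s*$"
)


def parse_timing(text):
    """Return {routine: maximum total time in seconds} for each row found."""
    times = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header, banner, and blank lines simply do not match
            times[match.group("name")] = float(match.group("tot_max"))
    return times


def slowdown(good, bad):
    """Ratio bad/good for routines present in both reports, largest first."""
    common = good.keys() & bad.keys()
    ratios = {name: bad[name] / good[name] for name in common if good[name] > 0}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Sample rows taken from the two reports in this post.
GOOD = """
 CP2K                                 1  1.0    0.008    0.010  599.391  599.392
 write_mo_set_to_restart          27090  8.0    1.214   14.124   23.238   24.266
 cp_fm_write_unformatted           9002 10.0   14.974   15.786   14.981   15.806
 dbcsr_multiply_generic          505522 12.2   14.745   15.233  307.841  311.138
"""
BAD = """
 CP2K                                 1  1.0    0.171    0.278 3945.748 3945.750
 *write_mo_set_to_restart*        27090  8.0  130.929 2089.698 2097.267 2289.501
 *cp_fm_write_unformatted*         9002 10.0 1947.569 2077.236 1947.576 2077.244
 dbcsr_multiply_generic          505522 12.2   14.565   14.965  303.194  305.824
"""

if __name__ == "__main__":
    for name, ratio in slowdown(parse_timing(GOOD), parse_timing(BAD)):
        print(f"{name:28s} {ratio:8.1f}x")
```

On these sample rows the I/O routines rise to the top of the ranking,
while dbcsr_multiply_generic stays at a ratio close to 1, which matches
the observation that only the writing routines were affected.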






Best regards,


Alejandro.