No speedup using Intel MKL libraries?

Alfio Lazzaro alfio.... at
Mon Nov 6 12:11:53 UTC 2017

Dear Faraz,
OK, here is a comparison of the two runs for the functions where I see the 
largest timing discrepancy (times in seconds; second column with MKL, third 
column without MKL):

function                        w/ MKL (s)  w/o MKL (s)
dbcsr_make_untransposed_blocks       4.139        1.591
cp_fm_gemm                           5.691        1.087
setup_rec_index_2d                   6.330        1.741
cp_fm_cholesky_decompose            11.539        1.703
cp_fm_cholesky_invert               26.048        3.031
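
To make the discrepancy easier to read, the slowdown factor of each function can be computed from the numbers above. This is only a small sketch: the timings are copied from the table, while the names `timings`, `slowdown` and `ratios` are made up for illustration and are not part of CP2K.

```python
# Timings in seconds copied from the table above: (with MKL, without MKL).
timings = {
    "dbcsr_make_untransposed_blocks": (4.139, 1.591),
    "cp_fm_gemm": (5.691, 1.087),
    "setup_rec_index_2d": (6.330, 1.741),
    "cp_fm_cholesky_decompose": (11.539, 1.703),
    "cp_fm_cholesky_invert": (26.048, 3.031),
}


def slowdown(with_mkl, without_mkl):
    """Return how many times slower the MKL run is than the non-MKL run."""
    return with_mkl / without_mkl


ratios = {name: slowdown(*pair) for name, pair in timings.items()}

# Print the functions from the largest to the smallest slowdown factor.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<32}{ratio:5.1f}x")
```

The largest factors belong to the two Cholesky routines and to cp_fm_gemm, while the other two functions show smaller factors.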

Personally, I don't understand the differences in the 1st and 3rd 
lines; most likely they are fluctuations.
The other lines are MKL-related (DGEMM and 
Cholesky decomposition). My suspicion is that you are using the sequential 
MKL, while OpenBLAS is somehow using threads. A way to test this is to 
run with a single thread (or fewer threads in general): the difference 
should become smaller. I would also suggest using the PSMP version.
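
The single-thread test can be scripted roughly as follows. This is a sketch under assumptions: it pins the thread count through the standard `OMP_NUM_THREADS`, `MKL_NUM_THREADS` and `OPENBLAS_NUM_THREADS` environment variables, and whether each one has any effect depends on how the libraries were built and linked. The helper names `env_with_threads` and `timed_run` are hypothetical, not part of CP2K.

```python
import os
import subprocess
import time

# Environment variables commonly read by OpenMP, MKL and OpenBLAS.
# Assumption: the executable under test honours at least one of them.
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def env_with_threads(n_threads, base_env=None):
    """Return a copy of the environment with all thread variables set."""
    env = dict(os.environ if base_env is None else base_env)
    for name in THREAD_VARS:
        env[name] = str(n_threads)
    return env


def timed_run(command, n_threads):
    """Run `command` with the given thread count; return wall time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, env=env_with_threads(n_threads), check=True)
    return time.perf_counter() - start
```

Calling `timed_run` on each executable with `n_threads=1` and then with a larger count (the command line itself is whatever you normally use to start the run) should show whether the gap between the two builds shrinks when both are limited to one thread.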


On Thursday, 2 November 2017 15:33:13 UTC+1, Faraz H wrote:
> Thanks, I am attaching the output of two runs: one with the gcc4.9 
> executable and the other with the MKL libraries and gcc4.9. Interestingly, the 
> results are not always consistent when I run the model multiple times. 
> Sometimes the MKL one is faster by ~30 seconds overall, sometimes slower. 
> So perhaps something is going on with my system. Curious what you see.