Performance Issue on GPU
Samuel Andermatt
samuel.a... at student.ethz.ch
Mon Nov 17 11:45:16 UTC 2014
Ok, thanks for emailing me the required data. There are a number of issues.
First, only matrix multiplications and FFTs can currently be accelerated by
GPUs. Looking at the timing sections, your calculation is dominated by CPU
parts:
Total time:        CP2K                 609.771
Main bottlenecks:  integrate_v_rspace   268.894
                   calculate_rho_elec   205.875
This is normal for smaller calculations; GPUs become more useful for
systems with 1000+ atoms.
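To put a number on "dominated by CPU parts", here is a small sketch that takes the three timings quoted above and computes how much of the total the two grid routines account for. The variable names are mine and purely illustrative; only the three numbers come from the timing report.

```python
# Timings (seconds) copied from the timing report quoted above.
total_time = 609.771           # CP2K total
integrate_v_rspace = 268.894   # CPU-only routine
calculate_rho_elec = 205.875   # CPU-only routine

# Share of the total run time spent in the two routines, in percent.
cpu_bound = integrate_v_rspace + calculate_rho_elec
cpu_share = 100.0 * cpu_bound / total_time

print(f"CPU-only routines: {cpu_bound:.3f} s of {total_time:.3f} s "
      f"({cpu_share:.1f}%)")
```

So roughly three quarters of the run is spent in routines that the GPU cannot speed up, whatever happens to the multiplications.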
The second problem is that only a small part (12.4%) of your
multiplications are ported to the GPU:
COUNTER                          CPU      GPU    GPU%
number of processed stacks    179436    25344    12.4
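For reference, the GPU% column can be reproduced from the two stack counts. This is only a sketch of the arithmetic, assuming the percentage is simply the GPU stacks divided by all processed stacks; the function name is made up for illustration.

```python
def gpu_percentage(cpu_stacks: int, gpu_stacks: int) -> float:
    """Percentage of processed stacks that ran on the GPU (assumed definition)."""
    total = cpu_stacks + gpu_stacks
    return 100.0 * gpu_stacks / total

# Stack counts from the table above.
gpu_share = gpu_percentage(cpu_stacks=179436, gpu_stacks=25344)
print(f"{gpu_share:.1f}% of stacks processed on the GPU")
```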
This happens because there are no GPU kernels for your basis set. You will
have to add them manually:
Open: src/dbcsr/libsmm_acc/libcusmm/generate.py
There is a section with triples right at the top of the file. Add to it:
triples += combinations(7,9,16,22)
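To give an idea of what that line adds, here is a stand-alone sketch. It assumes that combinations() returns every (m, n, k) triple that can be formed from the given block sizes; the definition below is my own stand-in for illustration, not a copy of the one in the script.

```python
from itertools import product

def combinations(*sizes):
    """Stand-in (assumed behaviour): all (m, n, k) triples drawn from sizes."""
    return list(product(sizes, repeat=3))

triples = []
triples += combinations(7, 9, 16, 22)

# Four block sizes give 4**3 = 64 (m, n, k) triples under this assumption.
print(len(triples))
print(triples[:3])
```

Under that assumption, the four block sizes request kernels for 64 different (m, n, k) shapes, covering every mix of the sizes.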
Best
Samuel
P.S.: The main parameter that determines the speed of the calculations that
you want to do is the CUTOFF parameter in CP2K_INPUT/FORCE_EVAL/DFT/MGRID.