[CP2K-user] CP2K not effectively using GPUs

Fabian Ducry fabia... at gmail.com
Wed Apr 15 14:15:43 UTC 2020


Hi everyone,

I have noticed that in some cases CP2K does not use the GPUs present on the 
node effectively, while for similar atomic configurations (and an identical 
input file, attached below) the GPUs are used. I wonder what causes this 
difference?

Both simulations were run with the psmp version using 256 MPI ranks with 
3 OpenMP threads each and 64 GPUs on the Piz Daint cluster. The number of 
atoms (3655 and 3746) differs slightly, but the species are the same: 
Pt, Hf and O. The size of the matrix blocks to process is also the same. 
From the summary at the end of the output we see that in the first 
simulation the GPUs account for 99.9% of the flops, while in the second one 
only 6% of the flops are performed on the GPU.

 COUNTER                                     TOTAL      BLAS       SMM       ACC
 ...
 flops    32 x    32 x    13       100202445398016      0.0%      0.0%    100.0%
 flops    10 x    32 x    32       182711840276480      0.0%      0.4%     99.6%
 flops    10 x    32 x    10       217680464691200      0.0%      0.0%    100.0%
 flops    32 x    32 x    10       296933113405440      0.0%      0.0%    100.0%
 flops inhomo. stacks                            0      0.0%      0.0%      0.0%
 flops total                          1.179570E+15      0.0%      0.1%     99.9%
 flops max/rank                       4.725263E+12      0.0%      0.1%     99.9%
 matmuls inhomo. stacks                          0      0.0%      0.0%      0.0%
 matmuls total                         95474737068      0.0%      0.0%    100.0%
 number of processed stacks                3880884      0.0%      0.1%     99.9%
 average stack size                                      0.0   20228.0   24603.7

In the second simulation the GPUs are hardly used:
 COUNTER                                     TOTAL      BLAS       SMM       ACC
 ...
 flops    32 x    32 x    13        86563418093568      0.0%    100.0%      0.0%
 flops    10 x    32 x    32       178933868625920      0.0%    100.0%      0.0%
 flops    10 x    32 x    10       229355051443200      0.0%     90.9%      9.1%
 flops    32 x    32 x    10       290793635389440      0.0%    100.0%      0.0%
 flops inhomo. stacks                            0      0.0%      0.0%      0.0%
 flops total                          1.134639E+15      0.0%     94.0%      6.0%
 flops max/rank                       4.704995E+12      0.0%     92.0%      8.0%
 matmuls inhomo. stacks                          0      0.0%      0.0%      0.0%
 matmuls total                         93442871379      0.0%     94.0%      6.0%
 number of processed stacks                3893648      0.0%     93.3%      6.7%
 average stack size                                      0.0   24183.7   21442.3
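
For anyone who wants to compare such summaries across many runs, the per-kernel rows can be parsed and the ACC share recomputed as a flop-weighted average. The following is only a minimal sketch in Python: the helper name `acc_share` and the regular expression are my own assumptions for illustration, not part of CP2K, and it assumes each row sits on a single line (i.e. without the line wrapping introduced by the mail client). The sample rows are the four kernel rows of the second summary.

```python
import re

# Matches per-kernel rows such as:
#  flops    10 x    32 x    10      229355051443200       0.0%     90.9%      9.1%
# Rows like "flops total" or "flops inhomo. stacks" are deliberately not matched.
ROW = re.compile(
    r"^\s*flops\s+(\d+)\s+x\s+(\d+)\s+x\s+(\d+)\s+(\d+)\s+"
    r"([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s*$"
)


def acc_share(summary: str) -> float:
    """Return the flop-weighted percentage of the listed kernels run on ACC."""
    total = 0
    on_acc = 0.0
    for line in summary.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # skip headers, "..." and non-kernel rows
        flops = int(m.group(4))       # TOTAL column
        acc_pct = float(m.group(7))   # ACC column, in percent
        total += flops
        on_acc += flops * acc_pct / 100.0
    if total == 0:
        raise ValueError("no per-kernel flops rows found")
    return 100.0 * on_acc / total


# Kernel rows copied from the second simulation's summary.
second_run = """
 flops    32 x    32 x    13       86563418093568       0.0%    100.0%      0.0%
 flops    10 x    32 x    32      178933868625920       0.0%    100.0%      0.0%
 flops    10 x    32 x    10      229355051443200       0.0%     90.9%      9.1%
 flops    32 x    32 x    10      290793635389440       0.0%    100.0%      0.0%
"""

print(f"{acc_share(second_run):.1f}% of the listed flops ran on ACC")
```

Because the rows hidden behind "..." are not included, the result covers only the kernels listed here and need not match the "flops total" line.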


I have some GPU-related GLOBAL settings:
&GLOBAL
  PROJECT negf-step-282
  RUN_TYPE ENERGY
  PRINT_LEVEL MEDIUM
  EXTENDED_FFT_LENGTHS
  WALLTIME 17600
  &FM
    FORCE_BLOCK_SIZE
    TYPE_OF_MATRIX_MULTIPLICATION DBCSR_MM
  &END FM
&END GLOBAL

The full input file is attached, as are the outputs of both simulations.

I would be grateful for any pointers on how I should change the settings.

Best,
Fabian

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cp2k.inp
Type: chemical/x-gamess-input
Size: 1704 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20200415/f3dc39e2/attachment.inp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cp2k1.out
Type: application/octet-stream
Size: 375866 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20200415/f3dc39e2/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cp2k2.out
Type: application/octet-stream
Size: 368657 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20200415/f3dc39e2/attachment-0001.obj>

