CUDA-DBCSR Statistics

Abhishek Bagusetty abhishek... at
Wed Oct 1 16:43:47 UTC 2014

Hi Users & Developers,

I am trying to do an MD run using a CUDA build (sopt) of CP2K-2.5.1 with 
DBCSR-CUDA support. 


When I use all 4 GPUs on the node (CPU:GPU = 1:4), the simulation runs, but 
the DBCSR statistics indicate that GPU usage is 0 and all the matrix 
computations are carried out on the CPU. I have configured the input file so 
that DBCSR uses the accelerators, yet this does not happen (input snippet at 
the bottom). 

 COUNTER                                      CPU                  GPU      GPU%
 number of processed stacks                 10296                    0       0.0
 matmuls inhomo. stacks                         0                    0       0.0
 matmuls total                           11895585                    0       0.0
 flops   1 x    1 x    8                  4664400                    0       0.0
 flops   1 x    8 x    1                  3900000                    0       0.0
 flops   1 x    1 x   16                 37315200                    0       0.0
 flops   1 x   16 x    1                 31200000                    0       0.0
 flops   1 x    4 x    8                 20092800                    0       0.0
 flops   4 x    1 x    8                 20092800                    0       0.0
 flops   1 x    8 x    4                 17472000                    0       0.0
 flops   4 x    8 x    1                 17472000                    0       0.0
 flops   1 x    4 x   16                160742400                    0       0.0
 flops   4 x    1 x   16                160742400                    0       0.0
 flops   1 x   16 x    4                139776000                    0       0.0
 flops   4 x   16 x    1                139776000                    0       0.0
 flops   4 x    4 x    8                 93230592                    0       0.0
 flops   4 x    8 x    4                 78274560                    0       0.0
 flops   4 x    4 x   16                745844736                    0       0.0
 flops   4 x   16 x    4                626196480                    0       0.0
 flops total                           2296792368                    0       0.0
 marketing flops                       3478421232
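
As a sanity check on the statistics above, the per-shape CPU counters can be 
summed and compared with the reported total. This is only a minimal sketch: 
the numbers are copied by hand from the table, and the variable names are 
made up for this example rather than taken from CP2K. 

```python
# CPU flop counts per block shape, copied by hand from the DBCSR table.
# The keys are the three dimensions printed in each "flops a x b x c" row.
CPU_FLOPS = {
    (1, 1, 8): 4664400,
    (1, 8, 1): 3900000,
    (1, 1, 16): 37315200,
    (1, 16, 1): 31200000,
    (1, 4, 8): 20092800,
    (4, 1, 8): 20092800,
    (1, 8, 4): 17472000,
    (4, 8, 1): 17472000,
    (1, 4, 16): 160742400,
    (4, 1, 16): 160742400,
    (1, 16, 4): 139776000,
    (4, 16, 1): 139776000,
    (4, 4, 8): 93230592,
    (4, 8, 4): 78274560,
    (4, 4, 16): 745844736,
    (4, 16, 4): 626196480,
}

# Values from the "flops total" row (CPU and GPU columns).
REPORTED_CPU_TOTAL = 2296792368
REPORTED_GPU_TOTAL = 0

# Sum the per-shape rows and compute the fraction of flops done on the GPU.
cpu_sum = sum(CPU_FLOPS.values())
gpu_share = REPORTED_GPU_TOTAL / (REPORTED_CPU_TOTAL + REPORTED_GPU_TOTAL)

print("per-shape rows add up to the reported total:", cpu_sum == REPORTED_CPU_TOTAL)
print(f"GPU share of flops: {gpu_share:.1%}")
```

The per-shape rows do add up to the "flops total" row, so the table is 
internally consistent and the whole workload was counted on the CPU side. 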

My guess is that CP2K-2.5.1 is not yet fully set up to handle multiple GPUs 
on one node, and that things go wrong when several GPUs are available. I am 
not sure whether this interpretation makes sense. 

*CASE-2*

I then selected 1 of the 4 on-node GPUs (IDs 0, 1, 2, 3), i.e. CPU:GPU = 
1:1, and ran the same simulation. 
I get these errors from DBCSR initialization:

Error report :
 dbcsr_cuda_stream_create failed
 libdbcsr| dbcsr_cuda_stream_create failed
 libdbcsr| Abnormal program termination, stopped by process number 0
CUDA Error: all CUDA-capable devices are busy or unavailable

My guess for this case is that the cluster's scheduler may have assigned a 
GPU device ID other than 0, while CP2K-2.5.1 may have the device ID 
hard-coded to 0. The required device would then be unavailable when the 
streams are created, resulting in the CUDA error. 
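
One way to test this guess from inside the batch job is to print what the job 
actually sees before CP2K starts. The sketch below is illustrative only: it 
assumes the scheduler passes the GPU assignment through the 
CUDA_VISIBLE_DEVICES environment variable (not confirmed for this cluster) 
and that nvidia-smi is on the PATH; the helper names are made up for this 
example. When CUDA_VISIBLE_DEVICES is set, the CUDA runtime renumbers the 
visible devices starting from 0, and a GPU whose compute mode is exclusive or 
prohibited can also give the "busy or unavailable" error, so both are worth 
printing. 

```python
import os
import shutil
import subprocess


def visible_devices(env=os.environ):
    """Return the entries of CUDA_VISIBLE_DEVICES, or None if it is unset.

    Assumption: the scheduler uses this variable to hand GPUs to the job.
    """
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def compute_modes():
    """Return (index, compute_mode) pairs from nvidia-smi, or [] on failure."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,compute_mode", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    pairs = []
    for line in result.stdout.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) == 2:  # skip anything that is not "index, mode"
            pairs.append((fields[0], fields[1]))
    return pairs


# nvidia-smi lists all GPUs on the node, regardless of CUDA_VISIBLE_DEVICES.
print("CUDA_VISIBLE_DEVICES:", visible_devices())
for index, mode in compute_modes():
    print(f"GPU {index}: compute mode {mode}")
```

Running this at the top of the job script for both cases would show whether 
the single-GPU job is given a device other than 0 and whether any card is in 
an exclusive compute mode. 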

I do not understand what is going wrong: why the CPU:GPU = 1:4 case shows no 
GPU statistics for DBCSR, and why the CPU:GPU = 1:1 case fails with a CUDA 
error. My interpretations may be completely wrong, and I would appreciate it 
if someone could give some insight into what is going on.

*Architecture* : Super-Computing Cluster
Every node has 8 cores and 4 in-house GPU cards. 

*JOB details* : 1 Node, 1 Core and 4 GPUs (Case-1) / 1 GPU (Case-2) (on-node)

*Input File Snippet :* 


Thanks for your time and efforts,

Abhishek Bagusetty
PhD Student, Computational Modeling & Simulation
Center for Simulation and Modeling
Department of Chemical & Petroleum Engineering
University of Pittsburgh
Pittsburgh, PA - 15261
Office : 920 Benedum Hall
