[CP2K-user] CUDA RUNTIME API error: EventRecord failed with error cudaErrorInvalidResourceHandle

Alfio Lazzaro alfio.... at gmail.com
Fri Feb 5 09:43:47 UTC 2021


Well, what I need is the top (let's say up to "SCF WAVEFUNCTION 
OPTIMIZATION") and the bottom of the logs (starting at "DBCSR STATISTICS").
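
If the full output is too large to attach, the two parts can be cut out with a short script. The following is only a minimal sketch: the two marker strings are the ones named above, while the function name `trim_cp2k_log`, the script name, and the file names in the usage line are hypothetical. It assumes the log fits in memory.

```python
import sys


def trim_cp2k_log(lines,
                  top_marker="SCF WAVEFUNCTION OPTIMIZATION",
                  bottom_marker="DBCSR STATISTICS"):
    """Split a CP2K log into the two parts that are worth posting.

    head: everything up to and including the first line containing top_marker
    tail: everything from the last line containing bottom_marker to the end
    A part is returned as an empty list if its marker is not found.
    """
    lines = list(lines)

    # First occurrence of the top marker (the first SCF cycle).
    top_end = next((i for i, line in enumerate(lines) if top_marker in line), None)

    # Last occurrence of the bottom marker (the final statistics block).
    bottom_start = None
    for i, line in enumerate(lines):
        if bottom_marker in line:
            bottom_start = i

    head = lines[:top_end + 1] if top_end is not None else []
    tail = lines[bottom_start:] if bottom_start is not None else []
    return head, tail


if __name__ == "__main__" and len(sys.argv) > 1:
    # errors="replace" keeps the script going if the log contains odd bytes.
    with open(sys.argv[1], errors="replace") as handle:
        head, tail = trim_cp2k_log(handle)
    sys.stdout.writelines(head)
    sys.stdout.write("\n[... middle of the log omitted ...]\n\n")
    sys.stdout.writelines(tail)
```

Assuming the script is saved as trim_cp2k_log.py and the run wrote its output to SiC.out (both names are placeholders), it would be called as `python trim_cp2k_log.py SiC.out > SiC_trimmed.out`.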

On Friday, 5 February 2021 at 09:24:34 UTC+1, singlebook wrote:

> Hello Alfio,
>
> Yes, there are 12 MPI ranks, and each rank has only one thread.
> The output file is too large to upload, so I am posting only the header
> of the CPU-version output here; the GPU output files have not been saved
> for now. Whenever the workstation is idle, I will run more tests.
>
>  DBCSR| CPU Multiplication driver                                           XSMM
>  DBCSR| Multrec recursion limit                                              512
>  DBCSR| Multiplication stack size                                           1000
>  DBCSR| Maximum elements for images                                    UNLIMITED
>  DBCSR| Multiplicative factor virtual images                                   1
>  DBCSR| Use multiplication densification                                       T
>  DBCSR| Multiplication size stacks                                             3
>  DBCSR| Use memory pool for CPU allocation                                     F
>  DBCSR| Number of 3D layers                                               SINGLE
>  DBCSR| Use MPI memory allocation                                              F
>  DBCSR| Use RMA algorithm                                                      F
>  DBCSR| Use Communication thread                                               T
>  DBCSR| Communication thread load                                             87
>  DBCSR| MPI: My node id                                                        0
>  DBCSR| MPI: Number of nodes                                                  48
>  DBCSR| OMP: Current number of threads                                         1
>  DBCSR| OMP: Max number of threads                                             1
>  DBCSR| Split modifier for TAS multiplication algorithm                  1.0E+00
>
>   **** **** ******  **  PROGRAM STARTED AT               2021-02-04 09:18:01.088
>  ***** ** ***  *** **   PROGRAM STARTED ON                                  k172
>  **    ****   ******    PROGRAM STARTED BY                               chenwei
>  ***** **    ** ** **   PROGRAM PROCESS ID                                 52126
>   **** **  *******  **  PROGRAM STARTED IN /ncsfs02/chenwei/Machine Learning/CP2
>                                            K/SiC
>
>  CP2K| version string:                                          CP2K version 8.1
>  CP2K| source code revision number:                                  git:0b61f2f
>  CP2K| cp2kflags: omp libint fftw3 libxc elpa parallel mpi3 scalapack xsmm plume
>  CP2K|            d2 spglib libvori libbqb
>  CP2K| is freely available from                            https://www.cp2k.org/
>  CP2K| Program compiled at                          Thu Feb  4 08:49:28 CST 2021
>  CP2K| Program compiled on                                                  k172
>  CP2K| Program compiled for                                                local
>  CP2K| Data directory path                       /home/chenwei/src/cp2k-8.1/data
>  CP2K| Input file name                                                   SiC.inp
>
>  GLOBAL| Force Environment number                                              1
>  GLOBAL| Basis set file name                                           BASIS_SET
>  GLOBAL| Potential file name                                      GTH_POTENTIALS
>  GLOBAL| MM Potential file name                                     MM_POTENTIAL
>  GLOBAL| Coordinate file name                                      __STD_INPUT__
>  GLOBAL| Method name                                                        CP2K
>  GLOBAL| Project name                                                   SiC_AIMD
>  GLOBAL| Preferred FFT library                                             FFTW3
>  GLOBAL| Preferred diagonalization lib.                                     ELPA
>  GLOBAL| Run type                                                             MD
>  GLOBAL| All-to-all communication in single precision                          F
>  GLOBAL| FFTs using library dependent lengths                                  F
>  GLOBAL| Global print level                                                  LOW
>  GLOBAL| MPI I/O enabled                                                       T
>  GLOBAL| Total number of message passing processes                            48
>  GLOBAL| Number of threads for this process                                    1
>  GLOBAL| This output is from process                                           0
>  GLOBAL| CPU model name                Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
>  GLOBAL| CPUID                                                              1002
>
>  MEMORY| system memory details [Kb]
>  MEMORY|                        rank 0           min           max       average
>  MEMORY| MemTotal            131748504     131748504     131748504     131748504
>  MEMORY| MemFree              67523260      67523260      67523260      67523260
>  MEMORY| Buffers                  4712          4712          4712          4712
>  MEMORY| Cached               56159648      56159648      56159648      56159648
>  MEMORY| Slab                  2740508       2740508       2740508       2740508
>  MEMORY| SReclaimable          2447544       2447544       2447544       2447544
>  MEMORY| MemLikelyFree       126135164     126135164     126135164     126135164
>
>  GENERATE|  Preliminary Number of Bonds generated:                             0
>  GENERATE|  Achieved consistency in connectivity generation.
>
>  SCF WAVEFUNCTION OPTIMIZATION
>
>   Step     Update method      Time    Convergence         Total energy    Change
>   ------------------------------------------------------------------------------
>      1 NoMix/Diag. 0.40E+00    0.3     3.80220882      -317.7175159821 -3.18E+02
>      2 Broy./Diag. 0.40E+00    0.6     0.43368094      -291.0370906460  2.67E+01
>      3 Broy./Diag. 0.40E+00    0.6     0.23506554      -308.2043627628 -1.72E+01
>      4 Broy./Diag. 0.40E+00    0.6     0.26390650      -309.7756477106 -1.57E+00
>      5 Broy./Diag. 0.40E+00    0.6     0.00311711      -310.0196552337 -2.44E-01
>      6 Broy./Diag. 0.40E+00    0.6     0.01762115      -309.8687051316  1.51E-01
>      7 Broy./Diag. 0.40E+00    0.6     0.00055086      -309.8505587170  1.81E-02
>      8 Broy./Diag. 0.40E+00    0.6     0.00030811      -309.8516271774 -1.07E-03
>      9 Broy./Diag. 0.40E+00    0.6     0.00001506      -309.8519055144 -2.78E-04
>     10 Broy./Diag. 0.40E+00    0.6     0.00000129      -309.8519255844 -2.01E-05
>     11 Broy./Diag. 0.40E+00    0.6     0.00000032      -309.8519300365 -4.45E-06
>     12 Broy./Diag. 0.40E+00    0.6     0.00000002      -309.8519304271 -3.91E-07
>
>   *** SCF run converged in    12 steps ***
>
> Best wishes,
>
> Wei