How to use a GPU to accelerate calculations

Ole Schütt o... at schuett.name
Fri Nov 20 12:58:30 UTC 2015


Hi Jianfeng,

Try replacing -D__DBCSR_CUDA with -D__DBCSR_ACC in your DFLAGS. That might 
just do the trick.
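
A minimal sketch of that substitution in Python, applied to the DFLAGS line 
quoted below. The helper name `swap_dbcsr_flag` is made up for this 
illustration; in practice you would just edit the flag in your arch file by 
hand and recompile.

```python
# Illustrative only: swap_dbcsr_flag is a hypothetical helper, not part of CP2K.
def swap_dbcsr_flag(dflags: str) -> str:
    """Replace the token -D__DBCSR_CUDA with -D__DBCSR_ACC, leaving other flags alone."""
    return " ".join(
        "-D__DBCSR_ACC" if flag == "-D__DBCSR_CUDA" else flag
        for flag in dflags.split()
    )

# DFLAGS as given in the quoted message.
old_dflags = (
    "-D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK "
    "-D__FFTW3 -D__FFTMKL -D__LIBINT -D__LIBXC2 -D__ACC -D__CUDAPW "
    "-D__DBCSR_CUDA -D__LIBINT_MAX_AM=6 -D__LIBDERIV_MAX_AM1=5"
)
new_dflags = swap_dbcsr_flag(old_dflags)
print("DFLAGS   =", new_dflags)
```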

-Ole

On Friday, 20 November 2015 13:04:25 UTC+1, jjf... at yahoo.com.cn wrote:
>
> Dear all,
> I have compiled my CP2K with:
> DFLAGS   = -D__INTEL -D__FFTSG  -D__parallel -D__BLACS -D__SCALAPACK 
> -D__FFTW3 -D__FFTMKL  -D__LIBINT -D__LIBXC2 -D__ACC -D__CUDAPW 
> -D__DBCSR_CUDA  -D__LIBINT_MAX_AM=6  -D__LIBDERIV_MAX_AM1=5
>
> I have revised generate.py as follows:
>     triples  = combinations(23)                 # blocked H2O (benchmark)
>     triples += combinations(6)                  # idem min basis
>     triples += combinations(14,16,29)           # RPA water
>     triples += combinations(5, 32, 13, 24, 26)
>     triples += combinations(9, 32, 22)
>     triples += combinations(32)
>     triples += combinations(64)
>     triples += combinations(78)
>     triples += combinations(16,29,55)
>     triples += combinations(13,32,13)
>     triples += combinations(26,32,13)
>     triples += combinations(13,32,26)
>     triples += combinations(26,32,26)
>     triples += combinations(13,32,9)
>     triples += combinations(9,32,13)
>     triples += combinations(26,32,9)
>     triples += combinations(9,32,26)
>
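
To check which (m, n, k) block sizes such a list covers, here is a small 
sketch. It assumes that `combinations(*sizes)` expands to every (m, n, k) 
triple with each entry drawn from `sizes`; that definition is an assumption 
made for this example, so compare it with the one in your own generate.py. 
The three sizes tested at the end are only sample values.

```python
from itertools import product

def combinations(*sizes):
    # Assumed behaviour: all (m, n, k) triples with m, n, k taken from sizes.
    return list(product(sizes, repeat=3))

# The triples list from the quoted generate.py.
triples  = combinations(23)                 # blocked H2O (benchmark)
triples += combinations(6)                  # idem min basis
triples += combinations(14, 16, 29)         # RPA water
triples += combinations(5, 32, 13, 24, 26)
triples += combinations(9, 32, 22)
triples += combinations(32)
triples += combinations(64)
triples += combinations(78)
triples += combinations(16, 29, 55)
triples += combinations(13, 32, 13)
triples += combinations(26, 32, 13)
triples += combinations(13, 32, 26)
triples += combinations(26, 32, 26)
triples += combinations(13, 32, 9)
triples += combinations(9, 32, 13)
triples += combinations(26, 32, 9)
triples += combinations(9, 32, 26)

covered = set(triples)  # duplicates across the calls collapse here

# Sample block sizes to look up in the list.
for mnk in [(13, 32, 26), (9, 21, 9), (256, 256, 256)]:
    status = "in list" if mnk in covered else "not in list"
    print(mnk, status)
```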
> However, I don't see any acceleration from my GPU (K20m). The DBCSR 
> STATISTICS were as follows:
>
>  COUNTER                                      CPU                  ACC      ACC%
>  number of processed stacks                388804                    0       0.0
>  matmuls inhomo. stacks                  12328621                    0       0.0
>  matmuls total                          242005257                    0       0.0
>  flops   9 x   21 x    9               1073834496                    0       0.0
>  flops 224 x  224 x  224               1753350144                    0       0.0
>  flops 224 x  224 x  245               1917726720                    0       0.0
>  flops 224 x  245 x  224               1917726720                    0       0.0
>  flops 245 x  224 x  224               1917726720                    0       0.0
>  flops 224 x  245 x  245               2097513600                    0       0.0
>  flops 245 x  224 x  245               2097513600                    0       0.0
>  flops 245 x  245 x  224               2097513600                    0       0.0
>  flops 245 x  245 x  245               2294155500                    0       0.0
>  flops   9 x    9 x  213               2556480528                    0       0.0
>  flops  26 x   21 x    9               3102188544                    0       0.0
>  flops   9 x   21 x   26               3102188544                    0       0.0
>  flops 224 x  224 x  256               4007657472                    0       0.0
>  flops 224 x  256 x  224               4007657472                    0       0.0
>  flops 256 x  224 x  224               4007657472                    0       0.0
>  flops 224 x  245 x  256               4383375360                    0       0.0
>  flops 224 x  256 x  245               4383375360                    0       0.0
>  flops 245 x  224 x  256               4383375360                    0       0.0
>  flops 245 x  256 x  224               4383375360                    0       0.0
>  flops 256 x  224 x  245               4383375360                    0       0.0
>  flops 256 x  245 x  224               4383375360                    0       0.0
>  flops   9 x   21 x   13               4737435066                    0       0.0
>  flops  13 x   21 x    9               4737435066                    0       0.0
>  flops 245 x  245 x  256               4794316800                    0       0.0
>  flops 245 x  256 x  245               4794316800                    0       0.0
>  flops 256 x  245 x  245               4794316800                    0       0.0
>  flops  26 x    9 x  213               7222105800                    0       0.0
>  flops   9 x   26 x  213               7247226168                    0       0.0
>  flops  26 x   21 x   26               8961878016                    0       0.0
>  flops 256 x  256 x  224               9160359936                    0       0.0
>  flops 224 x  256 x  256               9160359936                    0       0.0
>  flops 256 x  224 x  256               9160359936                    0       0.0
>  flops   9 x    9 x  256               9217732608                    0       0.0
>  flops 929 x  224 x  224               9882062848                    0       0.0
>  flops 256 x  256 x  245              10019143680                    0       0.0
>  flops 245 x  256 x  256              10019143680                    0       0.0
>  flops 256 x  245 x  256              10019143680                    0       0.0
>  flops 929 x  224 x  245              10808506240                    0       0.0
>  flops 929 x  245 x  224              10808506240                    0       0.0
>  flops   9 x   13 x  213              11030981598                    0       0.0
>  flops  13 x    9 x  213              11065522104                    0       0.0
>  flops 929 x  245 x  245              11821803700                    0       0.0
>  flops 224 x  224 x  929              11933057024                    0       0.0
>  flops 224 x  245 x  929              13051781120                    0       0.0
>  flops 245 x  224 x  929              13051781120                    0       0.0
>  flops  26 x   21 x   13              13997099844                    0       0.0
>  flops  13 x   21 x   26              13997099844                    0       0.0
>  flops 245 x  245 x  929              14275385600                    0       0.0
>  flops  13 x   21 x   13              14415243024                    0       0.0
>  flops 256 x  256 x  256              20937965568                    0       0.0
>  flops  26 x   26 x  213              21335565888                    0       0.0
>  flops 929 x  224 x  256              22587572224                    0       0.0
>  flops 929 x  256 x  224              22587572224                    0       0.0
>  flops 929 x  245 x  256              24705157120                    0       0.0
>  flops 929 x  256 x  245              24705157120                    0       0.0
>  flops  26 x    9 x  256              26040268800                    0       0.0
>  flops   9 x   26 x  256              26130843648                    0       0.0
>  flops 224 x  256 x  929              27275558912                    0       0.0
>  flops 256 x  224 x  929              27275558912                    0       0.0
>  flops 924 x  224 x  224              29486628864                    0       0.0
>  flops 245 x  256 x  929              29832642560                    0       0.0
>  flops 256 x  245 x  929              29832642560                    0       0.0
>  flops 924 x  224 x  245              32251000320                    0       0.0
>  flops 924 x  245 x  224              32251000320                    0       0.0
>  flops  13 x   26 x  213              32611122180                    0       0.0
>  flops  26 x   13 x  213              32674620888                    0       0.0
>  flops  13 x   13 x  213              33962737536                    0       0.0
>  flops 924 x  245 x  245              35274531600                    0       0.0
>  flops 224 x  224 x  924              35606495232                    0       0.0
>  flops 224 x  245 x  924              38944604160                    0       0.0
>  flops 245 x  224 x  924              38944604160                    0       0.0
>  flops   9 x   13 x  256              39773680128                    0       0.0
>  flops  13 x    9 x  256              39898220544                    0       0.0
>  flops 245 x  245 x  924              42595660800                    0       0.0
>  flops 929 x  224 x  929              47943653632                    0       0.0
>  flops   9 x   32 x    9              49089576960                    0       0.0
>  flops 929 x  256 x  256              51628736512                    0       0.0
>  flops 929 x  245 x  929              52438371160                    0       0.0
>  flops 256 x  256 x  929              62344134656                    0       0.0
>  flops 924 x  256 x  224              67398008832                    0       0.0
>  flops 924 x  224 x  256              67398008832                    0       0.0
>  flops 924 x  256 x  245              73716572160                    0       0.0
>  flops 924 x  245 x  256              73716572160                    0       0.0
>  flops  26 x   26 x  256              76928237568                    0       0.0
>  flops 224 x  256 x  924              81386274816                    0       0.0
>  flops 256 x  224 x  924              81386274816                    0       0.0
>  flops 245 x  256 x  924              89016238080                    0       0.0
>  flops 256 x  245 x  924              89016238080                    0       0.0
>  flops 929 x  256 x  929             109585494016                    0       0.0
>  flops  13 x   26 x  256             117583764480                    0       0.0
>  flops  26 x   13 x  256             117812717568                    0       0.0
>  flops  13 x   13 x  256             122457194496                    0       0.0
>  flops  26 x   32 x    9             141814333440                    0       0.0
>  flops   9 x   32 x   26             141814333440                    0       0.0
>  flops 924 x  224 x  929             143056843776                    0       0.0
>  flops 929 x  224 x  924             143056843776                    0       0.0
>  flops 924 x  256 x  256             154052591616                    0       0.0
>  flops 924 x  245 x  929             156468422880                    0       0.0
>  flops 929 x  245 x  924             156468422880                    0       0.0
>  flops 256 x  256 x  924             186025771008                    0       0.0
>  flops   9 x   32 x   13             216568460160                    0       0.0
>  flops  13 x   32 x    9             216568460160                    0       0.0
>  flops 924 x  256 x  929             326987071488                    0       0.0
>  flops 929 x  256 x  924             326987071488                    0       0.0
>  flops  26 x   32 x   26             409685852160                    0       0.0
>  flops 924 x  224 x  924             426860679168                    0       0.0
>  flops 924 x  245 x  924             466878867840                    0       0.0
>  flops  26 x   32 x   13             639867421440                    0       0.0
>  flops  13 x   32 x   26             639867421440                    0       0.0
>  flops  13 x   32 x   13             658982538240                    0       0.0
>  flops 924 x  256 x  924             975681552384                    0       0.0
>  flops total                        9705794368534                    0       0.0
>  marketing flops                   10521420653440
>
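
The ACC share can also be read off such a table programmatically. A rough 
sketch, assuming each row sits on one line and ends in the three columns 
CPU, ACC and ACC%, and that the share is the ACC value divided by the sum 
CPU + ACC (both are assumptions of this example, not a statement about how 
DBCSR computes it). `parse_row` is a hypothetical helper, and the sample rows 
are copied from the statistics above.

```python
sample = """\
 flops  13 x   32 x   13             658982538240                    0       0.0
 flops 924 x  256 x  924             975681552384                    0       0.0
 flops total                        9705794368534                    0       0.0
"""

def parse_row(line):
    """Split a statistics row into (label, cpu, acc), ignoring the ACC% column."""
    label, cpu, acc, _pct = line.rsplit(None, 3)
    return label.strip(), int(cpu), int(acc)

rows = [parse_row(line) for line in sample.splitlines() if line.strip()]

for label, cpu, acc in rows:
    total = cpu + acc
    share = 100.0 * acc / total if total else 0.0
    print(f"{label}: {share:.1f}% of flops on the accelerator")
```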
> My input file:
> &GLOBAL
>   PRINT_LEVEL  LOW
>   PROJECT_NAME 1
>   RUN_TYPE  ENERGY
> &END GLOBAL
> &MOTION
>   &GEO_OPT
>     OPTIMIZER  BFGS
>     STEP_START_VAL  1
>   &END GEO_OPT
> &END MOTION
> &FORCE_EVAL
>   METHOD  QS
>   STRESS_TENSOR  ANALYTICAL
>   &DFT
>     &SCF
>       MAX_SCF  50
>       EPS_SCF   1.0E-7
>       &OT  T
>         MINIMIZER  DIIS
>         ENERGY_GAP  0.002
>         ALGORITHM   IRAC
>         PRECONDITIONER  FULL_ALL
>       &END OT
>       &OUTER_SCF
>         EPS_SCF 1.0E-7
>         MAX_SCF 40
>         STEP_SIZE 0.1
>         EXTRAPOLATION_ORDER 4
>       &END OUTER_SCF
>     &END SCF
>     &QS
>       METHOD  GPW
>     &END QS
>     &MGRID
>       CUTOFF   400
>     &END MGRID
>     &XC
>       &XC_FUNCTIONAL  NO_SHORTCUT
>         &PBE  T
>         &END PBE
>       &END XC_FUNCTIONAL
>     &END XC
>     &POISSON
>       periodic xyz
>       poisson_solver periodic
>     &END POISSON
>   &END DFT
>   &SUBSYS
>     &CELL
>       A 16.707 0.00 0.00
>       B 0.00 15.580 0.00
>       C 0.00 0.00 30.
>       PERIODIC  XYZ
>       MULTIPLE_UNIT_CELL  1 1 1
>     &END CELL
>     &COORD
>     &END COORD
>
> Can anyone help?
>
>
> Jianfeng Jia