<div dir="ltr">Hi Jianfeng,<br><br>Try replacing -D__DBCSR_CUDA with -D__DBCSR_ACC. That might just do the trick.<br><br>-Ole<br><br>On Friday, 20 November 2015 at 13:04:25 UTC+1, jjf...@yahoo.com.cn wrote:<blockquote class="gmail_quote" style="margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;"><div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial"><div>Dear all,<br>I have compiled CP2K with:<br>DFLAGS   = -D__INTEL -D__FFTSG  -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3 -D__FFTMKL  -D__LIBINT -D__LIBXC2 -D__ACC -D__CUDAPW -D__DBCSR_CUDA  -D__LIBINT_MAX_AM=6  -D__LIBDERIV_MAX_AM1=5<br><br>I have revised generate.py as follows:<br>    triples  = combinations(23)              <wbr>   # blocked H2O (benchmark)<br>    triples += combinations(6)               <wbr>   # idem min basis<br>    triples += combinations(14,16,29)        <wbr>   # RPA water<br>    triples += combinations(5, 32, 13, 24, 26)<br>    triples += combinations(9, 32, 22)<br>    triples += combinations(32)<br>    triples += combinations(64)<br>    triples += combinations(78)<br>    triples += combinations(16,29,55)<br>    triples += combinations(13,32,13)<br>    triples += combinations(26,32,13)<br>    triples += combinations(13,32,26)<br>    triples += combinations(26,32,26)<br>    triples += combinations(13,32,9)<br>    triples += combinations(9,32,13)<br>    triples += combinations(26,32,9)<br>    triples += combinations(9,32,26)<br><br>However, I see no acceleration from my GPU (K20m). The DBCSR STATISTICS are as follows:<br><br> COUNTER                      <wbr>                CPU                  ACC      ACC%<br> number of processed stacks                388804                    0       0.0<br> matmuls inhomo. 
stacks                  12328621                    0       0.0<br> matmuls total                          242005257                    0       0.0<br> flops   9 x   21 x    9               1073834496                    0       0.0<br> flops 224 x  224 x  224               1753350144                    0       0.0<br> flops 224 x  224 x  245               1917726720                    0       0.0<br> flops 224 x  245 x  224               1917726720                    0       0.0<br> flops 245 x  224 x  224               1917726720                    0       0.0<br> flops 224 x  245 x  245               2097513600                    0       0.0<br> flops 245 x  224 x  245               2097513600                    0       0.0<br> flops 245 x  245 x  224               2097513600                    0       0.0<br> flops 245 x  245 x  245               2294155500                    0       0.0<br> flops   9 x    9 x  213               2556480528                    0       0.0<br> flops  26 x   21 x    9               3102188544                    0       0.0<br> flops   9 x   21 x   26               3102188544                    0       0.0<br> flops 224 x  224 x  256               4007657472                    0       0.0<br> flops 224 x  256 x  224               4007657472                    0       0.0<br> flops 256 x  224 x  224               4007657472                    0       0.0<br> flops 224 x  245 x  256               4383375360                    0       0.0<br> flops 224 x  256 x  245               4383375360                    0       0.0<br> flops 245 x  224 x  256               4383375360                    0       0.0<br> flops 245 x  256 x  224               4383375360                    0       0.0<br> flops 256 x  224 x  245               4383375360                    0       0.0<br> flops 256 x  245 x  224               4383375360                    0       0.0<br> flops   9 x   21 x   13               4737435066                    0       0.0<br> 
flops  13 x   21 x    9               4737435066                    0       0.0<br> flops 245 x  245 x  256               4794316800                    0       0.0<br> flops 245 x  256 x  245               4794316800                    0       0.0<br> flops 256 x  245 x  245               4794316800                    0       0.0<br> flops  26 x    9 x  213               7222105800                    0       0.0<br> flops   9 x   26 x  213               7247226168                    0       0.0<br> flops  26 x   21 x   26               8961878016                    0       0.0<br> flops 256 x  256 x  224               9160359936                    0       0.0<br> flops 224 x  256 x  256               9160359936                    0       0.0<br> flops 256 x  224 x  256               9160359936                    0       0.0<br> flops   9 x    9 x  256               9217732608                    0       0.0<br> flops 929 x  224 x  224               9882062848                    0       0.0<br> flops 256 x  256 x  245              10019143680                    0       0.0<br> flops 245 x  256 x  256              10019143680                    0       0.0<br> flops 256 x  245 x  256              10019143680                    0       0.0<br> flops 929 x  224 x  245              10808506240                    0       0.0<br> flops 929 x  245 x  224              10808506240                    0       0.0<br> flops   9 x   13 x  213              11030981598                    0       0.0<br> flops  13 x    9 x  213              11065522104                    0       0.0<br> flops 929 x  245 x  245              11821803700                    0       0.0<br> flops 224 x  224 x  929              11933057024                    0       0.0<br> flops 224 x  245 x  929              13051781120                    0       0.0<br> flops 245 x  224 x  929              13051781120                    0       0.0<br> flops  26 x   21 x   13              13997099844                    
0       0.0<br> flops  13 x   21 x   26              13997099844                    0       0.0<br> flops 245 x  245 x  929              14275385600                    0       0.0<br> flops  13 x   21 x   13              14415243024                    0       0.0<br> flops 256 x  256 x  256              20937965568                    0       0.0<br> flops  26 x   26 x  213              21335565888                    0       0.0<br> flops 929 x  224 x  256              22587572224                    0       0.0<br> flops 929 x  256 x  224              22587572224                    0       0.0<br> flops 929 x  245 x  256              24705157120                    0       0.0<br> flops 929 x  256 x  245              24705157120                    0       0.0<br> flops  26 x    9 x  256              26040268800                    0       0.0<br> flops   9 x   26 x  256              26130843648                    0       0.0<br> flops 224 x  256 x  929              27275558912                    0       0.0<br> flops 256 x  224 x  929              27275558912                    0       0.0<br> flops 924 x  224 x  224              29486628864                    0       0.0<br> flops 245 x  256 x  929              29832642560                    0       0.0<br> flops 256 x  245 x  929              29832642560                    0       0.0<br> flops 924 x  224 x  245              32251000320                    0       0.0<br> flops 924 x  245 x  224              32251000320                    0       0.0<br> flops  13 x   26 x  213              32611122180                    0       0.0<br> flops  26 x   13 x  213              32674620888                    0       0.0<br> flops  13 x   13 x  213              33962737536                    0       0.0<br> flops 924 x  245 x  245              35274531600                    0       0.0<br> flops 224 x  224 x  924              35606495232                    0       0.0<br> flops 224 x  245 x  924              38944604160    
                0       0.0<br> flops 245 x  224 x  924              38944604160                    0       0.0<br> flops   9 x   13 x  256              39773680128                    0       0.0<br> flops  13 x    9 x  256              39898220544                    0       0.0<br> flops 245 x  245 x  924              42595660800                    0       0.0<br> flops 929 x  224 x  929              47943653632                    0       0.0<br> flops   9 x   32 x    9              49089576960                    0       0.0<br> flops 929 x  256 x  256              51628736512                    0       0.0<br> flops 929 x  245 x  929              52438371160                    0       0.0<br> flops 256 x  256 x  929              62344134656                    0       0.0<br> flops 924 x  256 x  224              67398008832                    0       0.0<br> flops 924 x  224 x  256              67398008832                    0       0.0<br> flops 924 x  256 x  245              73716572160                    0       0.0<br> flops 924 x  245 x  256              73716572160                    0       0.0<br> flops  26 x   26 x  256              76928237568                    0       0.0<br> flops 224 x  256 x  924              81386274816                    0       0.0<br> flops 256 x  224 x  924              81386274816                    0       0.0<br> flops 245 x  256 x  924              89016238080                    0       0.0<br> flops 256 x  245 x  924              89016238080                    0       0.0<br> flops 929 x  256 x  929             109585494016                  <wbr>  0       0.0<br> flops  13 x   26 x  256             117583764480                  <wbr>  0       0.0<br> flops  26 x   13 x  256             117812717568                  <wbr>  0       0.0<br> flops  13 x   13 x  256             122457194496                  <wbr>  0       0.0<br> flops  26 x   32 x    9             141814333440                  <wbr>  0       0.0<br> flops   9 
x   32 x   26             141814333440                  <wbr>  0       0.0<br> flops 924 x  224 x  929             143056843776                  <wbr>  0       0.0<br> flops 929 x  224 x  924             143056843776                  <wbr>  0       0.0<br> flops 924 x  256 x  256             154052591616                  <wbr>  0       0.0<br> flops 924 x  245 x  929             156468422880                  <wbr>  0       0.0<br> flops 929 x  245 x  924             156468422880                  <wbr>  0       0.0<br> flops 256 x  256 x  924             186025771008                  <wbr>  0       0.0<br> flops   9 x   32 x   13             216568460160                  <wbr>  0       0.0<br> flops  13 x   32 x    9             216568460160                  <wbr>  0       0.0<br> flops 924 x  256 x  929             326987071488                  <wbr>  0       0.0<br> flops 929 x  256 x  924             326987071488                  <wbr>  0       0.0<br> flops  26 x   32 x   26             409685852160                  <wbr>  0       0.0<br> flops 924 x  224 x  924             426860679168                  <wbr>  0       0.0<br> flops 924 x  245 x  924             466878867840                  <wbr>  0       0.0<br> flops  26 x   32 x   13             639867421440                  <wbr>  0       0.0<br> flops  13 x   32 x   26             639867421440                  <wbr>  0       0.0<br> flops  13 x   32 x   13             658982538240                  <wbr>  0       0.0<br> flops 924 x  256 x  924             975681552384                  <wbr>  0       0.0<br> flops total                        9705794368534                 <wbr>   0       0.0<br> marketing flops                   10521420653440<br><br>My input file:<br>&GLOBAL<br>  PRINT_LEVEL  LOW<br>  PROJECT_NAME 1<br>  RUN_TYPE  ENERGY<br>&END GLOBAL<br>&MOTION<br>  &GEO_OPT<br>    OPTIMIZER  BFGS<br>    STEP_START_VAL  1<br>  &END GEO_OPT<br>  &END MOTION<br>&FORCE_EVAL<br>  METHOD  QS<br>  STRESS_TENSOR 
 ANALYTICAL<br>  &DFT<br>    &SCF<br>     MAX_SCF  50<br>     EPS_SCF   1.0E-7<br>    &OT  T<br>     MINIMIZER  DIIS<br>     ENERGY_GAP  0.002<br>     ALGORITHM   IRAC<br>     PRECONDITIONER  FULL_ALL<br>    &END OT<br>    &OUTER_SCF<br>     EPS_SCF 1.0E-7<br>     MAX_SCF 40<br>     STEP_SIZE 0.1<br>     EXTRAPOLATION_ORDER 4<br>    &END OUTER_SCF<br>   &END SCF<br>    &QS<br>      METHOD  GPW<br>    &END QS<br>    &MGRID<br>      CUTOFF   400<br>    &END MGRID<br>    &XC<br>         &XC_FUNCTIONAL  NO_SHORTCUT<br>        &PBE  T<br>        &END PBE<br>      &END XC_FUNCTIONAL<br>    &END XC<br>    &POISSON<br>     periodic xyz<br>     poisson_solver periodic<br>    &END POISSON      <br>  &END DFT<br>  &SUBSYS<br>    &CELL<br>A 16.707 0.00 0.00<br>B 0.00 15.580 0.00<br>C 0.00 0.00 30.<br>   PERIODIC  XYZ<br>   MULTIPLE_UNIT_CELL  1 1 1<br> &END CELL<br> &COORD<br>   &END COORD<br><br>Can anyone help?<br><br><br>Jianfeng Jia<br><br></div></div></blockquote></div>