[CP2K-user] [CP2K:11823] Re: CP2K performance on GPUs

CNelson chri... at gmail.com
Mon Jun 3 10:50:40 UTC 2019


Thanks for that Tiziano,
I'll give it a go later today.
kind regards,
Chris

On Friday, 31 May 2019 09:24:51 UTC+1, Tiziano Müller wrote:

> Hi Chris, 
>
> an arch/ file for CP2K with P100 GPUs can be found as part of the 
> regtester output from Piz Daint here: 
>
>
> https://www.cp2k.org/static/regtest/trunk/cscs-daint-xc50_gpu/CRAY_XC50-gfortran_gpu.psmp.out 
>
> Those outputs are usually available from here: 
>
>   https://dashboard.cp2k.org/ 
>
>
> (click the link in the Status column) 
>
> Best regards, 
> Tiziano 
>
>
> On 30.05.19 at 10:15, CNelson wrote: 
> > Hi Both, 
> > would it be possible to get a copy of the ARCH file you used to build 
> > CP2K with the new V100 GPUs? 
> > cheers, 
> > Chris. 
> > 
> > On Sunday, 4 November 2018 21:06:12 UTC, Alfio Lazzaro wrote: 
> > 
> >     OK, it would be best if you could attach the arch file, the input file, 
> >     and the output that you got from CP2K. 
> >     The only GPU-accelerated part of CP2K is DBCSR, but it may be that you 
> >     are bound by something else. 
> > 
> >     I agree with you that the reoptimization is not that important at 
> >     this stage... 
> > 
> >     Alfio 
> > 
> > 
> >     On Sunday, 4 November 2018 19:13:02 UTC+1, fo... at gmail.com 
> >     wrote: 
> > 
> >         Thanks Alfio for the response. 
> > 
> >         Yes, 8 V100 GPUs is extreme. The test I used takes around 
> >         500 seconds on a system with Intel SKL G-6148, 40 cores (20 
> >         cores/socket). Do you think this test is not large enough to run 
> >         on GPUs? If so, can you recommend a test from the CP2K tests 
> >         folder? 
> > 
> >         I also tried runs with 1 and 2 V100 GPUs. Both were 
> >         slower than the run with 8 V100 GPUs. 
> > 
> >         CP2K was able to recognize all the 8 gpus, as per "DBCSR| ACC: 
> >         Number of devices/node". 
> > 
> >         I tried reoptimizing the kernels for the V100, but could not 
> >         determine which block-size values have to be passed to the tune.py 
> >         script. 
> > 
> >         CP2K-6.1 already has optimized kernel parameters for the P100, 
> >         yet even a 2xP100 GPU run was slower than the CPU-only benchmark. 
> > 
> >         On Sunday, November 4, 2018 at 2:33:11 PM UTC+5:30, Alfio 
> >         Lazzaro wrote: 
> > 
> >             You may take a look at this issue on 
> >             github: https://github.com/cp2k/cp2k/issues/73 
> > 
> >             In your particular case, your setup of 8 V100 is pretty 
> >             extreme and it would require a large computation. Which test 
> >             are you using for benchmarking? 
> > 
> >             Then, your setup of 8 ranks + 5 threads should be OK. CP2K 
> >             attaches ranks to GPUs in a round-robin manner, so in 
> >             your case there is one rank talking to each GPU. 
> >             We don't have much experience with multi-GPU nodes, so I 
> >             would suggest doing a scalability test by running 1 rank, 
> >             2 ranks, ... 8 ranks (always 5 threads) to check how the 
> >             performance scales. BTW, make sure CP2K is able to recognize 
> >             8 GPUs by checking the following output at the beginning: 
> > 
> >              DBCSR| ACC: Number of devices/node                              1 
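A minimal Python sketch of the check described above: it looks for the "DBCSR| ACC: Number of devices/node" line in a CP2K output and returns the reported count. The function name and the sample string are illustrative assumptions; only the line label is taken from the thread.

```python
import re

# Label copied from the output line quoted in the thread; the count follows it.
# \s+ also tolerates the line being wrapped, as it is in the archived mail.
DEVICE_LINE = re.compile(r"DBCSR\|\s*ACC:\s*Number of devices/node\s+(\d+)")


def devices_per_node(output_text):
    """Return the device count reported by CP2K, or None if the line is absent."""
    match = DEVICE_LINE.search(output_text)
    return int(match.group(1)) if match else None


# Hypothetical sample text; in practice, read the real CP2K output file instead.
sample = " DBCSR| ACC: Number of devices/node                              1\n"
print(devices_per_node(sample))
```

On an 8-GPU node the reported value would be expected to be 8; a return value of None suggests the output does not come from a GPU-enabled build.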
> > 
> >             Eventually, you might consider reoptimizing the kernels for 
> >             the V100, but this is not a priority... 
> > 
> >             Alfio 
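The round-robin attachment mentioned above can be illustrated with a short Python sketch. It assumes "round-robin" means rank modulo device count; this is an illustration of the idea, not DBCSR's actual code, and both function names are hypothetical.

```python
def device_for_rank(rank, n_devices):
    """Assumed round-robin rule: ranks cycle through the devices in order."""
    return rank % n_devices


def ranks_per_device(n_ranks, n_devices):
    """Group MPI ranks by the device each one would be attached to."""
    mapping = {device: [] for device in range(n_devices)}
    for rank in range(n_ranks):
        mapping[device_for_rank(rank, n_devices)].append(rank)
    return mapping


# 8 ranks on 8 GPUs: one rank per device, as described in the thread.
print(ranks_per_device(8, 8))
# 8 ranks on 2 GPUs: four ranks share each device.
print(ranks_per_device(8, 2))
```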
> > 
> > 
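The suggested scalability test (1 to 8 ranks, always 5 threads) could be prepared with a sketch like the following. It only builds the command lines and does not run anything. The mpirun launcher, the input file name and the output file names are assumptions; substitute whatever launcher (for example srun) and input your site uses.

```python
import shlex


def scaling_commands(rank_counts=tuple(range(1, 9)), threads=5,
                     binary="cp2k.psmp", input_file="benchmark.inp"):
    """Build one launch command per rank count, with a fixed thread count."""
    commands = []
    for n_ranks in rank_counts:
        output_file = f"scaling_{n_ranks}ranks.out"  # hypothetical naming scheme
        argv = ["mpirun", "-np", str(n_ranks),
                binary, "-i", input_file, "-o", output_file]
        commands.append(f"OMP_NUM_THREADS={threads} " + shlex.join(argv))
    return commands


for command in scaling_commands():
    print(command)
```

Comparing the total CP2K time across the resulting outputs shows whether adding ranks (and therefore GPUs) helps for this input.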
> > 
> >             On Saturday, 3 November 2018 07:55:09 UTC+1, 
> >             fo... at gmail.com wrote: 
> > 
> >                 Hi, 
> > 
> >                 How is the CP2K performance on GPUs in general? 
> > 
> >                 I'm getting very low performance on GPUs (Nvidia V100 
> >                 SXM2). It is a single-node benchmark with 8 GPUs and 
> >                 dual Intel Skylake Gold 6148 processors. 
> > 
> >                 The CP2K run on 8 GPUs (CP2K-6.1 psmp version, 
> >                 ifort-2017, CUDA-9.2, 8 MPI ranks + 5 threads per rank) 
> >                 is still slower than the CPU-only benchmark. 
> > 
> >                 For CPU runs, the CP2K-6.1 is built with LIBXSMM-1.8.3. 
> > 
> >                 For GPU runs, I have tried both with and without LIBXSMM. 
> >                 There is no performance difference, and both are 
> >                 still slower than the CPU-only benchmark, even 
> >                 when using all 8 GPUs and all 40 CPU cores. Can 
> >                 someone please share their experience with CP2K 
> >                 performance on GPUs? 
> > 
> >                 The CUDA specific DFLAGS used are: -D__ACC -D__DBCSR_ACC 
> >                 -D__PW_CUDA. 
> > 
>
> -- 
> Tiziano Müller 
> University of Zurich 
> Department of Chemistry 
> Winterthurerstrasse 190 
> CH-8057 Zürich 
>
> Tel: +41 44 63 54234 
> www.chem.uzh.ch 
> tiz... at chem.uzh.ch 
>