[CP2K-user] [CP2K:11823] Re: CP2K performance on GPUs
CNelson
chri... at gmail.com
Mon Jun 3 10:50:40 UTC 2019
Thanks for that Tiziano,
I'll give it a go later today.
kind regards,
Chris
On Friday, 31 May 2019 09:24:51 UTC+1, Tiziano Müller wrote:
> Hi Chris,
>
> an arch/ file for CP2K with P100 GPUs can be found as part of the
> regtester output from Piz Daint here:
>
>
> https://www.cp2k.org/static/regtest/trunk/cscs-daint-xc50_gpu/CRAY_XC50-gfortran_gpu.psmp.out
>
> Those outputs are usually available from here:
>
> https://dashboard.cp2k.org/
>
>
> (click the link in the Status column)
>
> Best regards,
> Tiziano
>
>
> On 30.05.19 at 10:15, CNelson wrote:
> > Hi Both,
> > would it be possible to get a copy of the ARCH file you used to build
> > CP2K with the new V100 GPUs?
> > cheers,
> > Chris.
> >
> > On Sunday, 4 November 2018 21:06:12 UTC, Alfio Lazzaro wrote:
> >
> > OK, the best way is if you can attach the arch file, the input file,
> > and the output that you got from CP2K.
> > The only GPU-accelerated part of CP2K is DBCSR, but it may be that you
> > are bound by something else.
> >
> > I agree with you that the reoptimization is not that important at
> > this stage...
> >
> > Alfio
> >
> >
> > On Sunday, 4 November 2018 19:13:02 UTC+1, fo... at gmail.com
> > wrote:
> >
> > Thanks Alfio for the response.
> >
> > Yes. 8 V100 GPUs is extreme. The test I used takes around
> > 500 seconds on a system with Intel SKL G-6148, 40 cores (20
> > cores/socket). Do you think this test is not large enough to run
> > on GPUs? If so, can you recommend a test from the CP2K tests
> > folder?
> >
> > I also tried runs with 1 and 2 V100 GPUs. Both were
> > slower than the 8 V100 GPU run.
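A comparison between runs on 1, 2 and 8 GPUs is easier to read as speedup and parallel efficiency relative to the smallest run. The following is a minimal sketch: the function name `scaling_table` is hypothetical, and the wall times must come from your own CP2K runs (none are supplied in this thread).

```python
def scaling_table(timings):
    """Compute speedup and parallel efficiency from measured wall times.

    timings: dict mapping rank (or GPU) count -> wall time in seconds.
    The smallest count in the dict is used as the baseline.
    Returns a list of (count, speedup, efficiency) tuples sorted by count.
    """
    if not timings:
        raise ValueError("timings must not be empty")
    base_count = min(timings)
    base_time = timings[base_count]
    rows = []
    for count in sorted(timings):
        speedup = base_time / timings[count]
        # Ideal speedup equals the ratio of resources used to the baseline.
        efficiency = speedup / (count / base_count)
        rows.append((count, speedup, efficiency))
    return rows
```

An efficiency well below 1.0 when going from 1 to 8 ranks suggests that the run is limited by something other than the GPU-accelerated part.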
> >
> > CP2K was able to recognize all the 8 gpus, as per "DBCSR| ACC:
> > Number of devices/node".
> >
> > I tried reoptimizing the kernels for the V100, but could not
> > determine which block size values have to be passed to the tune.py
> > script.
> >
> > Although CP2K-6.1 already has optimized kernel parameters for the P100,
> > even a 2xP100 GPU run was slower than the CPU-only benchmark.
> >
> > On Sunday, November 4, 2018 at 2:33:11 PM UTC+5:30, Alfio
> > Lazzaro wrote:
> >
> > You may take a look at this issue on
> > github: https://github.com/cp2k/cp2k/issues/73
> >
> > In your particular case, your setup of 8 V100 is pretty
> > extreme and it would require a large computation. Which test
> > are you using for benchmarking?
> >
> > Then, your setup of 8 ranks + 5 threads should be OK. CP2K
> > attaches ranks to GPUs in a round-robin manner, so in
> > your case there is one rank talking to each GPU.
> > We don't have much experience with multi-GPU nodes, hence I
> > would suggest doing a scalability test by running 1 rank,
> > 2 ranks, ... 8 ranks (always 5 threads) to check how the
> > performance scales. BTW, make sure CP2K is able to recognize
> > 8 GPUs by checking the following output at the beginning:
> >
> > DBCSR| ACC: Number of devices/node
> > 1
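The round-robin attachment and the device-count check described above can be sketched as follows. This is an illustration only: `rank % n_devices` is an assumed form of "round-robin" and is not taken from the CP2K source, and both function names are hypothetical. The regular expression targets only the `DBCSR| ACC: Number of devices/node` line quoted in this thread.

```python
import re


def device_for_rank(rank, n_devices):
    """Assumed round-robin scheme: rank r uses device r mod n_devices."""
    if n_devices < 1:
        raise ValueError("n_devices must be >= 1")
    return rank % n_devices


# Matches the banner line quoted in the thread, followed by an integer.
_DEVICES_RE = re.compile(r"DBCSR\|\s*ACC:\s*Number of devices/node\s+(\d+)")


def devices_per_node(cp2k_output):
    """Return the device count reported in CP2K output text, or None if absent."""
    match = _DEVICES_RE.search(cp2k_output)
    return int(match.group(1)) if match else None
```

With 8 ranks and 8 devices, `[device_for_rank(r, 8) for r in range(8)]` gives every rank its own device. If `devices_per_node` returns fewer devices than expected for your output file, several ranks would end up sharing a GPU under this scheme.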
> >
> > Optionally, you might consider reoptimizing the kernels for
> > the V100, but this is not a priority...
> >
> > Alfio
> >
> >
> >
> > On Saturday, 3 November 2018 07:55:09 UTC+1,
> > fo... at gmail.com wrote:
> >
> > Hi,
> >
> > How is the CP2K performance on GPUs in general?
> >
> > I'm getting very low performance on GPUs (Nvidia V100
> > SXM2). It is a single-node benchmark with 8 GPUs and
> > dual Intel Skylake Gold 6148 processors.
> >
> > The CP2K time on 8 GPUs (CP2K-6.1 psmp version,
> > ifort-2017, CUDA-9.2, 8 MPI ranks + 5 threads per rank)
> > is still slower than the CP2K time of the CPU-only benchmark.
> >
> > For CPU runs, the CP2K-6.1 is built with LIBXSMM-1.8.3.
> >
> > For GPU runs, I have tried both with and without LIBXSMM.
> > There is no performance difference, and in both cases the
> > run is still slower than the CPU-only benchmark even
> > after using all 8 GPUs and all 40 CPU cores. Can
> > someone please share their experience of CP2K
> > performance with GPUs?
> >
> > The CUDA specific DFLAGS used are: -D__ACC -D__DBCSR_ACC
> > -D__PW_CUDA.
> >
>
> --
> Tiziano Müller
> University of Zurich
> Department of Chemistry
> Winterthurerstrasse 190
> CH-8057 Zürich
>
> Tel: +41 44 63 54234
> www.chem.uzh.ch
> tiz... at chem.uzh.ch <javascript:>
>