[CP2K-user] [CP2K:13235] CP2K 7.1-Cuda Bandgap and HF energies different from previous versions
Thomas Kühne
tku... at gmail.com
Mon May 4 16:27:42 UTC 2020
Dear Leopold,
to the best of my knowledge all other machines except for Piz Daint, where the "no
GPU" comment is present, are not equipped with GPUs, so everything is consistent.
Cheers,
Thomas
> On 04.05.2020 at 18:12, Leopold Talirz <leopol... at gmail.com> wrote:
>
> Dear Fabian,
>
> thanks a lot for checking and for pinning down the issue.
>
> Since this is a rather serious issue, my first instinct was to check on the performance page of cp2k to see whether CUDA + OMP was ever used in benchmark studies.
> https://www.cp2k.org/performance
>
> Unfortunately, this is not clear to me from the page, a problem I now remember having run into before:
> For some systems it explicitly says "no GPU", but for others that can have a GPU (like the Cray XC40) it says nothing, so it is unclear whether the GPU was used or not.
> May I suggest that the maintainer of this page make this information explicit?
>
> And if it turns out that there are currently no tests including the CUDA version on the list, perhaps it would make sense to include some?
>
> Best wishes from Bern,
> Leopold
>
>
>
>
> On Monday, 4 May 2020 17:35:08 UTC+2, Fabian Ducry wrote:
> Dear Andres,
>
> I can confirm and reproduce the issue. It appears when combining CUDA + OMP in hybrid calculations. In that case the energy becomes a function of the number of OMP threads per rank. For your input I got (cp2k 8.0, revision 3e7b916, run on Piz Daint):
>                                no CUDA                 OMP_NUM_THREADS = 1     OMP_NUM_THREADS = 3     OMP_NUM_THREADS = 6
> Exchange-correlation energy:   -433.84964308969535     -433.84964308969302     -435.33426106395467     -435.96513615032325
> Hartree-Fock Exchange energy:  -127.87395928499694     -127.87395928499325     -125.97109874333140     -125.24809389970088
> Total energy:                  -1976.39722899739672    -1976.39722899739013    -1975.95046919253809    -1975.87080541858177
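
As a quick cross-check of the numbers above, the following Python sketch computes how far each CUDA run deviates from the no-CUDA reference. The values are copied from the table; the variable and function names are purely illustrative and are not part of CP2K.

```python
# Energies copied from the table above, in the units printed by CP2K.
# Column order: no CUDA, then CUDA with OMP_NUM_THREADS = 1, 3, 6.
energies = {
    "Exchange-correlation energy": [
        -433.84964308969535, -433.84964308969302,
        -435.33426106395467, -435.96513615032325,
    ],
    "Hartree-Fock Exchange energy": [
        -127.87395928499694, -127.87395928499325,
        -125.97109874333140, -125.24809389970088,
    ],
    "Total energy": [
        -1976.39722899739672, -1976.39722899739013,
        -1975.95046919253809, -1975.87080541858177,
    ],
}
labels = ["OMP=1", "OMP=3", "OMP=6"]


def deviations(values):
    """Return each CUDA result minus the no-CUDA reference (first entry)."""
    reference = values[0]
    return [v - reference for v in values[1:]]


# Hypothetical helper table: energy term -> deviations for 1, 3, 6 threads.
deviation_table = {name: deviations(vals) for name, vals in energies.items()}

for name, devs in deviation_table.items():
    row = "  ".join(f"{label}: {d:+.3e}" for label, d in zip(labels, devs))
    print(f"{name:<30s} {row}")
```

With one OMP thread the deviations stay at the level of numerical noise, while with 3 and 6 threads they grow to several tenths of an energy unit or more.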
>
> Without OMP parallelization the energies agree with the calculation without CUDA acceleration. Increasing OMP_NUM_THREADS beyond 1 increases the Hartree-Fock Exchange energy.
> Apparently you have to disable OMP to obtain correct results. This is obviously not very satisfying and I hope it gets fixed. I see that you used 1 MPI rank with 12 OMP threads per node. Try increasing the number of MPI ranks per node instead. To do so you have to set
> export CRAY_CUDA_MPS=1 in the submission script.
>
> I hope this helps.
>
> Best,
> Fabian
>