[CP2K-user] [CP2K:13235] Re: CP2K 7.1-Cuda Bandgap and HF energies different from previous versions

Ole Schütt o... at schuett.name
Mon May 4 16:26:36 UTC 2020


Hi Leopold,

I agree that this is a serious issue that should have been caught by our 
testing.
AFAIK, the performance tests on the dashboard do not check the computed 
results.

However, we do have a daily CUDA regtest: 
https://dashboard.cp2k.org/archive/cuda-pascal/index.html
It uses two OpenMP threads, which might not be enough to exceed the 
tolerance thresholds?
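
To make the tolerance question concrete, here is a minimal sketch of the kind of comparison I have in mind. It uses the total energies Fabian reports below. The relative tolerance of 1e-10 and the helper names are assumptions for illustration only; this is not the actual regtest code or its real threshold.

```python
# Total energies copied from Fabian's table in the quoted message below.
REFERENCE_NO_CUDA = -1976.39722899739672
CUDA_TOTAL_ENERGY = {
    1: -1976.39722899739013,  # OMP_NUM_THREADS = 1
    3: -1975.95046919253809,  # OMP_NUM_THREADS = 3
    6: -1975.87080541858177,  # OMP_NUM_THREADS = 6
}

# Hypothetical relative tolerance, chosen for illustration only.
REL_TOLERANCE = 1.0e-10


def relative_deviation(value, reference):
    """Return |value - reference| / |reference|."""
    return abs(value - reference) / abs(reference)


def check_energies(energies, reference, tolerance):
    """Map each thread count to (relative deviation, within tolerance?)."""
    results = {}
    for threads, energy in energies.items():
        deviation = relative_deviation(energy, reference)
        results[threads] = (deviation, deviation <= tolerance)
    return results


results = check_energies(CUDA_TOTAL_ENERGY, REFERENCE_NO_CUDA, REL_TOLERANCE)

if __name__ == "__main__":
    for threads, (deviation, ok) in sorted(results.items()):
        status = "OK" if ok else "WRONG"
        print(f"OMP_NUM_THREADS={threads}: rel. deviation {deviation:.3e} {status}")
```

With these numbers the single-thread run agrees with the reference to within rounding, while the 3- and 6-thread runs are off by a relative deviation of order 1e-4. Whether a 2-thread run deviates enough to trip the regtest is exactly what the data in this thread cannot tell us.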

-Ole

On 2020-05-04 18:12, Leopold Talirz wrote:
> Dear Fabian,
> 
> thanks a lot for checking and for pinning down the issue.
> 
> Since this is a rather serious issue, my first instinct was to check
> on the performance page of cp2k to see whether CUDA + OMP was ever
> used in benchmark studies.
> https://www.cp2k.org/performance
> 
> Unfortunately, this is not clear to me from the page, a problem I now
> remember having run into before:
> E.g. for some systems it says explicitly "no GPU", but for others that
> can have a GPU (like the Cray XC40) it does not say so, and it is not
> clear whether this means the GPU was used or not.
> May I suggest to the maintainer of this page to make this information
> explicit?
> 
> And if it turns out that there are currently no tests including the
> CUDA version on the list, perhaps it would make sense to include some?
> 
> Best wishes from Bern,
> Leopold
> 
> On Monday, 4 May 2020 17:35:08 UTC+2, Fabian Ducry wrote:
> 
>> Dear Andres,
>> 
>> I can confirm and reproduce the issue. It seems to appear when
>> combining CUDA + OMP in hybrid calculations. In that case the energy
>> becomes a function of the number of OMP threads per rank. For your
>> input I got (cp2k 8.0, revision 3e7b916, run on Piz Daint):
>> 
>>                                 no CUDA                OMP_NUM_THREADS = 1    OMP_NUM_THREADS = 3    OMP_NUM_THREADS = 6
>> Exchange-correlation energy:    -433.84964308969535    -433.84964308969302    -435.33426106395467    -435.96513615032325
>> Hartree-Fock Exchange energy:   -127.87395928499694    -127.87395928499325    -125.97109874333140    -125.24809389970088
>> Total energy:                  -1976.39722899739672   -1976.39722899739013   -1975.95046919253809   -1975.87080541858177
>> 
>> Without OMP parallelization the energies agree with the calculation
>> without CUDA acceleration. Increasing OMP_NUM_THREADS beyond 1
>> increases the Hartree-Fock Exchange energy.
>> Apparently you have to disable OMP to obtain correct results. This
>> is obviously not very satisfying and I hope this gets fixed. I see
>> that you used 1 MPI rank with 12 OMP threads per node. Try increasing
>> the number of MPI ranks per node instead. To do so you have to set
>> 
>> export CRAY_CUDA_MPS=1 in the submission script.
>> 
>> I hope this helps.
>> 
>> Best,
>> Fabian


