[CP2K-user] cp2k on 10 GbE
Anton Kudelin
archm... at gmail.com
Thu Nov 29 10:51:09 UTC 2018
Dear Peter,
Please specify your hardware and attach the input/output files you are
testing.
I also recommend disabling hyperthreading completely at the BIOS level.
There are at least two reasons to do so: 1) cp2k, like many other HPC
programs, does not benefit from this technology; 2) as shown by a recent
study <http://ia.cr/2018/1060>, HT is not safe on multiuser systems such as
clusters and servers, which I guess is what your system is.
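
To confirm whether hyperthreading is actually off after changing the BIOS
setting, something like the following could be used. This is only a sketch
and assumes a Linux node that exposes the usual sysfs CPU topology files;
the helper names parse_cpu_list and smt_enabled are made up for
illustration.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a Linux CPU list such as '0,32' or '0-1' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def smt_enabled(sibling_text):
    """True if a core lists more than one hardware thread as siblings."""
    return len(parse_cpu_list(sibling_text)) > 1


if __name__ == "__main__":
    # Assumption: standard Linux sysfs layout; other systems differ.
    path = Path("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list")
    if path.exists():
        print("Hyperthreading active on cpu0:", smt_enabled(path.read_text()))
    else:
        print("Topology file not found; check the BIOS setting directly.")
```

If cpu0 reports a single sibling, each physical core exposes only one
hardware thread to the operating system.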
Best wishes,
Anton K.
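
For reference, the timings Peter quotes below can be put side by side with
a few lines of arithmetic. This is only a sketch: the numbers are the
approximate per-step times from his message, and the variable and function
names are invented for illustration.

```python
# Approximate seconds per wavefunction optimisation step, as quoted in the
# thread (V2O5 bulk, 168 atoms, PBE, DZ basis set).
SINGLE_NODE_OPENMPI_8X8 = 7.0    # 64 cores, one hyper-threaded node
MULTI_NODE_MPICH_16X8 = 130.0    # 128 cores across nodes, MPICH 3.3
MULTI_NODE_OPENMPI_16X8 = 200.0  # 128 cores across nodes, OpenMPI 3.1


def ratio(slower_s, faster_s):
    """How many times longer the slower run takes per step."""
    return slower_s / faster_s


mpich_vs_single = ratio(MULTI_NODE_MPICH_16X8, SINGLE_NODE_OPENMPI_8X8)
openmpi_vs_single = ratio(MULTI_NODE_OPENMPI_16X8, SINGLE_NODE_OPENMPI_8X8)
openmpi_vs_mpich = ratio(MULTI_NODE_OPENMPI_16X8, MULTI_NODE_MPICH_16X8)

print(f"MPICH multi-node vs single node:   {mpich_vs_single:.1f}x slower")
print(f"OpenMPI multi-node vs single node: {openmpi_vs_single:.1f}x slower")
print(f"OpenMPI vs MPICH (both multi-node): {openmpi_vs_mpich:.2f}x slower")
```

So MPICH is roughly 1.5 times faster than OpenMPI across nodes, yet both
multi-node runs remain well over an order of magnitude slower per step than
the single-node run, despite using twice as many cores.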
On Thursday, 29 November 2018 13:20:43 UTC+3, Peter Kraus wrote:
>
> Dear Anton,
>
> thanks for the suggestion. MPICH 3.3 seems quicker than OpenMPI 3.1, as on
> 16 MPI instances with 8 OpenMP threads each (128 cores total), it takes
> ~130 s per wavefunction optimisation step, while OpenMPI takes ~200 s.
> However, with OpenMPI running with 8x8 parallelisation (64 cores, fits into
> one of my hyper-threaded nodes), I get ~7 s per step, so the MPI penalty is
> still ridiculous. This is for a V2O5 bulk system with 168 atoms, PBE and DZ
> basis set.
>
> Best,
> Peter
>
> On Wednesday, 28 November 2018 13:17:27 UTC+1, Anton Kudelin wrote:
>>
>> Try to employ MPICH or its derivatives (MVAPICH) configured with
>> --with-device=ch3:nemesis
>>
>> On Wednesday, 28 November 2018 14:35:04 UTC+3, Peter Kraus wrote:
>>>
>>> Dear Mike,
>>>
>>> I have tried to use CP2K on our cluster with nodes connected using 10
>>> GbE, and all I see is a very significant slowdown. This was using
>>> gcc-8.2.0, openmpi-3.1.1 and OpenBLAS/fftw/scalapack compiled using the two
>>> with OpenMP enabled where possible. I've resorted to submitting "SMP"-like
>>> jobs (by selecting the smp parallel environment, but parallelising using
>>> both MPI and OpenMP).
>>>
>>> If you figure out how to squeeze extra performance from the 10GbE,
>>> please let me know.
>>>
>>> Best,
>>> Peter
>>>
>>> On Monday, 12 November 2018 18:01:48 UTC+1, Mike Ruggiero wrote:
>>>>
>>>> Hello cp2k community - I have recently set up a small computing cluster,
>>>> with 20-24 core server nodes linked via 10 GbE connections. While scaling
>>>> on single nodes is as it should be (i.e., nearly linear), I get
>>>> little to no speed-up when performing multiple-node simulations. After
>>>> digging around, it seems that this is relatively well known for cp2k, but
>>>> I'm curious if anyone has had any success on using cp2k over 10 GbE
>>>> connections. Any advice would be greatly appreciated!
>>>>
>>>> Best,
>>>> Michael Ruggiero
>>>>
>>>