[CP2K-user] cp2k on 10 GbE

Anton Kudelin archm... at gmail.com
Thu Nov 29 10:51:09 UTC 2018


Dear Peter,

Please specify your hardware and attach the input/output files you're testing.
I also recommend completely disabling hyperthreading at the BIOS level.
There are at least two reasons to do so: 1) cp2k, like many other HPC programs,
gains nothing from this technology; 2) as shown by a recent study
<http://ia.cr/2018/1060>, HT is not safe on multiuser systems such as
clusters and servers, to which your system belongs, I guess.
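
To see whether hyperthreading is currently active on a node, one option is to
look at the "Thread(s) per core" line printed by lscpu. The following is a
minimal sketch, assuming a Linux node where lscpu is available; the helper
names and the SAMPLE text are hypothetical and are not taken from any machine
in this thread.

```python
import re


def threads_per_core(lscpu_text: str) -> int:
    """Return the 'Thread(s) per core' value found in lscpu-style output."""
    match = re.search(
        r"^Thread\(s\) per core:\s*(\d+)\s*$", lscpu_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'Thread(s) per core' line found")
    return int(match.group(1))


def smt_enabled(lscpu_text: str) -> bool:
    """More than one hardware thread per core means SMT/HT is active."""
    return threads_per_core(lscpu_text) > 1


# Hypothetical lscpu excerpt, for illustration only.
SAMPLE = """\
CPU(s):              64
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):           2
"""

if __name__ == "__main__":
    print("SMT enabled:", smt_enabled(SAMPLE))
```

On a real node the text could come from
subprocess.run(["lscpu"], capture_output=True, text=True).stdout instead of
SAMPLE.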

Best wishes,
Anton K.

On Thursday, 29 November 2018 13:20:43 UTC+3, Peter Kraus wrote:
>
> Dear Anton,
>
> Thanks for the suggestion. MPICH 3.3 seems quicker than OpenMPI 3.1: with 
> 16 MPI instances of 8 OpenMP threads each (128 cores total), it takes 
> ~130 s per wavefunction optimisation step, while OpenMPI takes ~200 s. 
> However, with OpenMPI running with 8x8 parallelisation (64 cores, which fits 
> into one of my hyper-threaded nodes), I get ~7 s per step, so the MPI penalty 
> is still ridiculous. This is for a V2O5 bulk system with 168 atoms, PBE and a 
> DZ basis set.
>
> Best,
> Peter
>
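
The timings in Peter's message above can be turned into a slowdown factor and
a parallel efficiency. This is a rough sketch only: it assumes ideal linear
scaling as the baseline and takes the quoted core counts (64 and 128) at face
value, even though the 64 "cores" on the single node include hyper-threads.
The function names are made up for illustration.

```python
def slowdown(t_multi_node: float, t_single_node: float) -> float:
    """How many times slower the multi-node step is than the single-node step."""
    return t_multi_node / t_single_node


def parallel_efficiency(
    t_ref: float, cores_ref: int, t_new: float, cores_new: int
) -> float:
    """Measured speed-up divided by the ideal speed-up (the core-count ratio)."""
    speedup = t_ref / t_new
    ideal = cores_new / cores_ref
    return speedup / ideal


# Seconds per wavefunction optimisation step, as quoted in the message.
T_SINGLE_NODE = 7.0    # OpenMPI, 8 MPI x 8 OpenMP, 64 cores, one node
T_MPICH_MULTI = 130.0  # MPICH 3.3, 16 MPI x 8 OpenMP, 128 cores
T_OMPI_MULTI = 200.0   # OpenMPI 3.1, 16 MPI x 8 OpenMP, 128 cores

if __name__ == "__main__":
    for name, t in (("MPICH", T_MPICH_MULTI), ("OpenMPI", T_OMPI_MULTI)):
        s = slowdown(t, T_SINGLE_NODE)
        e = parallel_efficiency(T_SINGLE_NODE, 64, t, 128)
        print(f"{name}: {s:.1f}x slower than one node, efficiency {e:.1%}")
```

With these numbers, doubling the core count across nodes makes each step
roughly 19 to 29 times slower than the single-node run, which corresponds to a
parallel efficiency of only a few percent.
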
> On Wednesday, 28 November 2018 13:17:27 UTC+1, Anton Kudelin wrote:
>>
>> Try MPICH or one of its derivatives (e.g. MVAPICH), configured with 
>> --with-device=ch3:nemesis
>>
>> On Wednesday, 28 November 2018 14:35:04 UTC+3, Peter Kraus wrote:
>>>
>>> Dear Mike,
>>>
>>> I have tried to use CP2K on our cluster, whose nodes are connected by 10 
>>> GbE, and all I see is a very significant slowdown. This was with 
>>> gcc-8.2.0 and openmpi-3.1.1, and with OpenBLAS/fftw/scalapack compiled 
>>> using those two, with OpenMP enabled where possible. I've resorted to 
>>> submitting "SMP"-like jobs (selecting the smp parallel environment, but 
>>> parallelising with both MPI and OpenMP). 
>>>
>>> If you figure out how to squeeze extra performance out of 10 GbE, 
>>> please let me know.
>>>
>>> Best,
>>> Peter
>>>
>>> On Monday, 12 November 2018 18:01:48 UTC+1, Mike Ruggiero wrote:
>>>>
>>>> Hello cp2k community - I have recently set up a small computing cluster 
>>>> of 20-24 core server nodes linked via 10 GbE connections. While scaling 
>>>> on a single node is as it should be (i.e., nearly linear), I get 
>>>> little to no speed-up when running multi-node simulations. After 
>>>> digging around, it seems that this is relatively well known for cp2k, but 
>>>> I'm curious whether anyone has had any success running cp2k over 10 GbE 
>>>> connections. Any advice would be greatly appreciated! 
>>>>
>>>> Best,
>>>> Michael Ruggiero  
>>>>
>>>