[CP2K-user] Performance of CP2K for MPT vs IntelMPI

Hans Pabst hf.... at gmail.com
Wed Aug 7 07:18:53 UTC 2019


Hello Chris,

Matthias is right - show it to the admins, since it can also be related to, 
e.g., the setup of the job scheduler. If you want to run an additional 
experiment, you can set I_MPI_HYDRA_BOOTSTRAP=slurm. I guess you are using 
Slurm?
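
A minimal sketch of that experiment, reusing the launch line from your mail 
below (the binary path and the -n 144 -ppn 36 counts are simply taken from 
your example, not something I checked):

export I_MPI_HYDRA_BOOTSTRAP=slurm   # let Hydra bootstrap the ranks via Slurm
mpirun -n 144 -ppn 36 \
  /lustre/home/d167/s1887443/scc/cp2k/exe/broadwell-o2-libs/cp2k.psmp \
  -i H2O-64.inp -o out.txt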

For CP2K with the Intel bits (MPI, MKL, IFORT, but also GFortran), I am 
maintaining a recipe for CP2K 7.x and a step-by-step guide for CP2K 6.1 
<https://xconfigure.readthedocs.io/cp2k/#step-by-step-guide> (still the 
latest release). The performance section 
<https://xconfigure.readthedocs.io/cp2k/#performance> also gives some hints 
for tuning Intel MPI (I_MPI_COLL_INTRANODE=pt2pt, I_MPI_ADJUST_REDUCE=1, 
I_MPI_ADJUST_BCAST=1). It applies equally to InfiniBand and Omni-Path, and 
running CP2K <https://xconfigure.readthedocs.io/cp2k/#running-cp2k> 
with Intel MPI in MPI/OpenMP hybrid mode has its own section 
(I_MPI_PIN_DOMAIN=auto, I_MPI_PIN_ORDER=bunch, OMP_PLACES=threads, 
OMP_PROC_BIND=SPREAD, OMP_NUM_THREADS).
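
Put together, a run with those settings could look like the sketch below. 
This is only an illustration: I am assuming your 4-node/144-core case and, 
purely as an example, 2 OpenMP threads per rank (72 ranks, 18 per node), so 
ranks times threads still fill the 36 cores per node; OMP_NUM_THREADS=1 with 
-n 144 -ppn 36 would match your pure-MPI setup.

# Collective tuning (from the performance section)
export I_MPI_COLL_INTRANODE=pt2pt
export I_MPI_ADJUST_REDUCE=1
export I_MPI_ADJUST_BCAST=1
# MPI/OpenMP hybrid pinning (from the running-CP2K section)
export I_MPI_PIN_DOMAIN=auto
export I_MPI_PIN_ORDER=bunch
export OMP_PLACES=threads
export OMP_PROC_BIND=SPREAD
export OMP_NUM_THREADS=2   # example value only; 1 reproduces your current runs
mpirun -n 72 -ppn 18 \
  /lustre/home/d167/s1887443/scc/cp2k/exe/broadwell-o2-libs/cp2k.psmp \
  -i H2O-64.inp -o out.txt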

Hans


On Friday, August 2, 2019 at 22:40:24 UTC+2, Christmas Mwk wrote:
>
> Hi all,
>
> Recently I was trying to run the H2O-64 benchmark on CIRRUS. I compiled CP2K 
> 6.1 in the "popt" and "psmp" versions with GCC 8.2, FFTW3, Libint, libxsmm 
> and MKL 2019.3. For MPI I used MPT 2.18 and IntelMPI 18.0.5.274 for 
> comparison (both are available as modules on CIRRUS, so no problems with 
> InfiniBand). To my surprise, while on a single node IntelMPI had better 
> performance (59.5s) than what is shown in cirrus-h2o-64 
> <https://www.cp2k.org/performance:cirrus-h2o-64>, when I tried to run it 
> across more than 1 node, e.g. 4 nodes (144 cores), the runtime did not scale. 
> However, in the case of MPT the results were similar to what is published on 
> the web. Looking at the output timings, I saw that a significant portion of 
> the time is spent in mp_wait_any and mp_waitall_1 (around 21s out of 60s) 
> in the case of IntelMPI across 4 nodes, while for MPT only around 6s is 
> spent in these routines, with an overall runtime of around 28s.
>
> Initially I suspected that using IntelMPI might require some manual 
> process pinning, so I tried various options such as setting I_MPI_PIN_DOMAIN 
> to compact, core, etc. While there was some improvement in performance, 
> the overheads in these MPI routines stayed the same. I also tried 
> IntelMPI 2017, but the performance was the same. Additionally, similar 
> results are obtained for both "popt" and "psmp" with OMP_NUM_THREADS set to 
> 1. I assume that if this were a load imbalance issue, the performance of MPT 
> and IntelMPI would have been comparably affected, but I am still not sure.
>
> Is there anything that I am missing here or is this performance behaviour 
> expected in the case of IntelMPI? If performance should be similar or 
> comparable, could you please suggest how I can launch the executable using 
> mpirun and IntelMPI?
>
> Thank you in advance. Any help would be much appreciated. I attach the 
> arch files (the popt files are similar) and the runtime results for the runs 
> on 4 nodes for IntelMPI (compact and core) and MPT, as well as the single 
> node result for IntelMPI. Examples of how the executable is launched are 
> provided below.
>
> Best,
> Chris
>
> MPT
> export OMP_NUM_THREADS=1
> /lustre/sw/cp2k/4.1.17462/cp2k/cp2k/exe/mpt/placement 1
>
> mpiexec_mpt -n 144 -ppn 36 dplace -p place.txt \
>   /lustre/home/d167/s1887443/scc/cp2k/exe/broadwell-o2-libs-mpt/cp2k.psmp \
>   H2O-64.inp
>
> Compact
> export OMP_NUM_THREADS=1
>
> mpirun -n 144 -ppn 36 -env I_MPI_PIN_DOMAIN omp \
>   -env I_MPI_PIN_ORDER compact -print-rank-map \
>   /lustre/home/d167/s1887443/scc/cp2k/exe/broadwell-o2-libs/cp2k.psmp \
>   -i H2O-64.inp -o out.txt
>
> Core
>
> export OMP_NUM_THREADS=1
> export I_MPI_PIN_DOMAIN=core
>
> mpirun -n 144 -ppn 36 -genv I_MPI_PIN_DOMAIN=core -genv OMP_NUM_THREADS=1 \
>   -print-rank-map \
>   /lustre/home/d167/s1887443/scc/cp2k/exe/broadwell-o2-libs/cp2k.psmp \
>   -i H2O-64.inp -o out.txt
>

