comparison of psmp and popt (with and without openmp)
Ronald Cohen
rco... at carnegiescience.edu
Wed Mar 23 20:28:19 UTC 2016
So I finally got decent performance with gfortran, Open MPI, and OpenBLAS
across InfiniBand. Now I find that using OpenMP with
half the number of MPI processes seems to give better performance for the
64 molecule H2O test case. Is that reasonable? To make the popt version I
recompiled everything, including BLAS, ScaLAPACK, etc., without -fopenmp.
I find the following run times in seconds:

nodes  MPI procs  build  OMP_NUM_THREADS  time (s)
1      16         psmp   1                834
1      16         popt   1                836
2      16         psmp   2                266
2      32         popt   1                430
4      64         popt   1                331
4      32         psmp   2                189
4      64         psmp   4                166
So you see there is no overhead from using psmp built with OpenMP and setting
OMP_NUM_THREADS to 1.
Using OpenMP threads improves performance much more than just increasing the
number of MPI processes.
This may be because this machine has only 1 GB of memory per core, but even 4
threads is better than 2, so it seems OpenMP
is more efficient than MPI here.
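
To make the comparison easier to read, the timings can be turned into speedups with a few lines of Python. This is only a rough sketch: the timings are copied from the list above, while the choice of the 1-node psmp run as the baseline and the helper names scaling and best_time are my own for illustration. A speedup per node above 1 means better-than-linear scaling with node count.

```python
# Timings copied from this post:
# (nodes, MPI procs, build, OMP_NUM_THREADS, seconds)
runs = [
    (1, 16, "psmp", 1, 834),
    (1, 16, "popt", 1, 836),
    (2, 16, "psmp", 2, 266),
    (2, 32, "popt", 1, 430),
    (4, 64, "popt", 1, 331),
    (4, 32, "psmp", 2, 189),
    (4, 64, "psmp", 4, 166),
]

# Assumption for illustration: use the 1-node psmp run as the reference.
BASELINE_SECONDS = 834


def scaling(run):
    """Return (speedup vs. baseline, speedup divided by node count)."""
    nodes, _procs, _build, _threads, seconds = run
    speedup = BASELINE_SECONDS / seconds
    return speedup, speedup / nodes


def best_time(nodes, build):
    """Fastest time recorded for a given node count and build."""
    return min(r[4] for r in runs if r[0] == nodes and r[2] == build)


for run in runs:
    nodes, procs, build, threads, seconds = run
    speedup, per_node = scaling(run)
    print(f"{nodes} nodes {procs:2d} procs {build} x{threads}: "
          f"{seconds:4d} s  speedup {speedup:.2f}  per node {per_node:.2f}")

# How much slower is the best popt run than the best psmp run
# on the same number of nodes?
for nodes in (2, 4):
    ratio = best_time(nodes, "popt") / best_time(nodes, "psmp")
    print(f"{nodes} nodes: popt / psmp time ratio = {ratio:.2f}")
```

The last loop compares the two builds at equal node count, which is the comparison that matters for deciding between more MPI processes and more threads.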
There is still room for improvement, though. Any ideas on how to squeeze out
better performance?
Ron