Some tests on Opteron cluster (strange???)
akoh... at gmail.com
Fri Apr 18 01:44:34 UTC 2008
On Apr 17, 7:55 am, ilya <ily... at gmail.com> wrote:
> Hi, All !!!
> First of all setting OMP_NUM_THREADS=2 and running 1 MPI process per
> node makes performance worse (lines 2, 5 and 6) compared to 2 MPI
a proper hybrid OpenMP/MPI compile is tricky. you have
to compile all of cp2k with OpenMP support _and_ have a
multi-threaded FFT and BLAS/LAPACK. even then, the gain
probably only shows up once you have scaled out the MPI parallelism.
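as a sketch of the two setups being compared (binary name, node count, and launcher flags are my assumptions, not taken from the tests above; `cp2k.psmp` is the conventional name of the MPI+OpenMP build, and `--npernode` is Open MPI syntax):

```shell
# hybrid run: 1 MPI rank per node, 2 OpenMP threads per rank
# (4 nodes with one dual-core CPU each is assumed for illustration)
export OMP_NUM_THREADS=2
mpirun -np 4 --npernode 1 ./cp2k.psmp input.inp

# pure-MPI run on the same 4 nodes: 2 ranks per node, 1 thread each
export OMP_NUM_THREADS=1
mpirun -np 8 --npernode 2 ./cp2k.psmp input.inp
```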
> processes per node (we have 1 dual core CPU per node).
> Is it normal? CPU time still doubles.
yep. OpenMP comes with a lot of overhead, and i have the
feeling that with the intel binaries the threads keep spinning
instead of sleeping until they are needed.
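if the spinning is the problem, the intel OpenMP runtime has knobs for it (a sketch; whether these actually help on this cluster is untested):

```shell
# tell the Intel OpenMP runtime to put idle threads to sleep
# immediately instead of spin-waiting (the default is 200 ms of spinning)
export KMP_BLOCKTIME=0

# portable OpenMP 3.0 equivalent, honored by newer runtimes
export OMP_WAIT_POLICY=passive
```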
> And is it a normal scaling behavior? ( 4.89 on 8 procs is not very
> good. I know the system is small but anyway).
please see previous discussions on scaling and dual-core performance.
what kind of interconnect do you have on your cluster?
cp2k is _very_ demanding in terms of both memory bandwidth
and communication bandwidth/latency.