Some tests on Opteron cluster (strange???)

Axel akoh... at
Fri Apr 18 01:44:34 UTC 2008

On Apr 17, 7:55 am, ilya <ily... at> wrote:
> Hi, All  !!!

> First of all setting OMP_NUM_THREADS=2 and running 1 MPI process per
> node makes performance worse (lines 2, 5 and 6) compared to 2 MPI

a proper hybrid OpenMP/MPI compile is tricky. you have
to compile all of cp2k with OpenMP support _and_ link against a
multi-threaded FFT and BLAS/LAPACK. even then the gain
is probably largest once you have scaled out the MPI parallelism.
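the build side of this could be sketched as an arch-file fragment. this is illustrative only, not the poster's configuration: it assumes a GCC/OpenMPI stack with FFTW3 and OpenBLAS, and the library names (fftw3_omp, openblas) are placeholders for whatever threaded libraries your site provides.

```make
# Illustrative arch-file fragment (assumed GCC + FFTW3 + OpenBLAS stack).
# -fopenmp compiles cp2k itself with OpenMP support; the LIBS line links
# the OpenMP-threaded FFTW3 interface and a threaded BLAS/LAPACK.
FC      = mpif90
FCFLAGS = -O2 -fopenmp -D__FFTW3
LDFLAGS = -fopenmp
LIBS    = -lfftw3_omp -lfftw3 -lopenblas
```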

> processes per node (we have 1 dual core CPU per node).
> Is it normal? CPU time still doubles.

yep. OpenMP comes with a lot of overhead, and i have the
feeling that with the intel binaries the threads keep spinning
instead of sleeping while they are not in use.
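one way to test that hypothesis, not mentioned in the original mail, is to tell the runtime to put idle threads to sleep instead of busy-waiting. a sketch for a job script, assuming an OpenMP runtime that honors these variables (KMP_BLOCKTIME is specific to the intel runtime; OMP_WAIT_POLICY is the standard knob in newer OpenMP runtimes):

```shell
# Make idle OpenMP threads sleep rather than spin on a core.
export OMP_NUM_THREADS=2
# Standard OpenMP hint: passive = sleep while waiting for work.
export OMP_WAIT_POLICY=passive
# Intel-runtime-specific: wait 0 ms before sleeping after going idle.
export KMP_BLOCKTIME=0
```

if the 1-MPI-process-times-2-threads runs stop burning cpu time in idle threads after this, spinning was the culprit.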

> And is it a normal scaling behavior? ( 4.89 on 8 procs is not very
> good. I know the system is small but anyway).

please see previous discussions on scaling and dual-core performance.

what kind of interconnect do you have on your cluster?
cp2k is _very_ demanding in terms of both memory bandwidth
and communication bandwidth/latency.

> Thanks.

More information about the CP2K-user mailing list