Increasing the number of cores per node drastically degrades the performance of cp2k.popt

> TRACE revealed that qs_forces, qs_energies_scf, scf_env_do_scf,
> velocity_verlet, qs_forces, qs_energies_scf, scf_env_do_scf and others
> are 5 times slower in 2x8 than in 2x2.
> Do you have any suggestions or ideas why this happens?

this happens a lot with CPUs that have very limited memory bandwidth.
the quickstep algorithm is very demanding in terms of memory bandwidth.
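
one rough way to check whether a node is bandwidth-limited is to time plain memory copies with 1, 2, 4 and 8 processes running at once and see whether the summed rate keeps growing. the following is only a minimal sketch and not part of cp2k: the function names, the 64 MiB buffer size and the repeat count are assumptions chosen for illustration, and a proper benchmark such as STREAM will give more reliable numbers.

```python
import time
from multiprocessing import Pool


def copy_bandwidth(n_bytes=64 * 1024 * 1024, repeats=10):
    """Return the best observed copy rate in GB/s for one process."""
    src = bytearray(n_bytes)
    dst = bytearray(n_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # plain memory copy: reads n_bytes and writes n_bytes
        best = min(best, time.perf_counter() - start)
    # guard against a zero interval on very small buffers
    return 2 * n_bytes / max(best, 1e-9) / 1e9


def _worker(args):
    # helper so that Pool.map can pass both parameters
    return copy_bandwidth(*args)


def aggregate_bandwidth(n_procs, n_bytes=64 * 1024 * 1024, repeats=10):
    """Run n_procs copy loops roughly concurrently; return the summed GB/s."""
    with Pool(n_procs) as pool:
        rates = pool.map(_worker, [(n_bytes, repeats)] * n_procs, chunksize=1)
    return sum(rates)
```

calling aggregate_bandwidth(n) for n = 1, 2, 4, 8 on a single node (inside an `if __name__ == "__main__":` guard on platforms that spawn new interpreters) gives a crude scaling curve. if the summed rate stops growing well before 8 processes, the extra cores are mostly waiting on memory, which would be consistent with the slowdown described above.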

> CP2K version 2.2.262
> the lib is MKL-Scalapack
> the system is cluster of XEON nodes (8cores/node) with Infiniband switch

what type of xeon processors exactly? that makes all the difference.
the 56xx (westmere) and 55xx (nehalem) series ones for example have
_much_ more memory bandwidth than 54xx (harpertown) series ones.
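
if you are not sure which processors the nodes have, on linux the exact model string is listed in /proc/cpuinfo. a small sketch for collecting it is shown here; the function names are made up for illustration, and it assumes the usual "model name : ..." lines that x86 linux kernels write.

```python
from collections import Counter


def cpu_models(cpuinfo_text):
    """Count logical CPUs per 'model name' entry in /proc/cpuinfo-style text."""
    models = Counter()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            models[value.strip()] += 1
    return dict(models)


def read_cpu_models(path="/proc/cpuinfo"):
    """Read the given file (assumed to exist, linux only) and summarize it."""
    with open(path) as handle:
        return cpu_models(handle.read())
```

running read_cpu_models() on a compute node returns a dictionary mapping each model string to the number of logical CPUs reporting it, and the model number in that string tells you which series the processor belongs to.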



> the compiler is Intel's
