mpich? problems on a linux cluster

Axel akoh... at
Fri Dec 7 16:38:35 CET 2007


one more thing that may be important: what interconnect
do you have and is it working correctly under high load?

cp2k is very demanding and i've run across multiple machines
(myrinet/infiniband) where the MPI runtime settings needed to
be tweaked to have the job run reliably. i suggest you log into
the failing node and have a look at the kernel message buffer
with "dmesg" and see if there is anything suspicious.

another common cause of segmentation faults with intel
compilers is insufficient stack size. for historical
reasons, the intel fortran frontend allocates temporary arrays
by default on the stack instead of the heap. please check your
cluster nodes for whether the stack segment is large enough
(ulimit -a), and have the sysadmins increase it if needed.

a second option is to reset the stack size from within cp2k, but
that requires some (ugly?) modifications of the code and they need
to be in c. i'll put an updated version of those into the files
section later.
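
until then, here is a minimal sketch of what such a c helper might
look like. it is not the code from the files section; the function
name raise_stack_limit is made up for illustration, and it assumes a
POSIX system with getrlimit()/setrlimit(). it only raises the soft
stack limit up to the hard limit of the process.

```c
#include <stdio.h>
#include <sys/resource.h>

/* hypothetical helper, not the code from the cp2k files section:
 * raise the soft stack limit of the calling process to its hard limit.
 * returns 0 on success, -1 if getrlimit() or setrlimit() fails. */
int raise_stack_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("getrlimit(RLIMIT_STACK)");
        return -1;
    }

    /* nothing to do if the soft limit already equals the hard limit */
    if (rl.rlim_cur == rl.rlim_max)
        return 0;

    /* an unprivileged process may raise its soft limit up to,
     * but not beyond, the hard limit */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("setrlimit(RLIMIT_STACK)");
        return -1;
    }
    return 0;
}
```

a helper like this would have to be called very early, before any
large temporary arrays are created, and how it is called from the
fortran side depends on the compiler's name mangling. if the hard
limit itself is too small, only the sysadmins can fix that.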

the third option is to use the -heap-arrays flag, which is only
supported by intel compilers 10.0 and later.

hope that helps,

On Dec 7, 7:59 am, "carlo antonio pignedoli" <c.pig... at>
> Ciao Teo,
> we are using the cmkl libraries
> intel clustertoolkit for linux
> version 9.1

