cp2k crashes nodes?

Axel akoh... at gmail.com
Thu Jul 28 17:38:51 CEST 2011



On Thursday, July 28, 2011 5:55:43 AM UTC-4, sassy wrote:
>
> Dear all,
>
> maybe less of a specific cp2k problem but I was wondering if somebody on
> the list has had similar experiences before and could comment on it.
>
> We have a number (18) of somewhat dated InfiniBand Opteron dual core 2220
> nodes in one cluster. For the last 6 weeks or so cp2k runs tend to crash
> the node (i.e. kernel panic). In order to rule out any OS related problems
> I have upgraded the OS to Debian Squeeze and compiled the latest version of
> cp2k on it using the gfortran 4.4.5 compiler with the Intel MKL which comes
> with the Intel Fortran Compiler 11.1.073. Compilation on that node went
> without
>
you didn't say what version of OFED (or whatever else) you are running to
drive the IB cards, and what type of IB cards you have to begin with.

> I know that plane wave code is quite memory intensive but I find it a bit
> odd that memtest runs ok and cp2k crashes the nodes. I would like to rule
> out any other possibility but hardware problems. It is easy for me to say
> that one or two nodes are gone (due to hardware problems beyond repair) but
> writing off a complete cluster is a bit more difficult to explain.
>
> Has anybody had similar experiences and would not mind sharing them with
> me? It can be off-list if they prefer.
>
i've seen similar behavior with infinipath DDR-IB HCAs on some of our nodes.
all other applications would run well, but the communication pattern of cp2k
would overload the kernel part of the IB driver and lead to intermittent
crashes.
 
cheers,
    axel.

> All the best from a sunny London!
>
> Jörg
>
> -- 
> *************************************************************
> Jörg Saßmannshausen
> University College London
> Department of Chemistry
> Gordon Street
> London
> WC1H 0AJ 
>
> email: j.sas... at ucl.ac.uk
> web: http://sassy.formativ.net
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
>
>