segmentation fault

Nikita vakul... at gmail.com
Tue Jul 26 18:07:14 UTC 2011


Hi Peter!

I'm sorry for the delay in getting back in touch with you! Thank you
very much for your reply! I've added the traceback and debug options to
the compiler flags and the problem has disappeared. (The "-g" option
didn't give any result.) Can you tell me how I can get your executable?
Thank you in advance,
Nikita
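
A minimal sketch of how such options might be added to the Chebyshev
arch file quoted below. It assumes the Intel compiler's -g and
-traceback flags; the message above does not say exactly which options
were used, so the appended flags are an assumption:

```make
# Sketch only: FCFLAGS from the quoted arch file with debug options appended.
# -g and -traceback are assumed; the flags actually used are not stated.
FCFLAGS  = -fc=ifort $(DFLAGS) -I$(MKL_INCLUDE) -O2 -xHost \
           -heap-arrays 64 -fpp -free -g -traceback

# LDFLAGS reuses FCFLAGS, so the linker receives the same options.
LDFLAGS  = $(FCFLAGS) -L$(MKL_LIB)
```

With traceback information compiled in, the "Unknown" entries in the
forrtl backtrace should be replaced by routine names and line numbers.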

On Jul 22, 5:35 pm, Peter Mamonov <pmam... at gmail.com> wrote:
> You can try adding the '-g' option to the compiler and linker options to
> get subroutine names in the backtrace output (those lines `cp2k.popt
> 00000000011ADFE0  Unknown               Unknown`). This will probably
> shed light on the source of the problem.
>
> You can also try a working binary that I use for calculations (CP2K
> version 2.2.134 (Development Version)). Copy it from
> /home/mamonov/cp2k/cp2k/exe/SKIF-intel/cp2k.impi while logged in to a
> cluster frontend node.
>
> Peter
>
> On Fri, Jul 22, 2011 at 5:01 PM, Nikita <vakul... at gmail.com> wrote:
> > Hi Peter!
>
> > Thank you very much for the ARCH file! Yes, I'm trying to build it on
> > SKIF Chebyshev. But using your arch file I got these warnings:
>
> > ********************
> > ifort: command line warning #10156: ignoring option '-static'; no
> > argument required
> > make[1]: warning:  Clock skew detected.  Your build may be incomplete.
> > make[1]: Leaving directory `/home/vakula/CP2K_NEW/cp2k/obj/
> > skif_chebyshev/popt'
> > make: warning:  Clock skew detected.  Your build may be incomplete.
> > ********************
>
> > Nevertheless, I tried to run my calculation with the resulting
> > (Chebyshev arch) executable and got the same problem:
>
> > ********************
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image              PC                Routine            Line
> > Source
> > cp2k.popt          00000000011ADFE0  Unknown               Unknown
> > Unknown
> > cp2k.popt          00000000010637CB  Unknown               Unknown
> > Unknown
> > cp2k.popt          00000000004EA0F1  Unknown               Unknown
> > Unknown
> > cp2k.popt          000000000042FAE1  Unknown               Unknown
> > Unknown
> > cp2k.popt          000000000042AB6E  Unknown               Unknown
> > Unknown
> > cp2k.popt          0000000000429B3C  Unknown               Unknown
> > Unknown
> > libc.so.6          00007FF4AAB0FCF4  Unknown               Unknown
> > Unknown
> > cp2k.popt          0000000000429A49  Unknown               Unknown
> > Unknown
> > ********************
>
> > plus a new one
>
> > *******************
> > [55:node-62-01] unexpected disconnect completion event from
> > [7:node-41-03]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 55
> > [61:node-41-05] unexpected disconnect completion event from
> > [7:node-41-03]
> > [62:node-41-05] unexpected disconnect completion event from
> > [7:node-41-03]
> > [57:node-41-05] unexpected disconnect completion eventrank 4 in job 1
> > t60-2.parallel.ru_42663   caused collective abort of all ranks
> >  exit status of rank 4: killed by signal 9
> >  from [7:node-41-03]
> > [58:node-41-05] unexpected disconnect completion event from
> > [7:node-41-03]
> > [56:node-41-05] unexpected disconnect completion event from
> > [7:node-41-03]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 56
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 57
> > [60:node-41-05] unexpected disconnect completion event from
> > [7:node-41-03]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > [63:node-41-05] unexpected disconnect completion event from
> > [7:node-41-03]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 63
> > [51:node-62-01] unexpected disconnect completion event from
> > [7:node-41-03]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 51
> > [52:node-62-01] unexpected disconnect completion event from
> > [7:node-41-03]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 58
> > [59:node-41-05] unexpected disconnect completion event from
> > [7:node-41-03]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 59
> > internal ABORT - process 60
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 61
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 62
> > [50:node-62-01] unexpected disconnect completion event from
> > [7:node-41-03]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 50
> > [1:node-41-03] unexpected disconnect completion event from
> > [33:node-62-07]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 1
> > [0:node-41-03] unexpected disconnect completion event from
> > [33:node-62-07]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 0
> > [2:node-41-03] unexpected disconnect completion event from
> > [33:node-62-07]
> > [3:node-41-03] unexpected disconnect completion event from
> > [33:node-62-07]
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 3
> > Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> > internal ABORT - process 2
> > rank 1 in job 1  t60-2.parallel.ru_42663   caused collective abort of
> > all ranks
> >  exit status of rank 1: killed by signal 9
> > rank 0 in job 1  t60-2.parallel.ru_42663   caused collective abort of
> > all ranks
> >  exit status of rank 0: killed by signal 9
> > ******************************
> > Any suggestions? And do you use CP2K on SKIF? Does it work without any
> > problem?
>
> > Thank you in advance,
> > Nikita
>
> > On Jul 22, 3:37 pm, Peter Mamonov <pmam... at gmail.com> wrote:
> >> Hi Nikita!
>
> >> Correct me if I'm wrong, but as far as I can see you are trying to build
> >> CP2K for SKIF `Chebyshev`. If so, you can try this (working) arch file
> >> for Chebyshev (see below). Also be sure to switch to Intel's compiler
> >> using the command:
>
> >> mpi-selector --set intel_mpi_intel64-4.0.0.025
>
> >> and run your task with the `-as intel` option:
>
> >> mpirun -as intel -blah -blah blah
>
> >> Best regards,
> >> Peter
> >> --
>
> >> # Chebyshev ARCH file
>
> >> MKL_ROOT  = /opt/intel/mkl/10.2.4.032
> >> MKL_LIB = $(MKL_ROOT)/lib/em64t
> >> MKL_INCLUDE = $(MKL_ROOT)/include
>
> >> CC       = cc
> >> CPP      =
> >> FC       = mpif90
> >> LD       = mpif90
> >> AR       = ar -r
>
> >> DFLAGS   = -D__INTEL -D__FFTSG -D__FFTMKL -D__parallel -D__BLACS \
> >> -D__SCALAPACK -D__HAS_NO_ISO_C_BINDING
>
> >> CPPFLAGS = -traditional -C $(DFLAGS) -I$(MKL_INCLUDE)/include
>
> >> FCFLAGS  = -fc=ifort $(DFLAGS) -I$(MKL_INCLUDE) -O2 -xHost \
> >> -heap-arrays 64 -fpp -free
>
> >> LDFLAGS  = $(FCFLAGS) -L$(MKL_LIB)
>
> >> LIBS     = -Wl,--start-group \
> >> $(MKL_LIB)/libmkl_scalapack_lp64.a \
> >> $(MKL_LIB)/libmkl_blacs_intelmpi_lp64.a \
> >> $(MKL_LIB)/libmkl_intel_lp64.a \
> >> $(MKL_LIB)/libmkl_sequential.a \
> >> $(MKL_LIB)/libmkl_core.a \
> >> -static-mpi \
> >> -Wl,--end-group
>
> >> OBJECTS_ARCHITECTURE = machine_intel.o
>
> >> # End of Chebyshev ARCH file
>
> >> On Fri, Jul 22, 2011 at 2:56 PM, Jörg Saßmannshausen <j.sassma... at ucl.ac.uk> wrote:
> >> > Dear Nikita,
>
> >> > did you use that before you started a run as well?
>
> >> > Unfortunately, it has been some time since I compiled cp2k, so I don't
> >> > know off the top of my head. The stack-size limit is usually the
> >> > culprit here.
>
> >> > All the best from a sunny London
>
> >> > Jörg
>
> >> > On Friday 22 July 2011 11:06:51 Nikita wrote:
> >> >> Hi Jörg,
>
> >> >> thanks for your suggestion! But I had already tried this before
> >> >> compilation, and the error occurred all the same.
> >> >> Maybe there are some other tricks to overcome this obstacle?
>
> >> >> Thank you in advance,
> >> >> Nikita
>
> >> >> On Jul 22, 1:36 pm, Jörg Saßmannshausen <j.sassma... at ucl.ac.uk> wrote:
> >> >> > Hi Nikita,
>
> >> >> > the Intel compiler is notorious for producing segfaults.
>
> >> >> > Have you tried:
>
> >> >> > $ ulimit -s unlimited
>
> >> >> > That might cure the problem.
>
> >> >> > Regards
>
> >> >> > Jörg
>
> >> >> > On Friday 22 July 2011 10:13:53 Nikita wrote:
> >> >> > > Dear CP2K users and developers,
>
> >> >> > > I have recently compiled CP2K on our university cluster. But the
> >> >> > > calculation failed with the following message:
> >> >> > > ********************************************
> >> >> > > --------------------------
> >> >> > >  OPTIMIZATION STEP:      2
> >> >> > >  --------------------------
> >> >> > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> >> >> > > Image              PC                Routine            Line
> >> >> > > Source
> >> >> > > cp2k.popt          0000000001195583  Unknown               Unknown
> >> >> > > Unknown
> >> >> > > cp2k.popt          000000000104A51A  Unknown               Unknown
> >> >> > > Unknown
> >> >> > > cp2k.popt          00000000004CEDA1  Unknown               Unknown
> >> >> > > Unknown
> >> >> > > cp2k.popt          000000000040E631  Unknown               Unknown
> >> >> > > Unknown
> >> >> > > cp2k.popt          00000000004096BE  Unknown               Unknown
> >> >> > > Unknown
> >> >> > > cp2k.popt          000000000040868C  Unknown               Unknown
> >> >> > > Unknown
> >> >> > > libc.so.6          00007FED46734CF4  Unknown               Unknown
> >> >> > > Unknown
> >> >> > > cp2k.popt          0000000000408599  Unknown               Unknown
> >> >> > > Unknown
> >> >> > > ************************************************
>
> >> >> > > Could you suggest any solutions to my problem? By the way, here is my
> >> >> > > arch file:
>
> >> >> > > **********************************
> >> >> > > INTEL_MKL = /opt/intel/mkl/10.2.4.032
> >> >> > > INTEL_INC = $(INTEL_MKL)/include/fftw
> >> >> > > INTEL_LIB = $(INTEL_MKL)/lib/em64t
>
> >> >> > > CC       = mpicc
> >> >> > > CPP      =
> >> >> > > FC =  mpif90
> >> >> > > LD = mpif90
> >> >> > > AR       = /usr/bin/ar -r
> >> >> > > DFLAGS   = -D__INTEL -D__FFTSG -D__parallel -D__SCALAPACK -D__BLACS \
> >> >> > > -D__FFTW3 -D__LIBINT  -D__HAS_NO_ISO_C_BINDING
> >> >> > > CPPFLAGS = -C -traditional...
>

