[CP2K:3383] Re: segmentation fault

Peter Mamonov pmam... at gmail.com
Fri Jul 22 13:35:29 UTC 2011


You can try adding the '-g' option to the compiler and linker options to
get subroutine names in the backtrace output (those lines like `cp2k.popt
00000000011ADFE0  Unknown               Unknown`). This will probably
shed light on the source of the problem.
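
Once the binary carries debug information, the raw addresses in the PC
column can also be resolved by hand. The sketch below is only an
illustration: the helper name `backtrace_addrs` and the log file name
`job.log` are made up here, and it assumes an unquoted log in the forrtl
layout shown in this thread and a statically linked, non-relocated
executable. With ifort, the `-traceback` option may also be needed next to
'-g' to have routine names printed directly; check the compiler manual.

```shell
#!/bin/sh
# backtrace_addrs IMAGE
# Reads a forrtl backtrace on stdin and prints the PC column of every
# line that belongs to IMAGE, one address per line, prefixed with 0x.
backtrace_addrs() {
    awk -v image="$1" '$1 == image && $2 ~ /^[0-9A-Fa-f]+$/ { print "0x" $2 }'
}

# Example use (job.log is a hypothetical file holding the job output):
#   backtrace_addrs cp2k.popt < job.log | addr2line -f -e ./cp2k.popt
```

addr2line reads addresses from standard input when none are given on the
command line; `-f` adds the function name and `-e` selects the executable.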

You can also try a working binary that I use for calculations (CP2K
version 2.2.134 (Development Version)). Copy it from
/home/mamonov/cp2k/cp2k/exe/SKIF-intel/cp2k.impi while logged in to a
cluster frontend node.

Peter

On Fri, Jul 22, 2011 at 5:01 PM, Nikita <vakul... at gmail.com> wrote:
> Hi Peter!
>
> Thank you very much for the ARCH file! Yes, I'm trying to build it on
> SKIF Chebyshev. But using your arch file I got a warning:
>
> ********************
> ifort: command line warning #10156: ignoring option '-static'; no argument required
> make[1]: warning:  Clock skew detected.  Your build may be incomplete.
> make[1]: Leaving directory `/home/vakula/CP2K_NEW/cp2k/obj/skif_chebyshev/popt'
> make: warning:  Clock skew detected.  Your build may be incomplete.
> ********************
>
> Nevertheless, I tried to run my calculation with the resulting
> (chebyshev arch) executable and got the same problem:
>
> ********************
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine            Line        Source
> cp2k.popt          00000000011ADFE0  Unknown               Unknown  Unknown
> cp2k.popt          00000000010637CB  Unknown               Unknown  Unknown
> cp2k.popt          00000000004EA0F1  Unknown               Unknown  Unknown
> cp2k.popt          000000000042FAE1  Unknown               Unknown  Unknown
> cp2k.popt          000000000042AB6E  Unknown               Unknown  Unknown
> cp2k.popt          0000000000429B3C  Unknown               Unknown  Unknown
> libc.so.6          00007FF4AAB0FCF4  Unknown               Unknown  Unknown
> cp2k.popt          0000000000429A49  Unknown               Unknown  Unknown
> ********************
>
> plus a new one
>
> *******************
> [55:node-62-01] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 55
> [61:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> [62:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> [57:node-41-05] unexpected disconnect completion eventrank 4 in job 1
> t60-2.parallel.ru_42663   caused collective abort of all ranks
>  exit status of rank 4: killed by signal 9
>  from [7:node-41-03]
> [58:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> [56:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 56
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 57
> [60:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> [63:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 63
> [51:node-62-01] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 51
> [52:node-62-01] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 58
> [59:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 59
> internal ABORT - process 60
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 61
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 62
> [50:node-62-01] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 50
> [1:node-41-03] unexpected disconnect completion event from
> [33:node-62-07]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 1
> [0:node-41-03] unexpected disconnect completion event from
> [33:node-62-07]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 0
> [2:node-41-03] unexpected disconnect completion event from
> [33:node-62-07]
> [3:node-41-03] unexpected disconnect completion event from
> [33:node-62-07]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 3
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 2
> rank 1 in job 1  t60-2.parallel.ru_42663   caused collective abort of
> all ranks
>  exit status of rank 1: killed by signal 9
> rank 0 in job 1  t60-2.parallel.ru_42663   caused collective abort of
> all ranks
>  exit status of rank 0: killed by signal 9
> ******************************
> Any suggestions? And do you use CP2K on SKIF? Does it work without any
> problem?
>
> Thank you in advance,
> Nikita
>
> On Jul 22, 3:37 pm, Peter Mamonov <pmam... at gmail.com> wrote:
>> Hi Nikita!
>>
>> Correct me if I'm wrong, but as far as I can see you are trying to build
>> CP2K for SKIF `Chebyshev`. If so, you can try this (working) arch file
>> for Chebyshev (see below). Also be sure to switch to Intel's compiler
>> using the command:
>>
>> mpi-selector --set intel_mpi_intel64-4.0.0.025
>>
>> and run your task with the `-as intel` option:
>>
>> mpirun -as intel -blah -blah blah
>>
>> Best regards,
>> Peter
>> --
>>
>> # Chebyshev ARCH file
>>
>> MKL_ROOT  = /opt/intel/mkl/10.2.4.032
>> MKL_LIB = $(MKL_ROOT)/lib/em64t
>> MKL_INCLUDE = $(MKL_ROOT)/include
>>
>> CC       = cc
>> CPP      =
>> FC       = mpif90
>> LD       = mpif90
>> AR       = ar -r
>>
>> DFLAGS   = -D__INTEL -D__FFTSG -D__FFTMKL -D__parallel -D__BLACS \
>>            -D__SCALAPACK -D__HAS_NO_ISO_C_BINDING
>>
>> CPPFLAGS = -traditional -C $(DFLAGS) -I$(MKL_INCLUDE)/include
>>
>> FCFLAGS  = -fc=ifort $(DFLAGS) -I$(MKL_INCLUDE) -O2 -xHost \
>>            -heap-arrays 64 -fpp -free
>>
>> LDFLAGS  = $(FCFLAGS) -L$(MKL_LIB)
>>
>> LIBS     = -Wl,--start-group \
>> $(MKL_LIB)/libmkl_scalapack_lp64.a \
>> $(MKL_LIB)/libmkl_blacs_intelmpi_lp64.a \
>> $(MKL_LIB)/libmkl_intel_lp64.a \
>> $(MKL_LIB)/libmkl_sequential.a \
>> $(MKL_LIB)/libmkl_core.a \
>> -static-mpi \
>> -Wl,--end-group
>>
>> OBJECTS_ARCHITECTURE = machine_intel.o
>>
>> # End of Chebyshev ARCH file
>>
>> On Fri, Jul 22, 2011 at 2:56 PM, Jörg Saßmannshausen <j.sassma... at ucl.ac.uk> wrote:
>> > Dear Nikita,
>>
>> > did you use that before you started a run as well?
>>
>> > Unfortunately, it has been some time since I compiled cp2k, so I don't
>> > know off the top of my head. The stack-size limit is usually the culprit here.
>>
>> > All the best from a sunny London
>>
>> > Jörg
>>
>> > On Friday 22 July 2011 11:06:51 Nikita wrote:
>> >> Hi Jörg,
>>
>> >> thanks for your suggestion! I had already tried this before
>> >> compilation, but the error occurred all the same.
>> >> Maybe there are some other tricks to overcome this obstacle?
>>
>> >> Thank you in advance,
>> >> Nikita
>>
>> >> On Jul 22, 1:36 pm, Jörg Saßmannshausen <j.sassma... at ucl.ac.uk> wrote:
>> >> > Hi Nikita,
>>
>> >> > the Intel compiler is notorious for producing segfaults.
>>
>> >> > Have you tried:
>>
>> >> > $ ulimit -s unlimited
>>
>> >> > That might cure the problem.
>>
>> >> > Regards
>>
>> >> > Jörg
>>
>> >> > On Friday 22 July 2011 10:13:53 Nikita wrote:
>> >> > > Dear CP2K users and developers,
>>
>> >> > > I have recently compiled CP2K on our university cluster. But the
>> >> > > calculation failed with the following message:
>> >> > > ********************************************
>> >> > > --------------------------
>> >> > >  OPTIMIZATION STEP:      2
>> >> > >  --------------------------
>> >> > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
>> >> > > Image              PC                Routine            Line        Source
>> >> > > cp2k.popt          0000000001195583  Unknown               Unknown  Unknown
>> >> > > cp2k.popt          000000000104A51A  Unknown               Unknown  Unknown
>> >> > > cp2k.popt          00000000004CEDA1  Unknown               Unknown  Unknown
>> >> > > cp2k.popt          000000000040E631  Unknown               Unknown  Unknown
>> >> > > cp2k.popt          00000000004096BE  Unknown               Unknown  Unknown
>> >> > > cp2k.popt          000000000040868C  Unknown               Unknown  Unknown
>> >> > > libc.so.6          00007FED46734CF4  Unknown               Unknown  Unknown
>> >> > > cp2k.popt          0000000000408599  Unknown               Unknown  Unknown
>> >> > > ************************************************
>>
>> >> > > Could you suggest any solutions to my problem? By the way, here is my
>> >> > > arch file:
>>
>> >> > > **********************************
>> >> > > INTEL_MKL = /opt/intel/mkl/10.2.4.032
>> >> > > INTEL_INC = $(INTEL_MKL)/include/fftw
>> >> > > INTEL_LIB = $(INTEL_MKL)/lib/em64t
>>
>> >> > > CC       = mpicc
>> >> > > CPP      =
>> >> > > FC =  mpif90
>> >> > > LD = mpif90
>> >> > > AR       = /usr/bin/ar -r
>> >> > > DFLAGS   = -D__INTEL -D__FFTSG -D__parallel -D__SCALAPACK -D__BLACS \
>> >> > >            -D__FFTW3 -D__LIBINT -D__HAS_NO_ISO_C_BINDING
>> >> > > CPPFLAGS = -C -traditional $(DFLAGS) -I$(INTEL_INC)
>> >> > > FCFLAGS  = $(DFLAGS) -I$(INTEL_INC) \
>> >> > >            -O2 -xW -funroll-loops -fpp -free -heap-arrays 64
>> >> > > LDFLAGS  = $(FCFLAGS) -I$(INTEL_INC) -i-static
>> >> > > LIBS     = -L$(INTEL_LIB) -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 \
>> >> > >            -lmkl_intel_lp64 -lmkl_sequential -lmkl_core \
>> >> > >            /home/vakula/CP2K_NEW/fftw3_compiled/lib/libfftw3.a \
>> >> > >            /home/vakula/CP2K_NEW/cp2k/tools/hfx_tools/libint_tools/libint_cpp_wrapper.o \
>> >> > >            /home/vakula/CP2K_NEW/libint_compiled/lib/libderiv.a \
>> >> > >            /home/vakula/CP2K_NEW/libint_compiled/lib/libint.a \
>> >> > >            -lstdc++
>>
>> >> > > OBJECTS_ARCHITECTURE = machine_intel.o
>> >> > > ******************************
>>
>> >> > > Thank you in advance,
>> >> > > Nikita Vakula
>> >> > > PhD student, Moscow State University
>>
>> >> > --
>> >> > *************************************************************
>> >> > Jörg Saßmannshausen
>> >> > University College London
>> >> > Department of Chemistry
>> >> > Gordon Street
>> >> > London
>> >> > WC1H 0AJ
>>
>> >> > email: j.sassma... at ucl.ac.uk
>> >> > web:http://sassy.formativ.net
>>
>> >> > Please avoid sending me Word or PowerPoint attachments.
>> >> > See http://www.gnu.org/philosophy/no-word-attachments.html
>>
>> > --
>> > You received this message because you are subscribed to the Google Groups "cp2k" group.
>> > To post to this group, send email to cp... at googlegroups.com.
>> > To unsubscribe from this group, send email to cp2k+uns... at googlegroups.com.
>> > For more options, visit this group at http://groups.google.com/group/cp2k?hl=en.


