[CP2K:3383] Re: segmentation fault
Peter Mamonov
pmam... at gmail.com
Fri Jul 22 13:35:29 UTC 2011
You can try adding the '-g' option to the compiler and linker options to get
subroutine names in the backtrace output (those lines `cp2k.popt
00000000011ADFE0 Unknown Unknown`). This will probably shed some light on
the source of the problem.
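A rough sketch of how the addresses could then be resolved, assuming the
binary has been rebuilt with '-g' in FCFLAGS and LDFLAGS. The helper names
extract_pcs and resolve_frames are made up for illustration, and addr2line
(from GNU binutils) is a tool not mentioned above. It also assumes the frame
lines keep the "image address ..." layout shown in the log and that the
addresses map directly onto the statically linked executable:

```shell
#!/bin/sh
# Hypothetical helper: read a forrtl backtrace on stdin and print the PC
# column of every frame belonging to the given image, in 0x... form.
# Leading mail-quote markers ("> ") are stripped first.
extract_pcs() {
    sed 's/^[> ]*//' |
        awk -v img="$1" '$1 == img && $2 ~ /^[0-9A-Fa-f]+$/ { print "0x" $2 }'
}

# Hypothetical helper: resolve each extracted address to a function name
# and file:line. Without debug info in the binary, addr2line prints "??".
resolve_frames() {
    binary="$1"
    extract_pcs "$(basename "$binary")" | while read -r pc; do
        addr2line -f -C -e "$binary" "$pc"
    done
}
```

For example, `resolve_frames ./cp2k.popt < job.err`, where job.err is a
placeholder name for the file holding the forrtl output.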
You can also try a working binary that I use for calculations (CP2K
version 2.2.134 (Development Version)). From a cluster frontend node, copy
it from /home/mamonov/cp2k/cp2k/exe/SKIF-intel/cp2k.impi.
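Whichever binary you run, it may also be worth checking the stack limit
inside the job itself, since the `ulimit -s unlimited` suggestion quoted
below only helps if it is in effect when cp2k starts. A minimal sketch; the
name stack_is_sufficient and the 1048576 kB threshold are illustrative
assumptions, not CP2K requirements:

```shell
#!/bin/sh
# Hypothetical helper: succeed when a stack limit, given as `ulimit -s`
# prints it (a size in kB or the word "unlimited"), is at least min_kb.
stack_is_sufficient() {
    limit="$1"
    min_kb="$2"
    case "$limit" in
        unlimited) return 0 ;;
        ''|*[!0-9]*) return 1 ;;   # not a plain number: treat as too low
    esac
    [ "$limit" -ge "$min_kb" ]
}

# Report the limit of the current shell against the arbitrary threshold.
if stack_is_sufficient "$(ulimit -s)" 1048576; then
    echo "stack limit looks fine: $(ulimit -s)"
else
    echo "stack limit may be too low: $(ulimit -s)"
fi
```

With MPI the limit has to hold on the compute nodes as well, so running the
check from the job script is likely more telling than running it in the
login shell.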
Peter
On Fri, Jul 22, 2011 at 5:01 PM, Nikita <vakul... at gmail.com> wrote:
> Hi Peter!
>
> Thank you very much for the ARCH file! Yes, I'm trying to build it on
> SKIF Chebyshev. But with your arch file I got these warnings:
>
> ********************
> ifort: command line warning #10156: ignoring option '-static'; no argument required
> make[1]: warning: Clock skew detected. Your build may be incomplete.
> make[1]: Leaving directory `/home/vakula/CP2K_NEW/cp2k/obj/skif_chebyshev/popt'
> make: warning: Clock skew detected. Your build may be incomplete.
> ********************
>
> Nevertheless, I tried to run my calculation with the resulting
> (Chebyshev arch) executable and got the same problem:
>
> ********************
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image      PC                Routine  Line     Source
> cp2k.popt  00000000011ADFE0  Unknown  Unknown  Unknown
> cp2k.popt  00000000010637CB  Unknown  Unknown  Unknown
> cp2k.popt  00000000004EA0F1  Unknown  Unknown  Unknown
> cp2k.popt  000000000042FAE1  Unknown  Unknown  Unknown
> cp2k.popt  000000000042AB6E  Unknown  Unknown  Unknown
> cp2k.popt  0000000000429B3C  Unknown  Unknown  Unknown
> libc.so.6  00007FF4AAB0FCF4  Unknown  Unknown  Unknown
> cp2k.popt  0000000000429A49  Unknown  Unknown  Unknown
> ********************
>
> plus a new one
>
> *******************
> [55:node-62-01] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 55
> [61:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> [62:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> [57:node-41-05] unexpected disconnect completion eventrank 4 in job 1
> t60-2.parallel.ru_42663 caused collective abort of all ranks
> exit status of rank 4: killed by signal 9
> from [7:node-41-03]
> [58:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> [56:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 56
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 57
> [60:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> [63:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 63
> [51:node-62-01] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 51
> [52:node-62-01] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 58
> [59:node-41-05] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 59
> internal ABORT - process 60
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 61
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 62
> [50:node-62-01] unexpected disconnect completion event from
> [7:node-41-03]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 50
> [1:node-41-03] unexpected disconnect completion event from
> [33:node-62-07]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 1
> [0:node-41-03] unexpected disconnect completion event from
> [33:node-62-07]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 0
> [2:node-41-03] unexpected disconnect completion event from
> [33:node-62-07]
> [3:node-41-03] unexpected disconnect completion event from
> [33:node-62-07]
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 3
> Assertion failed in file ../../dapl_module_util.c at line 1593: 0
> internal ABORT - process 2
> rank 1 in job 1 t60-2.parallel.ru_42663 caused collective abort of
> all ranks
> exit status of rank 1: killed by signal 9
> rank 0 in job 1 t60-2.parallel.ru_42663 caused collective abort of
> all ranks
> exit status of rank 0: killed by signal 9
> ******************************
> Any suggestions? And do you use CP2K on SKIF? Does it work without any
> problem?
>
> Thank you in advance,
> Nikita
>
> On Jul 22, 3:37 pm, Peter Mamonov <pmam... at gmail.com> wrote:
>> Hi Nikita!
>>
>> Correct me if I'm wrong, but as far as I can see you are trying to build CP2K
>> for SKIF `Chebyshev`. If so, you can try this (working) arch file for
>> Chebyshev (see below). Also be sure to switch to Intel's compiler
>> using the command:
>>
>> mpi-selector --set intel_mpi_intel64-4.0.0.025
>>
>> and run your task with the `-as intel` option:
>>
>> mpirun -as intel -blah -blah blah
>>
>> Best regards,
>> Peter
>> --
>>
>> # Chebyshev ARCH file
>>
>> MKL_ROOT = /opt/intel/mkl/10.2.4.032
>> MKL_LIB = $(MKL_ROOT)/lib/em64t
>> MKL_INCLUDE = $(MKL_ROOT)/include
>>
>> CC = cc
>> CPP =
>> FC = mpif90
>> LD = mpif90
>> AR = ar -r
>>
>> DFLAGS = -D__INTEL -D__FFTSG -D__FFTMKL -D__parallel -D__BLACS \
>> -D__SCALAPACK -D__HAS_NO_ISO_C_BINDING
>>
>> CPPFLAGS = -traditional -C $(DFLAGS) -I$(MKL_INCLUDE)
>>
>> FCFLAGS = -fc=ifort $(DFLAGS) -I$(MKL_INCLUDE) -O2 -xHost \
>> -heap-arrays 64 -fpp -free
>>
>> LDFLAGS = $(FCFLAGS) -L$(MKL_LIB)
>>
>> LIBS = -Wl,--start-group \
>> $(MKL_LIB)/libmkl_scalapack_lp64.a \
>> $(MKL_LIB)/libmkl_blacs_intelmpi_lp64.a \
>> $(MKL_LIB)/libmkl_intel_lp64.a \
>> $(MKL_LIB)/libmkl_sequential.a \
>> $(MKL_LIB)/libmkl_core.a \
>> -static-mpi \
>> -Wl,--end-group
>>
>> OBJECTS_ARCHITECTURE = machine_intel.o
>>
>> # End of Chebyshev ARCH file
>>
>> On Fri, Jul 22, 2011 at 2:56 PM, Jörg Saßmannshausen
>> <j.sassma... at ucl.ac.uk> wrote:
>> > Dear Nikita,
>>
>> > did you use that before you started a run as well?
>>
>> > Unfortunately, it has been some time since I compiled cp2k, so I don't know
>> > off the top of my head. The stack size limit is usually the culprit here.
>>
>> > All the best from a sunny London
>>
>> > Jörg
>>
>> > On Friday 22 July 2011 11:06:51 Nikita wrote:
>> >> Hi Jörg,
>>
>> >> thanks for your suggestion! I had already tried this before
>> >> compilation, but the error occurred all the same.
>> >> Maybe there are some other tricks to overcome this obstacle?
>>
>> >> Thank you in advance,
>> >> Nikita
>>
>> >> On Jul 22, 1:36 pm, Jörg Saßmannshausen <j.sassma... at ucl.ac.uk>
>>
>> >> wrote:
>> >> > Hi Nikita,
>>
>> >> > the Intel compiler is notorious for causing segfaults.
>>
>> >> > Have you tried:
>>
>> >> > $ ulimit -s unlimited
>>
>> >> > That might cure the problem.
>>
>> >> > Regards
>>
>> >> > Jörg
>>
>> >> > On Friday 22 July 2011 10:13:53 Nikita wrote:
>> >> > > Dear CP2K users and developers,
>>
>> >> > > I have recently compiled CP2K on our university cluster. But the
>> >> > > calculation failed with the following message:
>> >> > > ********************************************
>> >> > > --------------------------
>> >> > > OPTIMIZATION STEP: 2
>> >> > > --------------------------
>> >> > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
>> >> > > Image      PC                Routine  Line     Source
>> >> > > cp2k.popt  0000000001195583  Unknown  Unknown  Unknown
>> >> > > cp2k.popt  000000000104A51A  Unknown  Unknown  Unknown
>> >> > > cp2k.popt  00000000004CEDA1  Unknown  Unknown  Unknown
>> >> > > cp2k.popt  000000000040E631  Unknown  Unknown  Unknown
>> >> > > cp2k.popt  00000000004096BE  Unknown  Unknown  Unknown
>> >> > > cp2k.popt  000000000040868C  Unknown  Unknown  Unknown
>> >> > > libc.so.6  00007FED46734CF4  Unknown  Unknown  Unknown
>> >> > > cp2k.popt  0000000000408599  Unknown  Unknown  Unknown
>> >> > > ************************************************
>>
>> >> > > Could you suggest any solutions to my problem? By the way, here is my
>> >> > > arch file:
>>
>> >> > > **********************************
>> >> > > INTEL_MKL = /opt/intel/mkl/10.2.4.032
>> >> > > INTEL_INC = $(INTEL_MKL)/include/fftw
>> >> > > INTEL_LIB = $(INTEL_MKL)/lib/em64t
>>
>> >> > > CC = mpicc
>> >> > > CPP =
>> >> > > FC = mpif90
>> >> > > LD = mpif90
>> >> > > AR = /usr/bin/ar -r
>> >> > > DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__SCALAPACK -D__BLACS \
>> >> > > -D__FFTW3 -D__LIBINT -D__HAS_NO_ISO_C_BINDING
>> >> > > CPPFLAGS = -C -traditional $(DFLAGS) -I$(INTEL_INC)
>> >> > > FCFLAGS = $(DFLAGS) -I$(INTEL_INC) \
>> >> > > -O2 -xW -funroll-loops -fpp -free -heap-arrays 64
>> >> > > LDFLAGS = $(FCFLAGS) -I$(INTEL_INC) -i-static
>> >> > > LIBS = -L$(INTEL_LIB) -lmkl_scalapack_lp64 \
>> >> > > -lmkl_blacs_intelmpi_lp64 \
>> >> > > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core \
>> >> > > /home/vakula/CP2K_NEW/fftw3_compiled/lib/libfftw3.a \
>> >> > > /home/vakula/CP2K_NEW/cp2k/tools/hfx_tools/libint_tools/libint_cpp_wrapper.o \
>> >> > > /home/vakula/CP2K_NEW/libint_compiled/lib/libderiv.a \
>> >> > > /home/vakula/CP2K_NEW/libint_compiled/lib/libint.a \
>> >> > > -lstdc++
>>
>> >> > > OBJECTS_ARCHITECTURE = machine_intel.o
>> >> > > ******************************
>>
>> >> > > Thank you in advance,
>> >> > > Nikita Vakula
>> >> > > PhD student, Moscow State University
>>
>> >> > --
>> >> > *************************************************************
>> >> > Jörg Saßmannshausen
>> >> > University College London
>> >> > Department of Chemistry
>> >> > Gordon Street
>> >> > London
>> >> > WC1H 0AJ
>>
>> >> > email: j.sassma... at ucl.ac.uk
>> >> > web: http://sassy.formativ.net
>>
>> >> > Please avoid sending me Word or PowerPoint attachments.
>> >> > See http://www.gnu.org/philosophy/no-word-attachments.html
>>
>> > --
>> > *************************************************************
>> > Jörg Saßmannshausen
>> > University College London
>> > Department of Chemistry
>> > Gordon Street
>> > London
>> > WC1H 0AJ
>>
>> > email: j.sassma... at ucl.ac.uk
>> > web: http://sassy.formativ.net
>>
>> > Please avoid sending me Word or PowerPoint attachments.
>> > See http://www.gnu.org/philosophy/no-word-attachments.html
>>
>> > --
>> > You received this message because you are subscribed to the Google Groups "cp2k" group.
>> > To post to this group, send email to cp... at googlegroups.com.
>> > To unsubscribe from this group, send email to cp2k+uns... at googlegroups.com.
>> > For more options, visit this group at http://groups.google.com/group/cp2k?hl=en.