SIGSEGV error during cell optimization

Maricarme... at cemes.fr
Wed May 6 13:24:45 UTC 2009


Thanks Teo,

Actually, the Intel Fortran compiler is version 10.1.017. I can't find
any comments on this particular version. I found something on 10.1.018
though, and it seemed to work fine.
On the machine there is also version 11.0.83, but I found some messages
on the list reporting problems with the latest compilers (e.g. the
version 11 series).
For the plain popt CP2K version I'll have to ask the administrators to
recompile the code (they did it the first time), so I might as well ask
them to use the newer compiler this time. Otherwise, do you think it's
better to compile the popt version with the same compiler
(i.e. 10.1.017)?
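
Just so I ask them for the right thing: if I understand the standard
CP2K build correctly, the popt version would use the same arch file as
the one quoted below, only without the OpenMP flag, and would be built
with the VERSION=popt target (please correct me if the target or the
binary name is different on this machine):

  FC       = ifort -diag-disable remark
  LD       = ifort -diag-disable remark
  # everything else as in the psmp arch file quoted below

  # then, from the cp2k/makefiles directory:
  make ARCH=Linux-x86-64-jade VERSION=popt
  # which should produce exe/Linux-x86-64-jade/cp2k.popt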

Ciao,

Maricarmen
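
P.S. Re-reading the comments at the top of the arch file about stack
limits: the job script below never sets the stack limit itself. As an
extra check (probably redundant if the limit really is unlimited
system-wide, and untested on my side) one could add, just before the
mpiexec line:

  ulimit -s unlimited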



On 6 May, 09:56, Teodoro Laino <teodor... at gmail.com> wrote:
> Hi Maricarmen,
>
> could you try a plain popt version, without the SMP support?
> Also keep ompthreads=1 in the submission script.
>
> Which version of the Intel compiler are you using? Did you check on
> this mailing list that it is a "good one"?
> Just in case, do you have access to other compilers on that machine?
>
> Teo
>
> Maricarme... at cemes.fr wrote:
> > Hello everyone,
>
> > I'm running a DFT cell optimization for Mx-V4O11 crystals (M = Ag and
> > Cu). My cells are approximately 14x7x7 and contain about 260 atoms. A
> > copy of one of my input files is below. The problem is that I keep
> > getting a SIGSEGV (11) error, usually when starting the SCF cycles of
> > the second cell optimization step (an extract from the output file is
> > also below).
> > I'm running in parallel at a computing centre (http://www.cines.fr/spip.php?rubrique186),
> > and the administrators have already checked the stack size (which,
> > according to them, is set to unlimited). Below are also copies of the
> > job submission file and of the arch file.
> > I even tried to run a cell optimization test on a smaller cell (14*3*3,
> > about 68 atoms), which I had already run at a different computing
> > centre without any issues, and I still get the segmentation fault.
> > This suggests that the problem is related to the configuration of the
> > machines, to the way CP2K was installed, or to the job submission
> > settings (or to something else?). I should add that I always get the
> > exact same error during the second cell optimization step, no matter
> > what the system is (small or big cell, Ag or Cu).
> > I tried running an energy calculation on the smaller cell and it
> > worked fine.
>
> > I would really appreciate it if any of you could shed some light on
> > this, as I'm pretty stuck on it right now.
>
> > Cheers,
>
> > Maricarmen.
>
> > Arch file:
>
> > # By default some Intel compilers put temporaries on the stack.
> > # This might lead to segmentation faults if the stack limit is set too low.
> > # Stack limits can be increased by sysadmins or e.g. with: ulimit -s 256000
> > # Tested on HPC non-Itanium clusters @ UDS (France)
> > # Note: -O2 produces an executable which is slightly faster than -O3,
> > # and the compilation time was also much shorter.
> > CC       = icc -diag-disable remark
> > CPP      =
> > FC       = ifort -diag-disable remark -openmp
> > LD       = ifort -diag-disable remark -openmp
> > AR       = ar -r
>
> > #Better with mkl (intel lapack/blas) only
> > #DFLAGS   = -D__INTEL -D__FFTSG -D__parallel
> > #If you want to use BLACS and SCALAPACK use the flags below
> > DFLAGS   = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3
> > CPPFLAGS =
> > FCFLAGS  = $(DFLAGS) -fpp -free -O3 -xS -I/opt/software/SGI/intel/mkl/10.0.3.020/include -I/opt/software/SGI/intel/mkl/10.0.3.020/include/fftw
> > LDFLAGS  =  -L/opt/software/SGI/intel/mkl/10.0.3.020/lib/em64t
> > #LIBS     = -lmkl -lm -lpthread -lguide -openmp
> > #If you want to use BLACS and SCALAPACK use the libraries below
> > LIBS     = -Wl,--allow-multiple-definition -lmkl_scalapack_lp64 /scratch/grisolia/blacsF77init_MPI-LINUX-0.a /scratch/grisolia/blacs_MPI-LINUX-0.a -lmpi -lmkl -lfftw3xf_intel -lmkl_blacs_lp64
>
> > OBJECTS_ARCHITECTURE = machine_intel.o
>
> > -------
>
> > Job submission file (which gives the SIGSEGV error):
>
> > #PBS -N cp2k
> > #PBS -l walltime=24:00:00
> > #PBS -S /bin/bash
> > #PBS -l select=8:ncpus=8:mpiprocs=8:ompthreads=1
> > #PBS -j oe
> > #PBS -M  gris... at cemes.fr -m abe
>
> > PBS_O_WORKDIR=/scratch/grisolia/CuVO/Fixed/
>
> > cd $PBS_O_WORKDIR
>
> > export OMP_NUM_THREADS=1
> > export MKL_NUM_THREADS=1
> > export MPI_GROUP_MAX=512
>
> > /usr/pbs/bin/mpiexec /scratch/grisolia/cp2k/exe/Linux-x86-64-jade/cp2k.psmp CuV4O11-CellOpt.inp
>
> > --------------
>
> > Input file:
>
> > &GLOBAL
> >   PROJECT     CuV4O11-CellOpt
> >   RUN_TYPE    CELL_OPT
> >   PRINT_LEVEL MEDIUM
> >   WALLTIME  86000
> > &END GLOBAL
> > &FORCE_EVAL
> >   METHOD Quickstep
> >   &DFT
> >     BASIS_SET_FILE_NAME /scratch/grisolia/cp2k/tests/QS/BASIS_MOLOPT
> >     POTENTIAL_FILE_NAME /scratch/grisolia/cp2k/tests/QS/GTH_POTENTIALS
> >     LSD
> >     &MGRID
> >       CUTOFF 280
> >       NGRIDS 5
> >     &END MGRID
> >     &QS
> >       EPS_DEFAULT   1.0E-10
> >       EXTRAPOLATION PS
> >       EXTRAPOLATION_ORDER 1
> >     &END QS
> >     &SCF
> >       SCF_GUESS RESTART
> >       EPS_SCF 2.0E-7
> >       MAX_SCF 30
> >       &OUTER_SCF
> >          EPS_SCF 2.0E-7
> >          MAX_SCF 15
> >       &END
> >       &OT
> >         MINIMIZER CG
> >         PRECONDITIONER FULL_SINGLE_INVERSE
> >         ENERGY_GAP 0.05
> >       &END
> >       &PRINT
> >          &RESTART
> >             FILENAME = CuV4O11-CellOpt.wfn
> >          &END
> >       &END
> >     &END SCF
> >     &XC
> >       &XC_FUNCTIONAL PBE
> >       &END XC_FUNCTIONAL
> >     &END XC
> >      &PRINT
> >        &MO_CUBES
> >           WRITE_CUBE F
> >           NLUMO      20
> >           NHOMO      20
> >        &END
> >      &END
> >   &END DFT
> >   &SUBSYS
> >     &CELL
> >       @INCLUDE CuV4O11-GeoOpt.cell
> >     &END CELL
> >     &COORD
> >       @INCLUDE CuV4O11-GeoOpt.coord
> >     &END COORD
> >     &KIND Cu
> >       BASIS_SET DZVP-MOLOPT-SR-GTH
> >       POTENTIAL GTH-PBE-q11
> >     &END KIND
> >     &KIND O
> >       BASIS_SET DZVP-MOLOPT-SR-GTH
> >       POTENTIAL GTH-PBE-q6
> >     &END KIND
> >     &KIND V
> >       BASIS_SET DZVP-MOLOPT-SR-GTH
> >       POTENTIAL GTH-PBE-q13
> >     &END KIND
> >   &END SUBSYS
> >   STRESS_TENSOR ANALYTICAL
> > &END FORCE_EVAL
> > &MOTION
> >   &MD
> >       TIMESTEP [fs] 0.5
> >       STEPS         10000
> >       TEMPERATURE   500
> >       ENSEMBLE      NVE
> >   &END
> >   &CELL_OPT
> >     TYPE GEO_OPT
> >     OPTIMIZER CG
> >     MAX_ITER 20
> >     EXTERNAL_PRESSURE [bar] 0.0
> >     MAX_DR 0.02
> >     RMS_DR 0.01
> >     MAX_FORCE 0.002
> >     RMS_FORCE 0.001
> >     KEEP_ANGLES T
> >     &CG
> >       &LINE_SEARCH
> >         TYPE 2PNT
> >         &2PNT
> >         &END
> >       &END
> >     &END
> >   &END
> >   &GEO_OPT
> >     MAX_ITER 300
> >     MINIMIZER LBFGS
> >   &END
> > &END
>
> > -------
>
> > Extract from the output file (SIGSEGV error):
>
> > MPI: On host r17i2n5, Program /scratch/cem6039/grisolia/cp2k/exe/Linux-x86-64-jade/cp2k.psmp, Rank 0, Process 4568 received signal SIGSEGV (11)
>
> > MPI: --------stack traceback-------
> > MPI: On host r17i3n12, Program /scratch/cem6039/grisolia/cp2k/exe/Linux-x86-64-jade/cp2k.psmp, Rank 57, Process 29665 received signal SIGSEGV (11)
>
> > MPI: --------stack traceback-------
> > MPI: On host r17i3n0, Program /scratch/cem6039/grisolia/cp2k/exe/Linux-x86-64-jade/cp2k.psmp, Rank 25, Process 542 received signal SIGSEGV (11)
>
> > MPI: --------stack traceback-------
> > MPI: On host r17i3n1, Program /scratch/cem6039/grisolia/cp2k/exe/Linux-x86-64-jade/cp2k.psmp, Rank 32, Process 5057 received signal SIGSEGV (11)
>
> > MPI: --------stack traceback-------
> > MPI: GNU gdb 6.6
> > MPI: Copyright (C) 2006 Free Software Foundation, Inc.
> > MPI: GDB is free software, covered by the GNU General Public License, and you are
> > MPI: welcome to change it and/or distribute copies of it under certain conditions.
> > MPI: Type "show copying" to see the conditions.
> > MPI: There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > MPI: This GDB was configured as "x86_64-suse-linux"...
> > MPI: Using host libthread_db library "/lib64/libthread_db.so.1".
> > MPI: Attaching to program: /proc/4568/exe, process 4568
> > MPI: [Thread debugging using libthread_db enabled]
> > MPI: [New Thread 46912551614368 (LWP 4568)]
> > MPI: [New Thread 1073809728 (LWP 4588)]
> > MPI: 0x00002aaaad94073f in waitpid () from /lib64/libpthread.so.0
> > MPI: (gdb) #0  0x00002aaaad94073f in waitpid () from /lib64/libpthread.so.0
> > MPI: #1  0x00002aaaaadb5133 in MPI_SGI_stacktraceback () from /usr/lib64/libmpi.so
> > MPI: #2  0x00002aaaaadb5773 in slave_sig_handler () from /usr/lib64/libmpi.so
> > MPI: #3  <signal handler called>
> > MPI: #4  0x00000000017f7ad0 in fftw_destroy_plan ()
> > MPI: #5  0x00000000017f794d in dfftw_destroy_plan_ ()
> > MPI: #6  0x000000000169332a in fftw3_destroy_plan_ ()
> > MPI: #7  0x000000000169199e in fft_destroy_plan_ ()
> > MPI: #8  0x000000000044229e in fft_tools_mp_deallocate_fft_scratch_type_ ()
> > MPI: #9  0x00000000004678ae in fft_tools_mp_resize_fft_scratch_pool_ ()
> > MPI: #10 0x00000000004556c8 in fft_tools_mp_get_fft_scratch_ ()
> > MPI: #11 0x000000000046ca53 in fft_tools_mp_fft3d_ps_ ()
> > MPI: #12 0x00000000007360c9 in pw_methods_mp_fft_wrap_pw1pw2_ ()
> > MPI: #13 0x0000000000732a01 in pw_methods_mp_pw_transfer_ ()
> > MPI: #14 0x0000000000778bcd in qs_collocate_density_mp_density_rs2pw_ ()
> > MPI: #15 0x0000000000777ba3 in qs_collocate_density_mp_calculate_rho_elec_ ()
> > MPI: #16 0x00000000008fdda2 in qs_rho_methods_mp_qs_rho_update_rho_ ()
> > MPI: #17 0x00000000009647e0 in qs_wf_history_methods_mp_wfi_extrapolate_ ()
> > MPI: #18 0x0000000000916d5b in qs_scf_mp_scf_env_initial_rho_setup_ ()
> > MPI: #19 0x00000000009147fc in qs_scf_mp_init_scf_run_ ()
> > MPI: #20 0x0000000000902413 in qs_scf_mp_scf_ ()
> > MPI: #21 0x000000000078b347 in qs_energy_mp_qs_energies_scf_ ()
> > MPI: #22 0x000000000078a20d in qs_energy_mp_qs_energies_ ()
> > MPI: #23 0x0000000000799e54 in qs_force_mp_qs_forces_ ()
> > MPI: #24 0x000000000049403c in force_env_methods_mp_force_env_calc_energy_force_ ()
> > MPI: #25 0x00000000004ba72f in cp_eval_at_ ()
> > MPI: #26 0x0000000000ef4986 in cp_lbfgs_optimizer_gopt_mp_cp_opt_gopt_step_ ()
> > MPI: #27 0x0000000000eee975 in cp_lbfgs_optimizer_gopt_mp_cp_opt_gopt_next_ ()
> > MPI: #28 0x0000000000eee653 in cp_lbfgs_geo_mp_geoopt_lbfgs_ ()
> > MPI: #29 0x00000000004b3539 in geo_opt_mp_cp_geo_opt_low_ ()
> > MPI: #30 0x00000000004b376d in geo_opt_mp_cp_geo_opt_ ()
> > MPI: #31 0x00000000004b977e in cp_eval_at_ ()
> > MPI: #32 0x0000000000e23d5c in cg_utils_mp_linmin_2pnt_ ()
> > MPI: #33
>
> ...
>

