[CP2K:3043] Re: Segfault with psmp

Urban Borštnik urban.b... at gmail.com
Wed Jan 12 09:15:55 CET 2011


Hello,

On Tue, 2011-01-11 at 14:49 +0100, Ondrej Marsalek wrote:
> Additional confusion: the problem occurs within an 'IF (nthread > 1)',
> even if I set OMP_NUM_THREADS=1. When I print nthread in that place, I
> always get '8'. Is cp2k supposed to honor the value of the env
> variable? If not, what is the proper way to set the number of threads?

The OMP_NUM_THREADS environment variable is not interpreted by CP2K but
by the threading library that implements OpenMP.

The nthread value you printed is obtained by a call to
OMP_GET_MAX_THREADS.  Its return value of 8 in your case seems to
conflict with the specified behavior, which is to return the number of
threads to be used in subsequent OMP PARALLEL regions.  Those regions
continue to use only one thread.
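
To check what the OpenMP runtime itself reports, independently of CP2K,
something like the following minimal C sketch could help.  It is not CP2K
code; the function names are made up for illustration, and it assumes a
compiler with OpenMP enabled (without it, the fallbacks simply return 1).
It compares the value reported by omp_get_max_threads with the team size
actually used inside a parallel region.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Upper bound on the team size of the next parallel region, as
   reported by the OpenMP runtime (1 if built without OpenMP).
   Hypothetical helper, not part of CP2K. */
int reported_max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Team size actually used by a parallel region.
   Hypothetical helper, not part of CP2K. */
int actual_team_size(void)
{
    int team = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* Exactly one thread of the team records the team size. */
        #pragma omp single
        team = omp_get_num_threads();
    }
#endif
    assert(team >= 1);
    return team;
}

/* Print both values so they can be compared with OMP_NUM_THREADS. */
void report_thread_counts(void)
{
    printf("max threads reported:       %d\n", reported_max_threads());
    printf("threads in parallel region: %d\n", actual_team_size());
}
```

Calling report_thread_counts() from a small main program, built with the
same compiler and OpenMP flags as CP2K and run with OMP_NUM_THREADS=1,
should show whether the runtime honors the variable outside of CP2K.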

Cheers,
Urban.

> Thanks,
> Ondrej
> 
> PS:
> The threads can also be seen in gdb:
> 
> Starting program:
> /home/andy/build/cp2k/cp2k/exe/Linux-x86-64-intel/cp2k.ssmp W3-H3O.inp
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7ffff7fd3700 (LWP 32192)]
> 
>   **** **** ******  **  PROGRAM STARTED AT               2011-01-11 14:38:08.422
>  ***** ** ***  *** **   PROGRAM STARTED ON                             cassandra
>  **    ****   ******    PROGRAM STARTED BY                                  andy
>  ***** **    ** ** **   PROGRAM PROCESS ID                                 32189
>   **** **  *******  **  PROGRAM STARTED IN          /home/andy/W3-H3O-min-global
> 
>  CP2K| version string:                 CP2K version 2.2.94 (Development Version)
>  CP2K| is freely available from                          http://cp2k.berlios.de/
>  CP2K| Program compiled at                          Tue Jan 11 14:00:00 CET 2011
>  CP2K| Program compiled on                                             cassandra
>  CP2K| Program compiled for                                   Linux-x86-64-intel
>  CP2K| Last CVS entry
>  CP2K| Input file name                                                W3-H3O.inp
> [New Thread 0x7ffff4fe4700 (LWP 32193)]
> [New Thread 0x7ffff4be3700 (LWP 32194)]
> [New Thread 0x7ffff47e2700 (LWP 32195)]
> [New Thread 0x7fffeffff700 (LWP 32196)]
> [New Thread 0x7fffefbfe700 (LWP 32197)]
> [New Thread 0x7fffef7fd700 (LWP 32198)]
> [New Thread 0x7fffef3fc700 (LWP 32199)]
> 
> 
> On Tue, Jan 11, 2011 at 14:25, Ondrej Marsalek
> <ondrej.... at gmail.com> wrote:
> > I have simplified the problem further by doing a ssmp build. The arch
> > file can be found here:
> >
> > http://marge.uochb.cas.cz/~marsalek/tmp/Linux-x86-64-intel.psmp
> >
> > And the problem persists as described, regardless of the number of
> > OpenMP threads.
> >
> > Any ideas how to get this working?
> >
> > Thanks,
> > Ondrej
> >
> >
> > On Thu, Jan 6, 2011 at 14:04, Ondrej Marsalek <ondrej.... at gmail.com> wrote:
> >> Dear all,
> >>
> >> I get a segfault with a psmp build of cp2k trunk. The corresponding
> >> popt works. The problem occurs even when run as a single process and
> >> with OMP_NUM_THREADS=1. This is what it looks like to gdb:
> >>
> >> ==========
> >> ...
> >>  Spin 1
> >>
> >>  Number of electrons:                                                         17
> >>  Number of occupied orbitals:                                                 17
> >>  Number of molecular orbitals:                                                17
> >>
> >>  Spin 2
> >>
> >>  Number of electrons:                                                         16
> >>  Number of occupied orbitals:                                                 16
> >>  Number of molecular orbitals:                                                16
> >>
> >>  Number of orbital functions:                                                169
> >>  Number of independent orbital functions:                                    169
> >>
> >>  Extrapolation method: initial_guess
> >>
> >> Program received signal SIGSEGV, Segmentation fault.
> >> [Switching to Thread 0x7fffef887700 (LWP 23493)]
> >> __libc_free (mem=0x2020202000000001) at malloc.c:3709
> >> 3709    malloc.c: No such file or directory.
> >>        in malloc.c
> >> (gdb) backtrace
> >> #0  __libc_free (mem=0x2020202000000001) at malloc.c:3709
> >> #1  0x000000000257a6ec in for_deallocate ()
> >> #2  0x00000000016d413b in
> >> QS_COLLOCATE_DENSITY::L_qs_collocate_density_mp_calculate_rho_elec__917__par_region0_2_110
> >> ()
> >>    at /home/andy/build/cp2k/cp2k/makefiles/../src/qs_collocate_density.F:1163
> >> #3  0x0000000002634763 in L_kmp_invoke_pass_parms ()
> >> #4  0x00007fffffff53fc in ?? ()
> >> #5  0x00007fffffff53c4 in ?? ()
> >> #6  0x00007fffffff5380 in ?? ()
> >> ...
> >> ==========
> >>
> >> The CPU is a Core i7. The arch file and the input that triggers the
> >> segfault (almost immediately after start) are here:
> >>
> >> http://marge.uochb.cas.cz/~marsalek/tmp/W3-H3O-min-global.tar.gz
> >> http://marge.uochb.cas.cz/~marsalek/tmp/Linux-x86-64-intel.psmp
> >>
> >> The versions used for the build are:
> >> cp2k trunk checked out today
> >> OpenMPI 1.5
> >> Intel Compiler 11.1.073 and corresponding MKL
> >>
> >> I understand that this might be difficult or impossible to reproduce,
> >> but would be grateful for any suggestions as for how to try to resolve
> >> this.
> >>
> >> Thanks,
> >> Ondrej
> >>
> >
> 




