Segfault with psmp
Ondrej Marsalek
ondrej.... at gmail.com
Tue Jan 11 13:49:37 UTC 2011
An additional point of confusion: the problem occurs inside an
'IF (nthread > 1)' block, even if I set OMP_NUM_THREADS=1. When I print
nthread at that point, I always get '8'. Is cp2k supposed to honor the
value of this environment variable? If not, what is the proper way to
set the number of threads?
Thanks,
Ondrej
PS:
The threads can also be seen in gdb:
Starting program: /home/andy/build/cp2k/cp2k/exe/Linux-x86-64-intel/cp2k.ssmp W3-H3O.inp
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff7fd3700 (LWP 32192)]
  **** **** ******  **  PROGRAM STARTED AT 2011-01-11 14:38:08.422
 ***** ** ***  *** **   PROGRAM STARTED ON cassandra
 **    ****   ******    PROGRAM STARTED BY andy
 ***** **    ** ** **   PROGRAM PROCESS ID 32189
  **** **  *******  **  PROGRAM STARTED IN /home/andy/W3-H3O-min-global
CP2K| version string: CP2K version 2.2.94 (Development Version)
CP2K| is freely available from http://cp2k.berlios.de/
CP2K| Program compiled at Tue Jan 11 14:00:00 CET 2011
CP2K| Program compiled on cassandra
CP2K| Program compiled for Linux-x86-64-intel
CP2K| Last CVS entry
CP2K| Input file name W3-H3O.inp
[New Thread 0x7ffff4fe4700 (LWP 32193)]
[New Thread 0x7ffff4be3700 (LWP 32194)]
[New Thread 0x7ffff47e2700 (LWP 32195)]
[New Thread 0x7fffeffff700 (LWP 32196)]
[New Thread 0x7fffefbfe700 (LWP 32197)]
[New Thread 0x7fffef7fd700 (LWP 32198)]
[New Thread 0x7fffef3fc700 (LWP 32199)]
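
One way to cross-check the thread count against the environment is to
count the thread-creation lines in a saved gdb log. The sketch below is
only illustrative: the helper name count_new_threads and the embedded
log excerpt are assumptions made for this example, not part of cp2k or
gdb. The count may also include threads that the OpenMP runtime creates
for its own purposes, so it need not equal nthread exactly.

```python
import os
import re

# Thread-related lines copied from the gdb output in this message.
GDB_LOG = """\
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff7fd3700 (LWP 32192)]
[New Thread 0x7ffff4fe4700 (LWP 32193)]
[New Thread 0x7ffff4be3700 (LWP 32194)]
[New Thread 0x7ffff47e2700 (LWP 32195)]
[New Thread 0x7fffeffff700 (LWP 32196)]
[New Thread 0x7fffefbfe700 (LWP 32197)]
[New Thread 0x7fffef7fd700 (LWP 32198)]
[New Thread 0x7fffef3fc700 (LWP 32199)]
"""

# Matches one complete gdb thread announcement per line.
NEW_THREAD = re.compile(r"^\[New Thread 0x[0-9a-f]+ \(LWP \d+\)\]$", re.MULTILINE)


def count_new_threads(log):
    """Count gdb '[New Thread ...]' announcements in a log (hypothetical helper)."""
    return len(NEW_THREAD.findall(log))


# Value the shell exported, if any; the program may or may not honor it.
requested = os.environ.get("OMP_NUM_THREADS", "(unset)")
created = count_new_threads(GDB_LOG)
print(f"OMP_NUM_THREADS={requested}, threads announced by gdb: {created}")
```

For the excerpt above, gdb announces eight new threads in addition to
the initial one, regardless of the OMP_NUM_THREADS setting.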
On Tue, Jan 11, 2011 at 14:25, Ondrej Marsalek
<ondrej.... at gmail.com> wrote:
> I have simplified the problem further by doing an ssmp build. The arch
> file can be found here:
>
> http://marge.uochb.cas.cz/~marsalek/tmp/Linux-x86-64-intel.psmp
>
> And the problem persists as described, regardless of the number of
> OpenMP threads.
>
> Any ideas how to get this working?
>
> Thanks,
> Ondrej
>
>
> On Thu, Jan 6, 2011 at 14:04, Ondrej Marsalek <ondrej.... at gmail.com> wrote:
>> Dear all,
>>
>> I get a segfault with a psmp build of cp2k trunk. The corresponding
>> popt works. The problem occurs even when run as a single process and
>> with OMP_NUM_THREADS=1. This is what it looks like to gdb:
>>
>> ==========
>> ...
>> Spin 1
>>
>> Number of electrons: 17
>> Number of occupied orbitals: 17
>> Number of molecular orbitals: 17
>>
>> Spin 2
>>
>> Number of electrons: 16
>> Number of occupied orbitals: 16
>> Number of molecular orbitals: 16
>>
>> Number of orbital functions: 169
>> Number of independent orbital functions: 169
>>
>> Extrapolation method: initial_guess
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7fffef887700 (LWP 23493)]
>> __libc_free (mem=0x2020202000000001) at malloc.c:3709
>> 3709 malloc.c: No such file or directory.
>> in malloc.c
>> (gdb) backtrace
>> #0 __libc_free (mem=0x2020202000000001) at malloc.c:3709
>> #1 0x000000000257a6ec in for_deallocate ()
>> #2 0x00000000016d413b in QS_COLLOCATE_DENSITY::L_qs_collocate_density_mp_calculate_rho_elec__917__par_region0_2_110 ()
>>    at /home/andy/build/cp2k/cp2k/makefiles/../src/qs_collocate_density.F:1163
>> #3 0x0000000002634763 in L_kmp_invoke_pass_parms ()
>> #4 0x00007fffffff53fc in ?? ()
>> #5 0x00007fffffff53c4 in ?? ()
>> #6 0x00007fffffff5380 in ?? ()
>> ...
>> ==========
>>
>> The CPU is a Core i7. The arch file and the input that triggers the
>> segfault (almost immediately after start) are here:
>>
>> http://marge.uochb.cas.cz/~marsalek/tmp/W3-H3O-min-global.tar.gz
>> http://marge.uochb.cas.cz/~marsalek/tmp/Linux-x86-64-intel.psmp
>>
>> The versions used for the build are:
>> cp2k trunk checked out today
>> OpenMPI 1.5
>> Intel Compiler 11.1.073 and corresponding MKL
>>
>> I understand that this might be difficult or impossible to reproduce,
>> but I would be grateful for any suggestions on how to try to resolve
>> this.
>>
>> Thanks,
>> Ondrej
>>
>
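
A note on the address in frame #0 of the backtrace quoted above: the
value passed to free(), 0x2020202000000001, does not look like a heap
pointer. The sketch below simply splits it into bytes; the helper name
describe_bytes is made up for this example. The upper four bytes are
0x20, the ASCII space character, which might hint that the pointer was
overwritten with blank character data or was never initialized. That is
a guess based on the value alone, not a diagnosis.

```python
# Address passed to free() in frame #0 of the quoted backtrace.
ADDR = 0x2020202000000001


def describe_bytes(addr):
    """Return the 8 bytes of a 64-bit value, most significant first.

    Printable ASCII bytes are shown as quoted characters, others as hex.
    (Hypothetical helper, written only for this illustration.)
    """
    raw = addr.to_bytes(8, byteorder="big")
    return [repr(chr(b)) if 0x20 <= b < 0x7F else f"0x{b:02x}" for b in raw]


print(describe_bytes(ADDR))
```

The first four entries come out as space characters and the remaining
four as the bytes 0x00, 0x00, 0x00, 0x01.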