[CP2K:3043] Re: Segfault with psmp

Alin Marin Elena alinm... at gmail.com
Tue Jan 11 19:45:16 UTC 2011


Hi Ondrej,

I had a look at the cp2k branch too.

With the Intel compiler it is the same as before: I still get a seg fault, 
but the error reported on one thread is different:

  ----------------------------------- OT ---------------------------------------

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------


forrtl: warning (402): fort: (1): In call to DBCSR_MULT_M_E_E, an array temporary was created for argument #4

forrtl: warning (402): fort: (1): In call to DBCSR_MULT_M_E_E, an array temporary was created for argument #4

forrtl: severe (408): fort: (2): Subscript #1 of the array RIGHT_COL_MAP has value 2 which is greater than the upper bound of 1

Image              PC                Routine            Line        Source             
cp2k.ssmp          0000000006F671BD  Unknown               Unknown  Unknown
cp2k.ssmp          0000000006F65CC5  Unknown               Unknown  Unknown
cp2k.ssmp          0000000006F01410  Unknown               Unknown  Unknown
cp2k.ssmp          0000000006EB066A  Unknown               Unknown  Unknown
cp2k.ssmp          0000000006EB0A62  Unknown               Unknown  Unknown
cp2k.ssmp          0000000006B1D64F  dbcsr_internal_op        1624  dbcsr_internal_operations.F
cp2k.ssmp          0000000006AF631C  dbcsr_internal_op        1155  dbcsr_internal_operations.F
libiomp5.so        00007FDFC2E67793  Unknown               Unknown  Unknown


This time even the GNU build crashes.
Strangely, it seems to run with 1 thread;
with more than 1 it crashes in different regions.
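
To narrow down which thread counts crash, something like the following could be used. It is only a sketch: run_with_threads and describe are hypothetical helper names, and the child command in the demo is a stand-in that just echoes the variable. For a real test one would pass the binary and input from this thread (cp2k.ssmp W3-H3O.inp) instead. It only sets OMP_NUM_THREADS in the child's environment; whether the program honours it is a separate question (see Ondrej's note quoted below).

```python
import os
import subprocess
import sys


def run_with_threads(cmd, nthreads):
    """Run cmd with OMP_NUM_THREADS set to nthreads and capture its output."""
    env = dict(os.environ)                  # copy, do not modify our own environment
    env["OMP_NUM_THREADS"] = str(nthreads)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


def describe(result):
    """Summarise how the child ended."""
    # On POSIX, a negative return code -N means the child was killed by signal N
    # (a segmentation fault is SIGSEGV, normally signal 11 on Linux).
    if result.returncode < 0:
        return f"killed by signal {-result.returncode}"
    return f"exit code {result.returncode}"


if __name__ == "__main__":
    # Stand-in child process: prints the thread count it was given.
    # Replace with e.g. ["./cp2k.ssmp", "W3-H3O.inp"] for an actual run.
    child = [sys.executable, "-c",
             "import os; print(os.environ['OMP_NUM_THREADS'])"]
    for n in (1, 2, 4, 8):
        r = run_with_threads(child, n)
        print(n, describe(r), r.stdout.strip())
```

Running each thread count a few times would also show whether the crash location really moves between runs.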

regards,
Alin


On Tuesday 11 January 2011 14:49:37 Ondrej Marsalek wrote:
> Additional confusion: the problem occurs within an 'IF (nthread > 1)',
> even if I set OMP_NUM_THREADS=1. When I print nthread in that place, I
> always get '8'. Is cp2k supposed to honor the value of the env
> variable? If not, what is the proper way to set the number of threads?
> 
> Thanks,
> Ondrej
> 
> PS:
> The threads can also be seen in gdb:
> 
> Starting program:
> /home/andy/build/cp2k/cp2k/exe/Linux-x86-64-intel/cp2k.ssmp W3-H3O.inp
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7ffff7fd3700 (LWP 32192)]
> 
>   **** **** ******  **  PROGRAM STARTED AT               2011-01-11 14:38:08.422
>  ***** ** ***  *** **   PROGRAM STARTED ON                             cassandra
>  **    ****   ******    PROGRAM STARTED BY                                  andy
>  ***** **    ** ** **   PROGRAM PROCESS ID                                 32189
>   **** **  *******  **  PROGRAM STARTED IN          /home/andy/W3-H3O-min-global
>
>  CP2K| version string:                 CP2K version 2.2.94 (Development Version)
>  CP2K| is freely available from                          http://cp2k.berlios.de/
>  CP2K| Program compiled at                          Tue Jan 11 14:00:00 CET 2011
>  CP2K| Program compiled on                                             cassandra
>  CP2K| Program compiled for                                   Linux-x86-64-intel
>  CP2K| Last CVS entry
>  CP2K| Input file name                                                W3-H3O.inp
> [New Thread 0x7ffff4fe4700 (LWP 32193)]
> [New Thread 0x7ffff4be3700 (LWP 32194)]
> [New Thread 0x7ffff47e2700 (LWP 32195)]
> [New Thread 0x7fffeffff700 (LWP 32196)]
> [New Thread 0x7fffefbfe700 (LWP 32197)]
> [New Thread 0x7fffef7fd700 (LWP 32198)]
> [New Thread 0x7fffef3fc700 (LWP 32199)]
> 
> 
> On Tue, Jan 11, 2011 at 14:25, Ondrej Marsalek <ondrej.... at gmail.com> wrote:
> > I have simplified the problem further by doing a ssmp build. The arch
> > file can be found here:
> > 
> > http://marge.uochb.cas.cz/~marsalek/tmp/Linux-x86-64-intel.psmp
> > 
> > And the problem persists as described, regardless of the number of
> > OpenMP threads.
> > 
> > Any ideas how to get this working?
> > 
> > Thanks,
> > Ondrej
> > 
> > On Thu, Jan 6, 2011 at 14:04, Ondrej Marsalek <ondrej.... at gmail.com> wrote:
> >> Dear all,
> >> 
> >> I get a segfault with a psmp build of cp2k trunk. The corresponding
> >> popt works. The problem occurs even when run as a single process and
> >> with OMP_NUM_THREADS=1. This is what it looks like to gdb:
> >> 
> >> ==========
> >> ...
> >>  Spin 1
> >> 
> >>  Number of electrons:                                                     17
> >>  Number of occupied orbitals:                                             17
> >>  Number of molecular orbitals:                                            17
> >> 
> >>  Spin 2
> >> 
> >>  Number of electrons:                                                     16
> >>  Number of occupied orbitals:                                             16
> >>  Number of molecular orbitals:                                            16
> >> 
> >>  Number of orbital functions:                                            169
> >>  Number of independent orbital functions:                                169
> >> 
> >>  Extrapolation method: initial_guess
> >> 
> >> Program received signal SIGSEGV, Segmentation fault.
> >> [Switching to Thread 0x7fffef887700 (LWP 23493)]
> >> __libc_free (mem=0x2020202000000001) at malloc.c:3709
> >> 3709    malloc.c: No such file or directory.
> >>        in malloc.c
> >> (gdb) backtrace
> >> #0  __libc_free (mem=0x2020202000000001) at malloc.c:3709
> >> #1  0x000000000257a6ec in for_deallocate ()
> >> #2  0x00000000016d413b in QS_COLLOCATE_DENSITY::L_qs_collocate_density_mp_calculate_rho_elec__917__par_region0_2_110 ()
> >>    at /home/andy/build/cp2k/cp2k/makefiles/../src/qs_collocate_density.F:1163
> >> #3  0x0000000002634763 in L_kmp_invoke_pass_parms ()
> >> #4  0x00007fffffff53fc in ?? ()
> >> #5  0x00007fffffff53c4 in ?? ()
> >> #6  0x00007fffffff5380 in ?? ()
> >> ...
> >> ==========
> >> 
> >> The CPU is a Core i7. The arch file and the input that triggers the
> >> segfault (almost immediately after start) are here:
> >> 
> >> http://marge.uochb.cas.cz/~marsalek/tmp/W3-H3O-min-global.tar.gz
> >> http://marge.uochb.cas.cz/~marsalek/tmp/Linux-x86-64-intel.psmp
> >> 
> >> The versions used for the build are:
> >> cp2k trunk checked out today
> >> OpenMPI 1.5
> >> Intel Compiler 11.1.073 and corresponding MKL
> >> 
> >> I understand that this might be difficult or impossible to reproduce,
> >> but I would be grateful for any suggestions as to how to resolve
> >> this.
> >> 
> >> Thanks,
> >> Ondrej
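
A side note on the gdb backtrace quoted above: the address passed to __libc_free, 0x2020202000000001, has 0x20 in each of its four upper bytes, and 0x20 is the ASCII space. That could mean (this is a guess, not something the trace proves) that the pointer being deallocated was uninitialised or had been overwritten with blank-padded character data. The sketch below just splits the address into bytes; pointer_bytes and looks_like_user_address are made-up helper names, and the 47-bit limit assumes the usual x86-64 Linux user-space layout.

```python
def pointer_bytes(value):
    """Return the 8 bytes of a 64-bit address, most significant byte first."""
    return value.to_bytes(8, byteorder="big")


def looks_like_user_address(value):
    """Assumption: x86-64 Linux user-space addresses fit in the lower 47 bits."""
    return 0 < value < 2 ** 47


addr = 0x2020202000000001          # 'mem' argument from the gdb backtrace
raw = pointer_bytes(addr)

print(raw.hex(" "))                # the address as space-separated hex bytes
print(raw.count(b" "))             # number of bytes equal to 0x20 (ASCII space)
print(looks_like_user_address(addr))
```

For comparison, the thread addresses in the same gdb session (0x7fff...) do fall inside that range, while this one does not.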
-- 
I force myself to contradict myself in order to avoid conforming to my own 
taste. -- Marcel Duchamp
Without Questions there are no Answers!
_____________________________________________________________________
Alin Marin ELENA
Advanced Molecular Simulation Research Laboratory  
School of Physics, University College Dublin
----  
Ardionsamblú Móilíneach Saotharlann Taighde
Scoil na Fisice, An Coláiste Ollscoile, Baile Átha Cliath
-----------------------------------------------------------------------------------
Address:
Room 318, UCD Engineering and Material Science Centre
University College Dublin
Belfield, Dublin 4, Ireland
-----------------------------------------------------------------------------------
http://alin.elenaworld.net
alin.... at ucdconnect.ie, alinm... at gmail.com
______________________________________________________________________


