[CP2K:4489] building hybrid MPI+OpenMP

Iain Bethune ibet... at epcc.ed.ac.uk
Mon Jul 15 08:45:06 UTC 2013


Hi Steve,

Try rebuilding without the -heap-arrays 64 flag. I have seen problems with this flag in combination with OpenMP before, but have not yet got to the root cause. Let me know whether it helps or not!
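
As a minimal sketch of that change, assuming the rest of the arch file quoted below stays exactly as it is, the FCFLAGS line would lose only the one flag. This is untested and is not a confirmed fix:

```make
# FCFLAGS copied from the quoted arch file, with "-heap-arrays 64" dropped.
# Every other flag is unchanged; FCFLAGS2 never contained the flag.
FCFLAGS  = $(DFLAGS)  -O2 -free -funroll-loops -fpp -axAVX \
           -openmp -mt_mpi -I$(FFTW_INC)
```

A full clean rebuild is presumably needed afterwards, so that no object files compiled with the old flag are reused.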

Thanks

- Iain

--

Iain Bethune
Project Manager, EPCC

Email: ibet... at epcc.ed.ac.uk
Twitter: @IainBethune
Web: http://www2.epcc.ed.ac.uk/~ibethune
Tel/Fax: +44 (0)131 650 5201/6555
Mob: +44 (0)7598317015
Addr: 2404 JCMB, The King's Buildings, Mayfield Road, Edinburgh, EH9 3JZ

On 8 Jul 2013, at 16:09, Steve Schmerler wrote:

> Hello
> 
> I'm trying to compile a hybrid MPI+OpenMP version with
> 
> * ifort 12.1
> * mkl 10.3
> * intel MPI 4.0.3  
> * fftw 3.3.3 threaded
> * scalapack from mkl
> 
> The arch file:
> 
>    -----------------------------------------------------------------------
>    MKL_LIB=$(MKLROOT)/lib/intel64
>    MKL_INC=$(MKLROOT)/include
>    FFTW_LIB=/home/schmerler/soft/lib/fftw/intel/3.3.3/lib
>    FFTW_INC=/home/schmerler/soft/lib/fftw/intel/3.3.3/include
>    CC       = cc
>    CPP      = 
>    FC       = mpiifort
>    LD       = mpiifort
>    AR       = ar -r
>    DFLAGS   = -D__INTEL -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3
>    CPPFLAGS =
>    FCFLAGS  = $(DFLAGS)  -O2 -free -heap-arrays 64 -funroll-loops -fpp -axAVX \
>               -openmp -mt_mpi -I$(FFTW_INC)
>    FCFLAGS2 = $(DFLAGS)  -O1 -free
>    LIBS     = -lfftw3_threads -lfftw3 -liomp5 \
>               -lmkl_blacs_intelmpi_lp64 -lmkl_scalapack_lp64 \
>               -lmkl_intel_lp64 -lmkl_core -lmkl_sequential \
>    LDFLAGS  = $(FCFLAGS) -L$(FFTW_LIB) -I$(FFTW_INC) -L$(MKL_LIB) -I$(MKL_INC) $(LIBS)
> 
>    OBJECTS_ARCHITECTURE = machine_intel.o
>    graphcon.o: graphcon.F
>            $(FC) -c $(FCFLAGS2) $<
>    -----------------------------------------------------------------------
> 
> I see different errors, depending on which combo of MPI tasks and threads is
> used:
> 
> * OMP_NUM_THREADS=1, mpirun -np 1 
> 
>    *****************************************************
>    *** ERROR in cp_fm_syevd_base (MODULE cp_fm_diag) ***
>    *****************************************************
> 
>    *** Matrix diagonalization failed ***
> 
>    *** Program stopped at line number 384 of MODULE cp_fm_diag ***
> 
>    ===== Routine Calling Stack ===== 
> 
>              10 cp_fm_syevd_base
>               9 cp_fm_syevd
>               8 cp_dbcsr_syevd
>               7 subspace_eigenvalues_ks_dbcsr
>               6 prepare_preconditioner
>               5 init_scf_loop
>               4 scf_env_do_scf
>               3 qs_energies_scf
>               2 qs_forces
>               1 CP2K
> 
> * OMP_NUM_THREADS=1, mpirun -np 4
> 
>    MKL ERROR: Parameter 4 was incorrect on entry to DLASCL
>    {    1,    1}:  On entry to 
>    DSTEQR parameter number   -3 had an illegal value 
>    MKL ERROR: Parameter 5 was incorrect on entry to DLASCL
>    {    0,    0}:  On entry to 
>    DSTEQR parameter number   -3 had an illegal value 
> 
>  I had this one before and the reason was that the input geometry + used basis
>  caused NaNs which were apparently passed to a scalapack call. However, the
>  input is OK and works with a pure-MPI build. Therefore, I guess that the OMP
>  code calculates something wrong. Then in the case of 1 core, a serial lapack
>  call fails, while in the parallel case, a scalapack call does.
> 
> * OMP_NUM_THREADS=4, mpirun -np 1
> 
>    Output hangs at "Extrapolation method: initial_guess", only one MPI task is
>    running, but no threads.
> 
> I wanted to blame the MPI library, but Intel MPI says it supports
> MPI_THREAD_FUNNELED. The same happens if I only link fftw3, not
> fftw3_threads, so it's probably not fftw, either. So am I linking some
> libraries wrong (in which case the problem is probably completely
> trivial and I just don't see it)?
> 
> Thank you for your help!
> 
> best,
> Steve
> 
> -- 
> Steve Schmerler
> Institut für Theoretische Physik
> TU Freiberg, Germany
> 

