building hybrid MPI+OpenMP
Steve Schmerler
elco... at gmail.com
Mon Jul 8 15:09:46 UTC 2013
Hello
I'm trying to compile a hybrid MPI+OpenMP version of CP2K with
* ifort 12.1
* mkl 10.3
* intel MPI 4.0.3
* fftw 3.3.3 threaded
* scalapack from mkl
The arch file:
-----------------------------------------------------------------------
MKL_LIB=$(MKLROOT)/lib/intel64
MKL_INC=$(MKLROOT)/include
FFTW_LIB=/home/schmerler/soft/lib/fftw/intel/3.3.3/lib
FFTW_INC=/home/schmerler/soft/lib/fftw/intel/3.3.3/include
CC = cc
CPP =
FC = mpiifort
LD = mpiifort
AR = ar -r
DFLAGS = -D__INTEL -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3
CPPFLAGS =
FCFLAGS = $(DFLAGS) -O2 -free -heap-arrays 64 -funroll-loops -fpp -axAVX \
-openmp -mt_mpi -I$(FFTW_INC)
FCFLAGS2 = $(DFLAGS) -O1 -free
LIBS = -lfftw3_threads -lfftw3 -liomp5 \
          -lmkl_blacs_intelmpi_lp64 -lmkl_scalapack_lp64 \
          -lmkl_intel_lp64 -lmkl_core -lmkl_sequential
LDFLAGS = $(FCFLAGS) -L$(FFTW_LIB) -I$(FFTW_INC) -L$(MKL_LIB) -I$(MKL_INC) $(LIBS)
OBJECTS_ARCHITECTURE = machine_intel.o
graphcon.o: graphcon.F
	$(FC) -c $(FCFLAGS2) $<
-----------------------------------------------------------------------
I see different errors depending on which combination of MPI tasks and threads
is used:
* OMP_NUM_THREADS=1, mpirun -np 1
*****************************************************
*** ERROR in cp_fm_syevd_base (MODULE cp_fm_diag) ***
*****************************************************
*** Matrix diagonalization failed ***
*** Program stopped at line number 384 of MODULE cp_fm_diag ***
===== Routine Calling Stack =====
10 cp_fm_syevd_base
9 cp_fm_syevd
8 cp_dbcsr_syevd
7 subspace_eigenvalues_ks_dbcsr
6 prepare_preconditioner
5 init_scf_loop
4 scf_env_do_scf
3 qs_energies_scf
2 qs_forces
1 CP2K
* OMP_NUM_THREADS=1, mpirun -np 4
MKL ERROR: Parameter 4 was incorrect on entry to DLASCL
{ 1, 1}: On entry to DSTEQR parameter number -3 had an illegal value
MKL ERROR: Parameter 5 was incorrect on entry to DLASCL
{ 0, 0}: On entry to DSTEQR parameter number -3 had an illegal value
I have seen this error before; back then the cause was that the input geometry
plus the chosen basis produced NaNs, which were apparently passed on to a
ScaLAPACK call. However, this input is fine and runs with a pure-MPI build, so
I suspect that the OpenMP code computes something incorrectly. Then with one
MPI task a serial LAPACK call fails, while in the parallel case a ScaLAPACK
call does.
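To illustrate the kind of check I mean, here is a standalone sketch (not CP2K
code; the file name is made up and DSYEVD only stands in for whatever
cp_fm_syevd calls underneath): scan the matrix for NaNs before the eigensolver
and inspect LAPACK's INFO afterwards.
-----------------------------------------------------------------------
! nan_diag_check.f90 -- standalone sketch, not CP2K code.
! Build e.g.: ifort -mkl nan_diag_check.f90
program nan_diag_check
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan, ieee_value, &
                                           ieee_quiet_nan
  implicit none
  integer, parameter :: n = 3
  real(8) :: a(n, n), w(n)
  real(8), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: lwork, liwork, info, i

  ! Symmetric test matrix; uncomment the marked line to inject a NaN and
  ! watch the solver fail instead of returning clean eigenvalues.
  a = 1.0d0
  do i = 1, n
     a(i, i) = real(i, 8)
  end do
  ! a(2, 1) = ieee_value(a(2, 1), ieee_quiet_nan)   ! <-- inject a NaN

  if (any(ieee_is_nan(a))) print *, 'matrix contains NaNs before the LAPACK call'

  ! Workspace query, then the actual diagonalization.
  lwork = -1; liwork = -1
  allocate(work(1), iwork(1))
  call dsyevd('V', 'U', n, a, n, w, work, lwork, iwork, liwork, info)
  lwork = int(work(1)); liwork = iwork(1)
  deallocate(work, iwork); allocate(work(lwork), iwork(liwork))
  call dsyevd('V', 'U', n, a, n, w, work, lwork, iwork, liwork, info)
  print *, 'dsyevd info =', info, '  eigenvalues =', w
end program nan_diag_check
-----------------------------------------------------------------------
This is only meant to show how a bad matrix can surface as this kind of
solver error rather than as an obvious NaN in the output.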
* OMP_NUM_THREADS=4, mpirun -np 1
The output hangs at "Extrapolation method: initial_guess"; only one MPI task is
running, and no threads are spawned.
I was inclined to blame the MPI library, but Intel MPI states that it supports
MPI_THREAD_FUNNELED. The same thing happens if I link only fftw3 rather than
fftw3_threads, so it is probably not FFTW either. Am I linking some libraries
incorrectly (in which case the problem is probably completely trivial and I
just don't see it)?
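To at least rule out the basics, here is a tiny hybrid test of my own
(hybrid_check.f90 is just a made-up name, not CP2K code): it asks the MPI
library at runtime which thread level it actually provides and counts the
OpenMP threads per rank, built with the same mpiifort/-openmp/-mt_mpi
combination as above.
-----------------------------------------------------------------------
! hybrid_check.f90 -- standalone sketch: verify the provided MPI thread
! level and OpenMP thread spawning with the same toolchain as above,
! e.g.: mpiifort -openmp -mt_mpi hybrid_check.f90 && mpirun -np 2 ./a.out
program hybrid_check
  use mpi
  use omp_lib
  implicit none
  integer :: required, provided, rank, nranks, ierr, nthreads

  required = MPI_THREAD_FUNNELED
  call MPI_Init_thread(required, provided, ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)

  ! Count the threads that actually start inside a parallel region.
  nthreads = 0
!$omp parallel
!$omp master
  nthreads = omp_get_num_threads()
!$omp end master
!$omp end parallel

  print *, 'rank', rank, 'of', nranks, ': required =', required, &
           ' provided =', provided, ' omp threads =', nthreads

  call MPI_Finalize(ierr)
end program hybrid_check
-----------------------------------------------------------------------
If provided comes back below MPI_THREAD_FUNNELED, or the thread count stays at
1, that would at least narrow the problem down to the MPI/OpenMP setup rather
than to CP2K itself.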
Thank you for your help!
best,
Steve
--
Steve Schmerler
Institut für Theoretische Physik
TU Freiberg, Germany