MPI_wait problem in cp2k 4.1 with openmpi_2.0.0
jim wang
jimw... at gmail.com
Tue Mar 28 11:25:56 UTC 2017
*Thanks for your reply!*
Here are two arch files for cp2k-2.1 and cp2k-4.1:
*(1) cp2k-2.1: With openMPI-1.6.5, compiled with 4 cores, popt version*
CC = mpicc
CPP =
FC = mpif90
LD = mpif90
AR = ar -r
DFLAGS = -D__INTEL -D__FFTSG -D__FFTW3 -D__parallel -D__BLACS -D__SCALAPACK -D__MKL
CPPFLAGS =
MKLROOT = /public/software/compiler/intel/composer_xe_2015.2.164/mkl
INTEL_INC= /public/software/compiler/intel/composer_xe_2015.2.164/mkl/include
FFTW3_INC= /public/home/wj/Codes/fftw-3.3.4/include/
FCFLAGS = $(DFLAGS) -I$(INTEL_INC) -I$(FFTW3_INC) -O2 -msse2 -heap-arrays 64 -funroll-loops -fpp -free
FCFLAGS2 = $(DFLAGS) -I$(INTEL_INC) -I$(FFTW3_INC) -O1 -msse2 -heap-arrays 64 -fpp -free
LDFLAGS = $(FCFLAGS) -I$(INTEL_INC) -I$(FFTW3_INC)
LIBS = /public/home/wj/Codes/fftw-3.3.4/lib/libfftw3.a \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_scalapack_lp64.a \
       -Wl,--start-group \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_lp64.a \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_sequential.a \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.a \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.a \
       -Wl,--end-group -lpthread
OBJECTS_ARCHITECTURE = machine_intel.o
graphcon.o: graphcon.F
	$(FC) -c $(FCFLAGS2) $<
*(2) cp2k-4.1: With openMPI-2.0.0, compiled with 4 cores, popt version*
CC = icc
#CPP = /lib/cpp
FC = mpif90 -FR
FC_fixed = mpif90 -FI
LD = mpif90
AR = /usr/bin/ar -r
FFTW_INC=${MKLROOT}/include/fftw
INTEL_INC=${MKLROOT}/include
DFLAGS = -D__INTEL -D__FFTW3 -D__MKL -D__parallel -D__BLACS -D__SCALAPACK
CPPFLAGS = -C $(DFLAGS) -P -traditional -I${FFTW_INC} -I${INTEL_INC}
FCFLAGS = -O2 -pc64 -unroll -heap-arrays 64 -xHost -fpp -free -I${FFTW_INC} -I${INTEL_INC}
LDFLAGS = $(FCFLAGS) -L$(HOME)/lib -L${MKLROOT}/lib/intel64
LDFLAGS_C = $(FCFLAGS) -L$(HOME)/lib -L${MKLROOT}/lib/intel64 -nofor_main
#If you want to use BLACS and SCALAPACK use the libraries below
LIBS = /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_scalapack_lp64.a \
       -Wl,--start-group \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_lp64.a \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_sequential.a \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.a \
       /public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.a \
       -Wl,--end-group -lpthread \
       ${MKLROOT}/interfaces/fftw3xf/libfftw3xf_intel.a
OBJECTS_ARCHITECTURE = machine_intel.o
graphcon.o: graphcon.F
	$(FC) -c $(FCFLAGS2) $<
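For reference, both binaries were built from these arch files with a command along the following lines. This is only a sketch: the arch file name Linux-x86-64-intel is a placeholder for whatever name the file is saved under in the arch/ directory, and I assume the usual cp2k/makefiles build layout.

# build the popt binary with 4 parallel make jobs, assuming the arch file
# above is saved as arch/Linux-x86-64-intel.popt
cd cp2k/makefiles
make -j 4 ARCH=Linux-x86-64-intel VERSION=popt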
Both jobs, with cp2k-2.1 and cp2k-4.1, were run on 24 processors in a single
node, without threading or OpenMP. cp2k-2.1 was run with openMPI-1.6.5 and
cp2k-4.1 with openMPI-2.0.0, matching the compilation environment in each case.
The job is a geo_opt task for an amorphous solid system consisting of 216
atoms. Are the logs you mentioned just the output files generated by cp2k?
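For completeness, a sketch of how each job was launched (the input and output file names here are placeholders):

# 24 MPI ranks on one node, no OpenMP threading (popt binary)
export OMP_NUM_THREADS=1
mpirun -np 24 ./cp2k.popt -i geo_opt.inp -o geo_opt.out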
On Tuesday, March 28, 2017 at 4:42:20 PM UTC+8, Alfio Lazzaro wrote:
>
> Hello,
> unfortunately, it is not easy to answer this question without knowing more
> details...
> First of all, which input are you running? Could you attach it? How many
> nodes, MPI ranks, and threads are you using, and which CP2K version (PSMP or
> POPT)?
> I also assume that you are compiling the two CP2K versions with the same
> setup, i.e. the same compile options and library versions...
> Could you attach the two logs?
>
> The problem is that we should first understand where the MPI_Wait calls are
> made. It may be that CP2K 4.1 uses MPI_Wait in more places.
>
> Alfio
>
> On Monday, March 27, 2017 at 11:38:52 UTC+2, jim wang wrote:
>>
>> Hi, everybody!
>>
>> I am using cp2k 4.1 for testing on our new cluster. Strangely, the
>> results show that cp2k 4.1 is 3 to 4 times slower than the cp2k 2.1
>> version built on the same cluster. After examining the output files
>> generated by both binaries running the same job, I found that the
>> MPI_wait function may be the key problem.
>>
>> Here is the time spent in MPI_wait:
>> 1. cp2k 4.1: MPI_wait time: 1131 s, total run time: 1779 s
>> 2. cp2k 2.1: MPI_wait time: 68 s, total run time: 616 s
>>
>> How can I determine whether the problem lies with our cluster or with the
>> compilation?
>> Hope you guys can give me some hints on the version comparison.
>>
>> THANKS!!!
>>
>
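To separate a possible Open MPI 2.0.0 / cluster issue from the CP2K build itself, one option would be a small standalone test, compiled with mpicc from each of the two Open MPI installations and run with the same 24 ranks on one node: if MPI_Wait is also much slower with the 2.0.0 install, the problem is in the MPI library or its configuration rather than in CP2K. Below is a minimal sketch of such a check (not CP2K code; wait_test.c is a hypothetical file name):

/* wait_test.c: crude timing of MPI_Isend/MPI_Irecv/MPI_Wait between rank pairs.
 * Build: mpicc -O2 wait_test.c -o wait_test
 * Run:   mpirun -np 24 ./wait_test
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n = 1 << 20;              /* 8 MiB per message (doubles) */
    const int iters = 100;
    double *sendbuf = malloc(n * sizeof(double));
    double *recvbuf = malloc(n * sizeof(double));
    for (int i = 0; i < n; ++i) sendbuf[i] = (double)i;

    int partner = rank ^ 1;             /* pair ranks: 0<->1, 2<->3, ... */
    double t_wait = 0.0;

    if (partner < size) {
        for (int it = 0; it < iters; ++it) {
            MPI_Request reqs[2];
            MPI_Irecv(recvbuf, n, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD, &reqs[0]);
            MPI_Isend(sendbuf, n, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD, &reqs[1]);
            double t0 = MPI_Wtime();
            MPI_Wait(&reqs[0], MPI_STATUS_IGNORE);
            MPI_Wait(&reqs[1], MPI_STATUS_IGNORE);
            t_wait += MPI_Wtime() - t0;
        }
    }

    /* report the worst rank's accumulated MPI_Wait time */
    double t_max;
    MPI_Reduce(&t_wait, &t_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("max time in MPI_Wait over %d iterations: %.3f s\n", iters, t_max);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}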