Compilations with Intel (XE 2013) for CP2K-trunk (2.7dev) & regtests errors
Rolf David
rolf.d... at gmail.com
Wed Jun 10 10:38:25 UTC 2015
Hi all
I've encountered several problems compiling CP2K (trunk, rev 15402, popt)
with the Intel compiler/MPI/MKL (icc/ifort 14.0.2, MPI 4.1 Update 2,
MKL 11.1.2).
First, my "out of the box" arch file (libint is 1.1.4, libxc 2.0.1):
CC = mpiicc
CPP =
FC = mpiifort
LD = mpiifort
AR = xiar -r
DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK -D__MKL \
         -D__FFTW3 -D__LIBINT -D__LIBXC2
CPPFLAGS =
FCFLAGS = $(DFLAGS) $(INC) -O3 -axAVX -xSSE4.2 -heap-arrays 64 -funroll-loops \
          -fpp -free
FCFLAGS2 = $(DFLAGS) $(INC) -O1 -axAVX -xSSE4.2 -heap-arrays 64 -fpp -free
LDFLAGS = $(FCFLAGS)
LIBS = -L$(MKL_LIB) -Wl,-rpath,$(MKL_LIB) \
-lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_intel_lp64 \
-lmkl_sequential -lmkl_core \
$(FFTW_LIB)/libfftw3xf_intel.a \
$(LIBINT_LIB)/libderiv.a $(LIBINT_LIB)/libint.a -lstdc++ \
$(LIBXC_LIB)/libxc.a \
-lpthread -lm
OBJECTS_ARCHITECTURE = machine_intel.o
graphcon.o: graphcon.F
	$(FC) -c $(FCFLAGS2) $<
# In order to avoid segv when HF exchange for example
qs_vxc_atom.o: qs_vxc_atom.F
	$(FC) -c $(FCFLAGS2) $<
We call this build test-O3:
Number of FAILED tests 56
Number of WRONG tests 18
Number of CORRECT tests 2559
Number of NEW tests 16
Total number of tests 2649
GREPME 56 18 2559 16 2649 X
Most of the FAILED tests are in the Fist regtests (regtest-5, -12, -pol, -6,
-15, -1-3, -4, -1-2, -2, -8, -9, -11), plus QS/regtest-ot/H2-BECKE-MD.inp,
QMMM/SE/regtest/mol_CSVR_gen*.inp and
QMMM/SE/regtest_2/water_g3x3_excl_*m.inp.
If I do the same with -O2 instead of -O3 (test-O2):
Number of FAILED tests 0
Number of WRONG tests 17
Number of CORRECT tests 2616
Number of NEW tests 16
Total number of tests 2649
GREPME 0 17 2616 16 2649 X
So I assume some more files have to be compiled with -O1 (on top of the two
already at -O1), since the FAILED tests at -O3 are segfaults.
Of the WRONG tests, 10 are "unacceptable", i.e. off by more than one order of
magnitude: with a tolerance of 1e-14 I consider a relative error of 1e-13 OK,
but not 1e-12.
And with -O1 instead of -O3 (test-O1):
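The "one order of magnitude" rule of thumb above can be written down explicitly. This is only a sketch of my own criterion, not something the regtest script does; the function name is_acceptable and the default factor of 10 are illustrative assumptions. Decimal is used so that the borderline case (exactly ten times the tolerance) is not affected by floating-point rounding.

```python
from decimal import Decimal


def is_acceptable(rel_error, tolerance, factor=10):
    """Return True if rel_error is at most `factor` times the tolerance.

    Values are passed as strings (e.g. "1e-13") and compared as Decimals,
    so the exact boundary rel_error == factor * tolerance is handled exactly.
    `factor=10` encodes "within one order of magnitude" (an assumption).
    """
    return Decimal(rel_error) <= Decimal(tolerance) * factor


if __name__ == "__main__":
    # The two cases mentioned in the text, for a tolerance of 1e-14.
    for rel in ("1e-13", "1e-12"):
        verdict = "OK" if is_acceptable(rel, "1e-14") else "unacceptable"
        print("rel error %s vs tolerance 1e-14: %s" % (rel, verdict))
```

With this rule a WRONG test is only counted as a real problem when its relative error exceeds ten times the stated tolerance.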
Number of FAILED tests 0
Number of WRONG tests 81
Number of CORRECT tests 2568
Number of NEW tests 0
Total number of tests 2649
GREPME 0 81 2568 0 2649 X
More WRONG tests (9 are "unacceptable", but they are different ones from -O2).
I also tried -O2 on everything and -O1 on two files, as hinted by Iain
Bethune in
https://groups.google.com/forum/#!searchin/cp2k/intel$20$20after$3A2014$2F01$2F01/cp2k/YZ3gVI-6Au0/uJZC8QKSzxUJ
(test-IB):
Number of FAILED tests 166
Number of WRONG tests 16
Number of CORRECT tests 2467
Number of NEW tests 0
Total number of tests 2649
GREPME 166 16 2467 0 2649 X
This setup is worse than the previous one with -O2 and the two -O1 files. I
assume it was only valid for 2.5.1, as in the post.
I also tried the arch file from
http://support.euforia-project.eu/phi/popt/regtest-arch, but without
-D__HAS_smm_dnn -D__HAS_LIBGRID (test-EPCC):
Number of FAILED tests 159
Number of WRONG tests 38
Number of CORRECT tests 2436
Number of NEW tests 16
Total number of tests 2649
GREPME 159 38 2436 16 2649 X
Many more FAILED tests: is this the influence of LIBGRID/smm_dnn? Or maybe
the files compiled at -O1 aren't shown in that arch file. Or it is because it
isn't the same compiler (XE 2015 vs XE 2013).
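To compare the five builds side by side, the GREPME lines quoted above can be tabulated. This is a minimal sketch that uses only the numbers from this message; the helper name parse_grepme and the field names are illustrative choices, not part of the regtest script.

```python
# GREPME lines copied from the regtest summaries quoted in this message.
GREPME = {
    "test-O3": "GREPME 56 18 2559 16 2649 X",
    "test-O2": "GREPME 0 17 2616 16 2649 X",
    "test-O1": "GREPME 0 81 2568 0 2649 X",
    "test-IB": "GREPME 166 16 2467 0 2649 X",
    "test-EPCC": "GREPME 159 38 2436 16 2649 X",
}

# Column order as printed in the summaries: FAILED WRONG CORRECT NEW TOTAL.
FIELDS = ("failed", "wrong", "correct", "new", "total")


def parse_grepme(line):
    """Turn a 'GREPME f w c n t X' line into a dict of integer counts."""
    parts = line.split()
    if len(parts) < 6 or parts[0] != "GREPME":
        raise ValueError("not a GREPME line: %r" % line)
    return dict(zip(FIELDS, (int(p) for p in parts[1:6])))


results = {name: parse_grepme(line) for name, line in GREPME.items()}

# Builds without any FAILED test, ordered by their number of WRONG tests.
no_failed = sorted(
    (name for name, r in results.items() if r["failed"] == 0),
    key=lambda name: results[name]["wrong"],
)

if __name__ == "__main__":
    print("%-10s %6s %6s %8s %4s %6s" % (("build",) + FIELDS))
    for name, r in results.items():
        row = (name,) + tuple(r[f] for f in FIELDS)
        print("%-10s %6d %6d %8d %4d %6d" % row)
    print("no FAILED tests:", ", ".join(no_failed))
```

Only test-O2 and test-O1 come out with zero FAILED tests, and test-O2 has the fewer WRONG tests of the two.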
So I have some questions. The first goal is to have no FAILED tests while
keeping the best speed: -O1 is clearly slower, and maybe the difference
between -O3 and -O2 is next to nothing, but our cluster is small and we need
to push it to the limit, so we went for -O3 first.
-Is something wrong in our arch file?
-Has anyone managed to compile at -O3 (or -O2), with some files at -O1, using
the Intel compiler 2013 (14.0.x versions) without big errors? I deduced that
graphcon.F and qs_vxc_atom.F must be compiled at -O1, but maybe other files
need -O1 too, or -O2.
-Which is preferable, -O2 or -O3?
-How can I find out what goes wrong in the FAILED (segfault) tests? I could
compile with -traceback -g, but what should I look for (I'm no expert!)? Or
is there an easy way to know which 'file.F' sources are exercised by each
regtest?
-I also noticed that the tests with big errors differ between -O3/-O2 and -O1
(the first three arch files I used). Given that, can I assume there is
nothing wrong with libint/libxc/MKL, and that it is only the -O flags? The
tests in question are:
-O3, with -O1 on graphcon.F and qs_vxc_atom.F (test-O3):
NEB/regtest-1/2gly_EB-NEB.inp.out
NEB/regtest-2/2gly_DIIS-SM.inp.out
NEB/regtest-2/2gly_DIIS-DNEB.inp.out
NEB/regtest-2/2gly_DIIS-NEB.inp.out
relative error : 2e-02 > numerical tolerance = 8e-12/-11/-13
Fist/regtest-3/water_2_TS_CG.inp.out
relative error : 2.21900214e-06 > numerical tolerance = 1.0E-14
QS/regtest-ri-mp2/opt_basis_O_auto_gen.inp.out
relative error : 6.54370492e-02 > numerical tolerance = 1e-04
QS/regtest-almo-2/FH-chain.inp.out
relative error : 2.00884032e-10 > numerical tolerance = 1e-13
QS/regtest-almo-1/almo-x.inp.out
QS/regtest-almo-1/almo-guess.inp.out
QS/regtest-almo-1/almo-scf.inp.out
relative error : 6e-12 > numerical tolerance = 4/7/8e-14
SE/regtest-3-4/Al2O3.inp.out
relative error : 2.51373362e-05 > numerical tolerance = 6e-14
-O2, with -O1 on graphcon.F and qs_vxc_atom.F (test-O2), same errors as
test-O3:
NEB/regtest-1/2gly_EB-NEB.inp.out
NEB/regtest-2/2gly_DIIS-SM.inp.out
NEB/regtest-2/2gly_DIIS-DNEB.inp.out
NEB/regtest-2/2gly_DIIS-NEB.inp.out
relative error : 2e-02 > numerical tolerance = 8e-12/-11/-13
Fist/regtest-3/water_2_TS_CG.inp.out :
relative error : 2.21900214e-06 > numerical tolerance = 1.0E-14
QS/regtest-ri-mp2/opt_basis_O_auto_gen.inp.out
relative error : 6.54370492e-02 > numerical tolerance = 1e-04
QS/regtest-almo-2/FH-chain.inp.out
relative error : 2.00884032e-10 > numerical tolerance = 1e-13
QS/regtest-almo-1/almo-x.inp.out
QS/regtest-almo-1/almo-guess.inp.out
QS/regtest-almo-1/almo-scf.inp.out
relative error : 6e-12 > numerical tolerance = 4/7/8e-14
SE/regtest-3-4/Al2O3.inp.out
relative error : 2.51373362e-05 > numerical tolerance = 6e-14
-O1 on all files (test-O1), different errors from test-O3/-O2:
QS/regtest-ps-implicit-1-3/Ar_mixed_planar.inp.out
relative error : 1.02615640e-09 > numerical tolerance = 1e-12
QS/regtest-ps-implicit-2-2/H2O_mixed_periodic_planar.inp.out :
relative error : 3.64315287e-07 > numerical tolerance = 1e-12
QS/regtest-ps-implicit-2-3/H2O_mixed_periodic_cylindrical.inp.out :
relative error : 3.99727816e-07 > numerical tolerance = 1e-12
QS/regtest-ps-implicit-1-2/Ar_mixed_periodic_planar.inp.out
relative error : 1.63406231e-06 > numerical tolerance = 1e-12
QS/regtest-admm-4/MD-1.inp.out
relative error : 6.79583221e-11 > numerical tolerance = 7e-13
QS/regtest-admm-4/MD-2_no_OT.inp.out
relative error : 1.05116397e-11 > numerical tolerance = 1.0E-14
Fist/regtest-3/2d_pot.inp.out
relative error : 2.56003763e-01 > numerical tolerance = 5e-06
Fist/regtest-1-2/deca_ala_reftraj.inp.out
relative error : 5.45234681e-12 > numerical tolerance = 1.0E-14
Fist/regtest-4/H2O-meta-combine.inp.out
relative error : 2.41671120e-02 > numerical tolerance = 1.0E-14
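The three lists above can be compared mechanically to see which tests are shared between builds. The sketch below only uses the test names listed in this message; the variable names and the by_suite helper are illustrative, and the -O2 set is simply a copy of the -O3 set because the two lists above are identical.

```python
from collections import Counter

# Tests with large errors, as listed above for each build.
O3 = {
    "NEB/regtest-1/2gly_EB-NEB.inp",
    "NEB/regtest-2/2gly_DIIS-SM.inp",
    "NEB/regtest-2/2gly_DIIS-DNEB.inp",
    "NEB/regtest-2/2gly_DIIS-NEB.inp",
    "Fist/regtest-3/water_2_TS_CG.inp",
    "QS/regtest-ri-mp2/opt_basis_O_auto_gen.inp",
    "QS/regtest-almo-2/FH-chain.inp",
    "QS/regtest-almo-1/almo-x.inp",
    "QS/regtest-almo-1/almo-guess.inp",
    "QS/regtest-almo-1/almo-scf.inp",
    "SE/regtest-3-4/Al2O3.inp",
}
O2 = set(O3)  # the -O2 list above repeats the -O3 list exactly
O1 = {
    "QS/regtest-ps-implicit-1-3/Ar_mixed_planar.inp",
    "QS/regtest-ps-implicit-2-2/H2O_mixed_periodic_planar.inp",
    "QS/regtest-ps-implicit-2-3/H2O_mixed_periodic_cylindrical.inp",
    "QS/regtest-ps-implicit-1-2/Ar_mixed_periodic_planar.inp",
    "QS/regtest-admm-4/MD-1.inp",
    "QS/regtest-admm-4/MD-2_no_OT.inp",
    "Fist/regtest-3/2d_pot.inp",
    "Fist/regtest-1-2/deca_ala_reftraj.inp",
    "Fist/regtest-4/H2O-meta-combine.inp",
}


def by_suite(tests):
    """Count tests per top-level regtest directory (NEB, Fist, QS, SE)."""
    return dict(Counter(name.split("/", 1)[0] for name in tests))


shared_O3_O1 = O3 & O1  # tests that are wrong in both builds

if __name__ == "__main__":
    print("-O3 per suite:", by_suite(O3))
    print("-O1 per suite:", by_suite(O1))
    print("shared between -O3 and -O1:", sorted(shared_O3_O1) or "none")
```

No test appears in both the -O3/-O2 list and the -O1 list, which is what makes me suspect the optimisation flags rather than the libraries.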
Any help/hint/info/experience will be well received.
We also have gcc/gfortran on the cluster. Is Intel faster for CP2K, or
roughly the same as GCC?
Thank you for your time if you've read all this !
Kind regards,
Rolf David