Compilations with Intel (XE 2013) for CP2K-trunk (2.7dev) & regtests errors

Rolf David rolf.d... at gmail.com
Wed Jun 10 10:38:25 UTC 2015


Hi all

I've encountered several problems compiling CP2K (trunk-rev-15402, popt) 
with the Intel compiler/MPI/MKL (icc/ifort 14.0.2, MPI 4.1 Update 2, 
MKL 11.1.2).

First my "out of the box" arch file (libint is 1.1.4, libxc 2.0.1):
 

CC       = mpiicc
CPP      =
FC       = mpiifort
LD       = mpiifort
AR       = xiar -r
DFLAGS   = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK -D__MKL \
           -D__FFTW3 -D__LIBINT -D__LIBXC2
CPPFLAGS =
FCFLAGS  = $(DFLAGS) $(INC) -O3 -axAVX -xSSE4.2 -heap-arrays 64 -funroll-loops \
           -fpp -free
FCFLAGS2 = $(DFLAGS) $(INC) -O1 -axAVX -xSSE4.2 -heap-arrays 64 -fpp -free
LDFLAGS  = $(FCFLAGS)
LIBS = -L$(MKL_LIB) -Wl,-rpath,$(MKL_LIB) \
        -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_intel_lp64 \
        -lmkl_sequential -lmkl_core \
        $(FFTW_LIB)/libfftw3xf_intel.a \
        $(LIBINT_LIB)/libderiv.a $(LIBINT_LIB)/libint.a -lstdc++ \
        $(LIBXC_LIB)/libxc.a \
        -lpthread -lm
OBJECTS_ARCHITECTURE = machine_intel.o
graphcon.o: graphcon.F
        $(FC) -c $(FCFLAGS2) $<
# Compiled with FCFLAGS2 (-O1) to avoid a segfault, e.g. with HF exchange
qs_vxc_atom.o: qs_vxc_atom.F
        $(FC) -c $(FCFLAGS2) $<

We call this build test-O3:
 

Number of FAILED  tests 56
Number of WRONG   tests 18
Number of CORRECT tests 2559
Number of NEW     tests 16
Total number of   tests 2649
GREPME 56 18 2559 16 2649 X


Most of the failed tests are in Fist (regtest-5, -12, -pol, -6, -15, -1-3, 
-4, -1-2, -2, -8, -9, -11), plus QS/regtest-ot/H2-BECKE-MD.inp, 
QMMM/SE/regtest/mol_CSVR_gen*.inp and 
QMMM/SE/regtest_2/water_g3x3_excl_*m.inp.


If I do the same with -O2 instead of -O3 (test-O2):


Number of FAILED  tests 0
Number of WRONG   tests 17
Number of CORRECT tests 2616
Number of NEW     tests 16
Total number of   tests 2649
GREPME 0 17 2616 16 2649 X


So I assume some more files have to be compiled with -O1 (on top of the two 
already at -O1), since the FAILED tests under -O3 are segfaults.

And 10 of the WRONG results are "unacceptable", i.e. off by more than one 
order of magnitude: with a tolerance of 1e-14, I consider a relative error 
of 1e-13 OK, but not 1e-12.
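
To make that rule of thumb explicit, here is a minimal Python sketch. The 
function names and the "one order of magnitude" threshold are only my own 
reading of the rule above; none of this is part of the CP2K regtest scripts.

```python
import math

def orders_above_tolerance(rel_error, tolerance):
    """Return by how many orders of magnitude rel_error exceeds tolerance."""
    return math.log10(rel_error / tolerance)

def is_unacceptable(rel_error, tolerance, max_orders=1.0):
    """Assumed rule: up to one order above the tolerance is still OK.

    The small epsilon keeps exact powers of ten (e.g. 1e-13 vs 1e-14)
    from being misclassified because of floating-point rounding.
    """
    return orders_above_tolerance(rel_error, tolerance) > max_orders + 1e-9

if __name__ == "__main__":
    # (relative error, numerical tolerance) pairs
    for err, tol in [(1e-13, 1e-14), (1e-12, 1e-14), (2.21900214e-06, 1.0e-14)]:
        verdict = "unacceptable" if is_unacceptable(err, tol) else "ok"
        print(f"{err:.3e} vs {tol:.1e}: {verdict}")
```

The threshold is a parameter (max_orders), so a stricter or looser rule can 
be tried without touching the rest.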


And with -O1 instead of -O3 (test-O1):


Number of FAILED  tests 0
Number of WRONG   tests 81
Number of CORRECT tests 2568
Number of NEW     tests 0
Total number of   tests 2649
GREPME 0 81 2568 0 2649 X


More WRONG tests (9 are "unacceptable", but they are different tests from 
the -O2 ones).



I've also tried -O2 on everything and -O1 on two files, as hinted by Iain 
Bethune in 
https://groups.google.com/forum/#!searchin/cp2k/intel$20$20after$3A2014$2F01$2F01/cp2k/YZ3gVI-6Au0/uJZC8QKSzxUJ 
(test-IB):


Number of FAILED  tests 166
Number of WRONG   tests 16
Number of CORRECT tests 2467
Number of NEW     tests 0
Total number of   tests 2649
GREPME 166 16 2467 0 2649 X


This setup is worse than the previous -O2/-O1 setup. I assume the hint was 
only valid for 2.5.1, as in that post.


I also tried the arch file from 
http://support.euforia-project.eu/phi/popt/regtest-arch, but without 
-D__HAS_smm_dnn -D__HAS_LIBGRID (test-EPCC):


Number of FAILED  tests 159
Number of WRONG   tests 38
Number of CORRECT tests 2436
Number of NEW     tests 16
Total number of   tests 2649
GREPME 159 38 2436 16 2649 X


Many more FAILED tests: is this the influence of LIBGRID/smm_dnn? Or maybe 
the files compiled with -O1 aren't shown in that arch file. Or it is because 
it isn't the same compiler (XE 2015 vs XE 2013).
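
To compare the five builds side by side, the GREPME summary lines above can 
be parsed with a few lines of Python. This is only a sketch: the function 
names are my own (not part of the CP2K regtest tooling), and the field order 
FAILED/WRONG/CORRECT/NEW/total is assumed from the summary blocks above.

```python
# Sketch only: names are illustrative, not part of CP2K's regtest scripts.
# Assumed field order, read off the summary blocks: FAILED WRONG CORRECT NEW total.
FIELDS = ("failed", "wrong", "correct", "new", "total")

# GREPME lines copied from the five builds described in this mail.
RUNS = {
    "test-O3": "GREPME 56 18 2559 16 2649 X",
    "test-O2": "GREPME 0 17 2616 16 2649 X",
    "test-O1": "GREPME 0 81 2568 0 2649 X",
    "test-IB": "GREPME 166 16 2467 0 2649 X",
    "test-EPCC": "GREPME 159 38 2436 16 2649 X",
}

def parse_grepme(line):
    """Parse a 'GREPME f w c n t X' line (optionally '>'-quoted) into a dict."""
    tokens = line.replace(">", " ").split()
    if "GREPME" not in tokens:
        raise ValueError(f"not a GREPME line: {line!r}")
    start = tokens.index("GREPME") + 1
    values = [int(tok) for tok in tokens[start:start + len(FIELDS)]]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} counts in: {line!r}")
    return dict(zip(FIELDS, values))

def is_consistent(counts):
    """Check that FAILED + WRONG + CORRECT + NEW adds up to the total."""
    parts = counts["failed"] + counts["wrong"] + counts["correct"] + counts["new"]
    return parts == counts["total"]

def rank_runs(runs):
    """Sort build names by (failed, wrong), fewest problems first."""
    parsed = {name: parse_grepme(line) for name, line in runs.items()}
    return sorted(parsed, key=lambda n: (parsed[n]["failed"], parsed[n]["wrong"]))

if __name__ == "__main__":
    for name in rank_runs(RUNS):
        c = parse_grepme(RUNS[name])
        print(f"{name:10s} failed={c['failed']:4d} wrong={c['wrong']:3d} "
              f"consistent={is_consistent(c)}")
```

Ranking by FAILED first and WRONG second is just one possible choice; it 
matches the goal of having no FAILED tests before looking at anything else.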



So I have some questions. Our first goal is to have no FAILED tests while 
keeping the best speed. -O1 is clearly slower, and maybe the difference 
between -O3 and -O2 is next to nothing, but our cluster is small and we need 
to push it to the limit, so we went for -O3 first.


- Is something wrong in our arch file?

- Has someone managed to compile with -O3 (or -O2), with some files at -O1, 
using the Intel compiler 2013 (14.0.x versions) and without big errors? I 
deduced that graphcon.F and qs_vxc_atom.F must be compiled with -O1, but 
maybe others need it too, or some need -O2.

- -O2 vs -O3?

- What can I do to find out what's wrong in the FAILED/segfault cases? I 
could add -traceback -g, but what do I look for (I'm no expert!)? Also, is 
there an easy way to know which 'file.F' sources are involved in each 
regtest?


- I also noticed that the big errors differ depending on -O3/-O2/-O1 (the 
first three arch files I used). Given that, can I assume that there is 
nothing wrong with libint/libxc/MKL, and that it is just the -O flags? The 
details:


-O3 + -O1 on graphcon.F and qs_vxc_atom.F (test-O3)

NEB/regtest-1/2gly_EB-NEB.inp.out
NEB/regtest-2/2gly_DIIS-SM.inp.out
NEB/regtest-2/2gly_DIIS-DNEB.inp.out
NEB/regtest-2/2gly_DIIS-NEB.inp.out
  relative error : 2e-02 > numerical tolerance = 8e-12/-11/-13

Fist/regtest-3/water_2_TS_CG.inp.out
  relative error : 2.21900214e-06 > numerical tolerance = 1.0E-14

QS/regtest-ri-mp2/opt_basis_O_auto_gen.inp.out
  relative error : 6.54370492e-02 > numerical tolerance = 1e-04

QS/regtest-almo-2/FH-chain.inp.out
  relative error : 2.00884032e-10 > numerical tolerance = 1e-13

QS/regtest-almo-1/almo-x.inp.out
QS/regtest-almo-1/almo-guess.inp.out
QS/regtest-almo-1/almo-scf.inp.out
  relative error : 6e-12 > numerical tolerance = 4/7/8e-14

SE/regtest-3-4/Al2O3.inp.out
  relative error : 2.51373362e-05 > numerical tolerance = 6e-14


-O2 + -O1 on graphcon.F and qs_vxc_atom.F (---> Same errors as test-O3) 
(test-O2)

NEB/regtest-1/2gly_EB-NEB.inp.out
NEB/regtest-2/2gly_DIIS-SM.inp.out
NEB/regtest-2/2gly_DIIS-DNEB.inp.out
NEB/regtest-2/2gly_DIIS-NEB.inp.out
  relative error : 2e-02 > numerical tolerance = 8e-12/-11/-13

Fist/regtest-3/water_2_TS_CG.inp.out
  relative error : 2.21900214e-06 > numerical tolerance = 1.0E-14

QS/regtest-ri-mp2/opt_basis_O_auto_gen.inp.out
  relative error : 6.54370492e-02 > numerical tolerance = 1e-04

QS/regtest-almo-2/FH-chain.inp.out
  relative error : 2.00884032e-10 > numerical tolerance = 1e-13

QS/regtest-almo-1/almo-x.inp.out
QS/regtest-almo-1/almo-guess.inp.out
QS/regtest-almo-1/almo-scf.inp.out
  relative error : 6e-12 > numerical tolerance = 4/7/8e-14

SE/regtest-3-4/Al2O3.inp.out
  relative error : 2.51373362e-05 > numerical tolerance = 6e-14


-O1 on all (---> Different errors from test-O3/-O2) (test-O1)

QS/regtest-ps-implicit-1-3/Ar_mixed_planar.inp.out
  relative error : 1.02615640e-09 > numerical tolerance = 1e-12

QS/regtest-ps-implicit-2-2/H2O_mixed_periodic_planar.inp.out
  relative error : 3.64315287e-07 > numerical tolerance = 1e-12

QS/regtest-ps-implicit-2-3/H2O_mixed_periodic_cylindrical.inp.out
  relative error : 3.99727816e-07 > numerical tolerance = 1e-12

QS/regtest-ps-implicit-1-2/Ar_mixed_periodic_planar.inp.out
  relative error : 1.63406231e-06 > numerical tolerance = 1e-12

QS/regtest-admm-4/MD-1.inp.out
  relative error : 6.79583221e-11 > numerical tolerance = 7e-13

QS/regtest-admm-4/MD-2_no_OT.inp.out
  relative error : 1.05116397e-11 > numerical tolerance = 1.0E-14

Fist/regtest-3/2d_pot.inp.out
  relative error : 2.56003763e-01 > numerical tolerance = 5e-06

Fist/regtest-1-2/deca_ala_reftraj.inp.out
  relative error : 5.45234681e-12 > numerical tolerance = 1.0E-14

Fist/regtest-4/H2O-meta-combine.inp.out
  relative error : 2.41671120e-02 > numerical tolerance = 1.0E-14
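
To check how much the -O3/-O2 list and the -O1 list above overlap, a small 
Python sketch with the test names typed in by hand from this mail (the 
variable and function names are my own, purely for illustration):

```python
import posixpath

# WRONG tests with large errors under test-O3 and test-O2 (same list for both).
LARGE_O3_O2 = {
    "NEB/regtest-1/2gly_EB-NEB.inp.out",
    "NEB/regtest-2/2gly_DIIS-SM.inp.out",
    "NEB/regtest-2/2gly_DIIS-DNEB.inp.out",
    "NEB/regtest-2/2gly_DIIS-NEB.inp.out",
    "Fist/regtest-3/water_2_TS_CG.inp.out",
    "QS/regtest-ri-mp2/opt_basis_O_auto_gen.inp.out",
    "QS/regtest-almo-2/FH-chain.inp.out",
    "QS/regtest-almo-1/almo-x.inp.out",
    "QS/regtest-almo-1/almo-guess.inp.out",
    "QS/regtest-almo-1/almo-scf.inp.out",
    "SE/regtest-3-4/Al2O3.inp.out",
}

# WRONG tests with large errors under test-O1.
LARGE_O1 = {
    "QS/regtest-ps-implicit-1-3/Ar_mixed_planar.inp.out",
    "QS/regtest-ps-implicit-2-2/H2O_mixed_periodic_planar.inp.out",
    "QS/regtest-ps-implicit-2-3/H2O_mixed_periodic_cylindrical.inp.out",
    "QS/regtest-ps-implicit-1-2/Ar_mixed_periodic_planar.inp.out",
    "QS/regtest-admm-4/MD-1.inp.out",
    "QS/regtest-admm-4/MD-2_no_OT.inp.out",
    "Fist/regtest-3/2d_pot.inp.out",
    "Fist/regtest-1-2/deca_ala_reftraj.inp.out",
    "Fist/regtest-4/H2O-meta-combine.inp.out",
}

def shared_tests(a, b):
    """Tests that appear in both lists."""
    return a & b

def shared_dirs(a, b):
    """Regtest directories that have at least one entry in both lists."""
    return {posixpath.dirname(p) for p in a} & {posixpath.dirname(p) for p in b}

if __name__ == "__main__":
    print("tests in both lists:", sorted(shared_tests(LARGE_O3_O2, LARGE_O1)))
    print("dirs in both lists: ", sorted(shared_dirs(LARGE_O3_O2, LARGE_O1)))
```

With the lists as typed here, no test appears in both, and the only regtest 
directory the two lists share is Fist/regtest-3.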



Any help/hint/info/experience will be well received.


We also have gcc/gfortran on the cluster. Is Intel faster for CP2K, or 
roughly the same as GCC?


Thank you for your time if you've read all this !


Kind regards,


Rolf David
