QM/MM scaling problem
Chenghan
lch0... at gmail.com
Fri Jun 16 20:52:12 UTC 2017
Hi,
I am running QM/MM MD on a system containing ~210K atoms, ~110 of which are
QM atoms. With the popt build of cp2k-3.0 I get about 150 s/step on one node
(28 cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz). On 2 nodes and on
4 nodes I still get ~150 s/step, so there is no speedup at all.
I also compiled the psmp version and tested several OpenMP settings on two
nodes; the best performance came from 1 thread per rank and 56 MPI ranks.
These tests were run with plumed-2.2.4 enabled, but turning PLUMED off did
not change the speed much.
Any advice is appreciated.
Best,
Chenghan
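To put a number on the scaling described above, here is a minimal sketch that turns the quoted step times into speedup and parallel efficiency. It assumes the "~150 s/step" figure applies unchanged to 1, 2 and 4 nodes (these are rounded values from the post, not separate measurements), and `scaling_report` is only an illustrative helper name.

```python
# Approximate step times quoted in the post (seconds per MD step).
# Assumption: the same rounded ~150 s/step value is used for all node counts.
timings = {1: 150.0, 2: 150.0, 4: 150.0}


def scaling_report(timings):
    """Return {nodes: (speedup, efficiency)} relative to the smallest node count."""
    base_nodes = min(timings)
    base_time = timings[base_nodes]
    report = {}
    for nodes, step_time in sorted(timings.items()):
        speedup = base_time / step_time
        # Ideal speedup equals the ratio of node counts.
        efficiency = speedup / (nodes / base_nodes)
        report[nodes] = (speedup, efficiency)
    return report


report = scaling_report(timings)
for nodes, (speedup, efficiency) in report.items():
    print(f"{nodes} node(s): speedup {speedup:.2f}, efficiency {efficiency:.0%}")
```

A flat step time means the speedup stays at 1 while the efficiency halves with every doubling of the node count, which is what one would expect if the dominant routine is not being distributed over the additional ranks.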
The timing profile for a short run is:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                               MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1  1.0    0.296    0.422 1869.596 1869.598
qs_mol_dyn_low                       1  2.0    0.020    0.035 1836.976 1839.963
velocity_verlet                     10  3.0    0.096    0.100 1662.628 1662.773
qmmm_forces                         11  3.9  429.413  894.165 1083.014 1083.020
qmmm_forces_with_gaussian           11  4.9    0.061    0.078  652.757 1081.757
qmmm_force_with_gaussian_low        11  5.9    0.000    0.001  645.660 1074.663
qmmm_forces_gaussian_low_R          11  6.9    0.000    0.000  611.736 1039.217
qmmm_forces_with_gaussian_LG        11  7.9  611.736 1039.217  611.736 1039.217
qmmm_el_coupling                    11  3.9    0.000    0.000  647.601  649.152
qmmm_elec_with_gaussian             11  4.9  241.206  492.628  647.581  649.132
qmmm_elec_with_gaussian_low         11  5.9    0.000    0.000  400.905  642.050
qmmm_elec_gaussian_low_R            11  6.9    0.000    0.000  367.338  606.291
qmmm_elec_with_gaussian_LG          11  7.9  367.338  606.291  367.338  606.291
qs_forces                           11  3.9    0.030    0.042   89.895   89.917
qs_energies                         11  4.9    0.004    0.007   81.215   81.237
scf_env_do_scf                      11  5.9    0.001    0.004   73.334   73.336
scf_env_do_scf_inner_loop           63  6.9    0.003    0.010   60.146   60.357
rebuild_ks_matrix                   74  8.4    0.000    0.000   46.092   46.178
qs_ks_build_kohn_sham_matrix        74  9.4    0.046    0.058   46.092   46.177
qs_ks_update_qs_env                 74  7.9    0.001    0.001   37.734   37.806
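The profile can be condensed into per-call costs and wall-time shares. The following is a rough sketch using only numbers copied from the table: it takes the TOTAL TIME MAXIMUM of the three routines at ASD 3.9, and it uses the ratio of maximum to average SELF TIME as a crude load-imbalance indicator. Treating these maxima as additive is an approximation, and the names `breakdown`, `shares` and `imbalance` are illustrative only.

```python
# Values copied from the timing profile (seconds).
TOTAL_WALL = 1869.598  # CP2K row, TOTAL TIME MAXIMUM

# (routine, calls, TOTAL TIME MAXIMUM) for the routines at ASD 3.9
rows = [
    ("qmmm_forces", 11, 1083.020),
    ("qmmm_el_coupling", 11, 649.152),
    ("qs_forces", 11, 89.917),
]


def breakdown(rows, total):
    """Map routine name -> (seconds per call, share of total wall time)."""
    return {name: (t / calls, t / total) for name, calls, t in rows}


shares = breakdown(rows, TOTAL_WALL)
qmmm_share = shares["qmmm_forces"][1] + shares["qmmm_el_coupling"][1]

# SELF TIME maximum / average for the two *_LG kernels; a ratio well above 1
# suggests that some ranks do much more of this work than others.
imbalance = {
    "qmmm_forces_with_gaussian_LG": 1039.217 / 611.736,
    "qmmm_elec_with_gaussian_LG": 606.291 / 367.338,
}

for name, (per_call, share) in shares.items():
    print(f"{name:<18} {per_call:8.1f} s/call  {share:6.1%} of wall time")
print(f"QM/MM coupling total: {qmmm_share:.1%}")
for name, ratio in imbalance.items():
    print(f"{name:<30} max/avg self time = {ratio:.2f}")
```

With these numbers the two QM/MM coupling routines account for a little over 90% of the wall time, while qs_forces (the DFT part) stays below 5%, so the flat scaling appears to come from the QM/MM electrostatic coupling rather than from the SCF.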
-----------
This is my arch file for compiling the popt:
include /home/chhli/local/lib/plumed/src/lib/Plumed.inc
CC = mpiicc
CPP =
FC = mpiifort
LD = mpiifort
AR = xiar -r
INTEL_MKL= $(MKLROOT)
INTEL_INC = $(MKLROOT)/include/fftw
INTEL_INC2 = $(MKLROOT)/include
INTEL_LIB = $(MKLROOT)/lib/intel64
DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK \
         -D__FFTW3 -D__FFTMKL -D__PLUMED2
CPPFLAGS = -C -traditional $(DFLAGS) -I$(INTEL_INC) -I$(INTEL_INC2)
FCFLAGS = $(DFLAGS) -I$(INTEL_INC) -I$(INTEL_INC2) -O2 -xHost \
          -heap-arrays 64 -funroll-loops -fpp -free
FCFLAGS2 = -I$(INTEL_INC) -I$(INTEL_INC2) -O1 -xHost -heap-arrays 64 \
           -fpp -free $(DFLAGS)
LDFLAGS = $(FCFLAGS)
LIBS = -L$(INTEL_LIB) -Wl,--start-group \
$(MKLROOT)/lib/intel64/libmkl_scalapack_lp64.a \
$(MKLROOT)/lib/intel64/libmkl_blacs_intelmpi_lp64.a \
$(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
$(MKLROOT)/lib/intel64/libmkl_sequential.a \
$(MKLROOT)/lib/intel64/libmkl_core.a \
-Wl,--end-group -lpthread -lm -lz -ldl -lstdc++ \
-lplumed -L/home/chhli/local/lib
OBJECTS_ARCHITECTURE = machine_intel.o
graphcon.o: graphcon.F
	$(FC) -c $(FCFLAGS2) $<
et_coupling.o: et_coupling.F
	$(FC) -c $(FCFLAGS2) $<
qs_vxc_atom.o: qs_vxc_atom.F
	$(FC) -c $(FCFLAGS2) $<
-----------
This is the arch file I used to compile the psmp version:
include /home/chhli/local/lib/plumed/src/lib/Plumed.inc
#EXTERNAL_OBJECTS=$(PLUMED_STATIC_DEPENDENCIES)
CC = mpiicc
CPP =
FC = mpiifort
LD = mpiifort
AR = xiar -r
INTEL_MKL= $(MKLROOT)
INTEL_INC = $(MKLROOT)/include/fftw
INTEL_INC2 = $(MKLROOT)/include
INTEL_LIB = $(MKLROOT)/lib/intel64
DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK \
         -D__FFTW3 -D__FFTMKL -D__PLUMED2 -D__MKL
CPPFLAGS = -C -traditional $(DFLAGS) -I$(INTEL_INC) -I$(INTEL_INC2)
FCFLAGS = $(DFLAGS) -I$(INTEL_INC) -I$(INTEL_INC2) -O2 -xHost \
          -heap-arrays 64 -funroll-loops -fpp -free -openmp
FCFLAGS2 = -I$(INTEL_INC) -I$(INTEL_INC2) -O1 -xHost -heap-arrays 64 \
           -fpp -free $(DFLAGS) -openmp
LDFLAGS = $(FCFLAGS)
LIBS = -L$(INTEL_LIB) -Wl,--start-group \
$(MKLROOT)/lib/intel64/libmkl_scalapack_lp64.a \
$(MKLROOT)/lib/intel64/libmkl_blacs_intelmpi_lp64.a \
$(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
$(MKLROOT)/lib/intel64/libmkl_sequential.a \
$(MKLROOT)/lib/intel64/libmkl_core.a \
-Wl,--end-group -lpthread -lm -lz -ldl -lstdc++ \
-lplumed -L/home/chhli/local/lib -openmp
OBJECTS_ARCHITECTURE = machine_intel.o
graphcon.o: graphcon.F
	$(FC) -c $(FCFLAGS2) $<
et_coupling.o: et_coupling.F
	$(FC) -c $(FCFLAGS2) $<
qs_vxc_atom.o: qs_vxc_atom.F
	$(FC) -c $(FCFLAGS2) $<
----------
The libraries I am using:
fftw3/3.3.5
intel/16.0
intelmpi/5.1
libmatheval/1.1
mkl/2017
plumed/2.2.4
----------
The input I am using:
@set dir ../dft/
&FORCE_EVAL
METHOD QMMM
&DFT
BASIS_SET_FILE_NAME ${dir}GTH_BASIS_SETS
POTENTIAL_FILE_NAME ${dir}GTH_POTENTIALS
!HERE
MULTIPLICITY 1
CHARGE -0
&MGRID
COMMENSURATE
CUTOFF 360
&END MGRID
&QS
METHOD GPW
EXTRAPOLATION ASPC
EXTRAPOLATION_ORDER 2
&END QS
WFN_RESTART_FILE_NAME SERCA-RESTART.wfn
&SCF
EPS_SCF 1.0E-6
MAX_SCF 400
SCF_GUESS RESTART
&OUTER_SCF
EPS_SCF 1.0E-6
MAX_SCF 100
&END
&OT
MINIMIZER DIIS
PRECONDITIONER FULL_ALL
ENERGY_GAP 0.001
STEPSIZE 0.10
&END OT
&END SCF
&XC
&XC_FUNCTIONAL BLYP
&END XC_FUNCTIONAL
&vdW_POTENTIAL
POTENTIAL_TYPE PAIR_POTENTIAL
&PAIR_POTENTIAL
TYPE DFTD3
PARAMETER_FILE_NAME ${dir}dftd3.dat
REFERENCE_FUNCTIONAL BLYP
&END PAIR_POTENTIAL
&END vdW_POTENTIAL
&XC_GRID
XC_SMOOTH_RHO SPLINE2
XC_DERIV SPLINE2_SMOOTH
&END XC_GRID
&END XC
&END DFT
&MM
&FORCEFIELD
PARMTYPE CHM
PARM_FILE_NAME ${dir}force_fields/par_charmm36.prm
EI_SCALE14 1.0
VDW_SCALE14 1.0
&SPLINE
EMAX_SPLINE 1.0E10
RCUT_NB 12.0
&END SPLINE
&END FORCEFIELD
&NEIGHBOR_LISTS
GEO_CHECK F
&END NEIGHBOR_LISTS
&POISSON
&EWALD
EWALD_TYPE spme
GMAX 120 120 150
&END EWALD
&END POISSON
&END MM
&SUBSYS
&CELL
ABC 116.801 116.801 153.240
&END CELL
&TOPOLOGY
CONN_FILE serca_e908.psf
CONNECTIVITY PSF
COORD_FILE_NAME serca_e908.pdb
COORD_FILE_FORMAT PDB
&END TOPOLOGY
&KIND H
ELEMENT H
BASIS_SET TZV2P-GTH
POTENTIAL GTH-BLYP-q1
&END KIND
&KIND O
ELEMENT O
BASIS_SET TZV2P-GTH
POTENTIAL GTH-BLYP-q6
&END KIND
&KIND C
ELEMENT C
BASIS_SET TZV2P-GTH
POTENTIAL GTH-BLYP-q4
&END KIND
&KIND N
ELEMENT N
BASIS_SET TZV2P-GTH
POTENTIAL GTH-BLYP-q5
&END KIND
&KIND S
ELEMENT S
BASIS_SET TZV2P-GTH
POTENTIAL GTH-BLYP-q6
&END KIND
&KIND CL
ELEMENT Cl
BASIS_SET TZV2P-GTH
POTENTIAL GTH-BLYP-q7
&END KIND
&END SUBSYS
&QMMM
PARALLEL_SCHEME GRID
&CELL
ABC 18 24 32
PERIODIC XYZ
&END CELL
E_COUPL GAUSS
NOCOMPATIBILITY
USE_GEEP_LIB 12
&MM_KIND O
RADIUS 0.7
&END MM_KIND
&MM_KIND OH1
RADIUS 0.7
&END MM_KIND
&MM_KIND OT
RADIUS 1.2
&END MM_KIND
&MM_KIND HT
RADIUS 0.4
&END MM_KIND
&MM_KIND OTH
RADIUS 1.2
&END MM_KIND
&MM_KIND HTH
RADIUS 0.4
&END MM_KIND
&MM_KIND CLA
RADIUS 1.0
&END MM_KIND
&MM_KIND H
RADIUS 0.3
&END MM_KIND
&MM_KIND HA
RADIUS 0.3
&END MM_KIND
&MM_KIND HB
RADIUS 0.3
&END MM_KIND
&MM_KIND C
RADIUS 0.7
&END MM_KIND
&MM_KIND CT1
RADIUS 0.7
&END MM_KIND
&MM_KIND CT2
RADIUS 0.7
&END MM_KIND
&MM_KIND CT3
RADIUS 0.7
&END MM_KIND
&MM_KIND CC
RADIUS 0.7
&END MM_KIND
&MM_KIND NH2
RADIUS 0.7
&END MM_KIND
&MM_KIND NH1
RADIUS 0.7
&END MM_KIND
&MM_KIND OC
RADIUS 0.7
&END MM_KIND
&MM_KIND CC
RADIUS 0.7
&END MM_KIND
&MM_KIND CA
RADIUS 0.7
&END MM_KIND
&MM_KIND HP
RADIUS 0.3
&END MM_KIND
&MM_KIND CTC
RADIUS 0.7
&END MM_KIND
&MM_KIND NTC
RADIUS 0.7
&END MM_KIND
&MM_KIND NO
RADIUS 0.7
&END MM_KIND
&MM_KIND ON
RADIUS 0.7
&END MM_KIND
&PERIODIC
GMAX .5
&MULTIPOLE
RCUT 12.0
&END MULTIPOLE
&END PERIODIC
&WALLS
TYPE QUADRATIC
K 0.1
WALL_SKIN 2 2 2
&END WALLS
&QM_KIND H
MM_INDEX 12270 12271 12274 12276 12280 12282 12284 12286 12343 12345 \
12346 12347 12349 12350 12351 12359 12361 12363 12364 12365 13952 13954 \
13955 13956 13958 13959 13960 14001 14002 14004 14005 14016 14017 14019 \
14020 14023 14024 14025 14506 14507 14509 14564 14565 14567 14570 14573 \
77405 77406 83612 83613 90101 90102 127163 127164 129230 129231 132068 \
132069 133784 133785 151466 151467 155483 155484 180986 180987 209072 \
209073 214082 214083 214084
&END QM_KIND
&QM_KIND O
MM_INDEX 12360 14007 14008 14508 77404 83611 90100 127162 129229 \
132067 133783 151465 155482 180985 209071 214081
&END QM_KIND
&QM_KIND N
MM_INDEX 12275 14566 14571
&END QM_KIND
&QM_KIND C
MM_INDEX 12269 12272 12273 12277 12278 12279 12281 12283 12285 12342 \
12344 12348 12358 12362 13951 13953 13957 14000 14003 14006 14015 14018 \
14022 14505 14563 14568 14569 14572
&END QM_KIND
&QM_KIND S
MM_INDEX 14021
&END QM_KIND
&LINK
QM_INDEX 12269
MM_INDEX 12267
QM_KIND H
LINK_TYPE IMOMM
ALPHA_IMOMM 1.5
&END LINK
&LINK
QM_INDEX 12342
MM_INDEX 12340
QM_KIND H
LINK_TYPE IMOMM
ALPHA_IMOMM 1.5
&END LINK
&LINK
QM_INDEX 12358
MM_INDEX 12356
QM_KIND H
LINK_TYPE IMOMM
ALPHA_IMOMM 1.5
&END LINK
&LINK
QM_INDEX 13951
MM_INDEX 13949
QM_KIND H
LINK_TYPE IMOMM
ALPHA_IMOMM 1.5
&END LINK
&LINK
QM_INDEX 14000
MM_INDEX 13998
QM_KIND H
LINK_TYPE IMOMM
ALPHA_IMOMM 1.5
&END LINK
&LINK
QM_INDEX 14015
MM_INDEX 14013
QM_KIND H
LINK_TYPE IMOMM
ALPHA_IMOMM 1.5
&END LINK
&LINK
QM_INDEX 14505
MM_INDEX 14503
QM_KIND H
LINK_TYPE IMOMM
ALPHA_IMOMM 1.5
&END LINK
&LINK
QM_INDEX 14563
MM_INDEX 14561
QM_KIND H
LINK_TYPE IMOMM
ALPHA_IMOMM 1.5
&END LINK
&END QMMM
&END FORCE_EVAL
&GLOBAL
PROJECT SERCA
RUN_TYPE MD
PRINT_LEVEL LOW
&END GLOBAL
&MOTION
&FREE_ENERGY
&METADYN
USE_PLUMED T
PLUMED_INPUT_FILE ./plumed.dat
&END METADYN
&END FREE_ENERGY
&MD
ENSEMBLE NVT
COMVEL_TOL 1.0E-5
STEPS 1000000
TIMESTEP 0.5
TEMPERATURE 310
&THERMOSTAT
TYPE NOSE
&NOSE
TIMECON 100
&END NOSE
&END THERMOSTAT
&PRINT
&ENERGY SILENT
&EACH
MD 100
&END
FILENAME MDENER.out
&END ENERGY
&END PRINT
&END MD
&GEO_OPT
OPTIMIZER CG
MAX_ITER 200
&END GEO_OPT
&PRINT
&TRAJECTORY
FILENAME NVT
&EACH
MD 50
&END
&END TRAJECTORY
&VELOCITIES OFF
&END VELOCITIES
&RESTART SILENT
&EACH
MD 5
&END
&END
&END PRINT
&END MOTION