[CP2K:2946] Re: libdbcsr, MPI error
Pietro Vidossich
vi... at klingon.uab.es
Thu Nov 25 15:57:52 UTC 2010
Dear Roger,
I have the same problem (long jobs cannot run to completion) on MareNostrum in
Barcelona. The system administrators at the Barcelona Supercomputing Center
concluded that the problem lies in the MPI-2 implementation on MareNostrum
(which apparently is not standard). They have not been able to make it work
properly. I see you have a similar machine, so you may be facing the same
problem. If you solve it in Madrid, please let me know.
Regards,
Pietro
On Thu, 25 Nov 2010 06:53:52 -0800 (PST), nadler wrote
> The error still comes up after several attempts. The currently
> installed version is 2.2.45. Furthermore, as mentioned in the first post:
> independently of the number of cpus chosen, the execution stops after
> a certain amount of cpu hours. In my case it is around 100 +/-10
> hours; the guy from the support team told me that the same happens to
> him after 163 +/-2 cpu hours, using the same input file I am using.
> Any ideas about what could be the problem? Below is information
> about the clusters, the compiler and the current archfile.
> Thanks!
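
A roughly constant CPU-hour budget before failure means the wall-clock time to failure shrinks as the number of CPUs grows. The sketch below illustrates that arithmetic using the two budgets quoted above (about 100 and 163 CPU hours). It assumes CPU hours are simply wall-clock hours multiplied by the number of CPUs; the function name and the CPU counts are hypothetical and chosen only for illustration.

```python
def wall_hours_to_failure(cpu_hours_budget: float, n_cpus: int) -> float:
    """Estimate wall-clock hours until the job stops.

    Assumption: CPU hours = wall-clock hours * number of CPUs, and the
    job stops once the accumulated CPU hours reach the given budget.
    """
    if n_cpus <= 0:
        raise ValueError("n_cpus must be positive")
    return cpu_hours_budget / n_cpus


if __name__ == "__main__":
    # Budgets reported in the quoted message; CPU counts are illustrative.
    for budget in (100.0, 163.0):
        for n_cpus in (16, 64, 256):
            hours = wall_hours_to_failure(budget, n_cpus)
            print(f"budget={budget:.0f} CPU h, cpus={n_cpus}: "
                  f"about {hours:.2f} wall-clock h to failure")
```

If the observed wall-clock time to failure follows this 1/N pattern across runs, that would be consistent with a limit tied to accumulated work (for example a counter or resource that is consumed per operation) rather than to elapsed time.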
>
> Compiler: IBM XL Fortran for Linux, V12.1
>
> The information about the machines:
> First, 1036 nodes of eServer BladeCenter JS20, each with 2 PPC CPUs
> (2.2 GHz) and 4 GB RAM.
> Second, 168 nodes of eServer BladeCenter JS21, each with 4 PPC CPUs
> (2.3 GHz) and 8 GB RAM.
> Communication occurs via Myrinet.
>
> The current archfile is:
>
> PERL = perl
> CC = xlc
> CPP = cpp
> FC = xlf95_r -qsuffix=f=F
> LD = xlf95_r
> AR = ar -r
> DFLAGS = -D__AIX -D__ESSL -D__FFTSG -D__FFTW3 -D__parallel -D__BLACS -D__SCALAPACK -D__LIBINT
> CPPFLAGS = -C $(DFLAGS) -P -traditional \
> -I/gpfs/apps/FFTW/3.2.1/64/include
> FCFLAGS = -O2 -qstrict -q64 -qarch=ppc970 -qcache=auto -qmaxmem=-1 -qtune=ppc970 \
> -I/gpfs/apps/FFTW/3.2.1/64/include \
> -I/gpfs/apps/LIBINT/1.1.4/64/include \
> -I/gpfs/apps/MPICH2/mx/1.0.8p1..3/64/include
> FCFLAGS2 = -O0 -qstrict -q64 -qarch=ppc970 -qcache=auto -qmaxmem=-1 -qtune=ppc970 \
> -I/gpfs/apps/FFTW/3.2.1/64/include \
> -I/gpfs/apps/LIBINT/1.1.4/64/include \
> -I/gpfs/apps/MPICH2/mx/1.0.8p1..3/64/include
> LDFLAGS = $(FCFLAGS) \
> -L/gpfs/apps/LAPACK/3.2.1/64/lib \
> -L/gpfs/apps/SCALAPACK/1.8/mpich2/64 \
> -L/gpfs/apps/FFTW/3.2.1/64/lib \
> -L/gpfs/apps/LIBINT/1.1.4/64/lib \
> -L/opt/ibmcmp/xlmass/5.0/lib64 \
> -L/gpfs/apps/MPICH2/mx/1.0.8p1..3/64/lib \
> -L/gpfs/apps/MPICH2/slurm/64/lib \
> -L/opt/osshpc/mx/lib64 \
> -L/usr/lib64
> LIBS = -lscalapack \
> /gpfs/apps/SCALAPACK/1.8/mpich2/64/blacs.a \
> -lmass_64 \
> -lmpich -lpmi -lmyriexpress -lpthread \
> -llapack -lessl -lfftw3f -lfftw3 -lint -lderiv
>
> OBJECTS_ARCHITECTURE = machine_aix.o
>
> ### To speed up compilation time ###
> pint_types.o: pint_types.F
> $(FC) -c $(FCFLAGS2) $<
> md_run.o: md_run.F
> $(FC) -c $(FCFLAGS2) $<
> kg_energy.o: kg_energy.F
> $(FC) -c $(FCFLAGS2) $<
> integrator.o: integrator.F
> $(FC) -c $(FCFLAGS2) $<
> geo_opt.o: geo_opt.F
> $(FC) -c $(FCFLAGS2) $<
> qmmm_init.o: qmmm_init.F
> $(FC) -c $(FCFLAGS2) $<
> cp2k_runs.o: cp2k_runs.F
> $(FC) -c $(FCFLAGS2) $<
> mc_ensembles.o: mc_ensembles.F
> $(FC) -c $(FCFLAGS2) $<
> ep_methods.o: ep_methods.F
> $(FC) -c $(FCFLAGS2) $<
> mc_ge_moves.o: mc_ge_moves.F
> $(FC) -c $(FCFLAGS2) $<
> force_env_methods.o: force_env_methods.F
> $(FC) -c $(FCFLAGS2) $<
> cp_lbfgs_optimizer_gopt.o: cp_lbfgs_optimizer_gopt.F
> $(FC) -c $(FCFLAGS2) $<
> mc_types.o: mc_types.F
> $(FC) -c $(FCFLAGS2) $<
> f77_interface.o: f77_interface.F
> $(FC) -c $(FCFLAGS2) $<
> mc_moves.o: mc_moves.F
> $(FC) -c $(FCFLAGS2) $<