[CP2K-user] [CP2K:14777] CP2K 8.1: Runtime error running on more than one node
Tiziano Müller
tiziano... at chem.uzh.ch
Tue Feb 16 08:09:19 UTC 2021
Hi Mauro,
there are some things you can try:
* build CP2K without COSMA, ELPA, SIRIUS and SpFFT and see whether you
can still reproduce the error (simply create an arch file as a copy of
the toolchain-generated one, with the respective -D..., -l..., -I...
flags removed)
* check the toolchain build logs of ELPA and COSMA (cmake.log/make.log)
to verify that the correct MPI implementation was picked up during
compilation, see also [1]
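
The two checks above could be scripted roughly as follows. This is only a
sketch: the function names are made up for illustration, the search patterns
are a guess at what is worth looking for, and the paths in the usage note are
assumptions about where things live on your system.

```shell
#!/bin/sh
# Hypothetical helpers, not part of CP2K or its toolchain.

# Print (with line numbers) the lines of an arch file that mention the
# optional libraries. These are the candidate lines from which the
# corresponding -D..., -l..., -I... flags would have to be removed by hand.
list_optional_lib_lines() {
    grep -n -i -E 'cosma|elpa|sirius|spfft' "$1"
}

# Print (with line numbers) the lines of a build log that mention an MPI
# compiler wrapper or an MPI installation, to see which one was picked up.
list_mpi_mentions() {
    grep -n -i -E 'mpicc|mpicxx|mpifort|mpif90|openmpi|mpich' "$1"
}
```

Possible usage, assuming the arch file follows the usual <arch>.<version>
naming and that you know where the toolchain left its logs:

    list_optional_lib_lines arch/local.${SOLVER}
    list_mpi_mentions /path/to/elpa/build/make.log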
Since other (less complex) software works, my guess is that either there
is/was a mismatch of OpenMPI installations and/or one of the
dependencies tries to use some advanced MPI operations (RMA, I/O) which
are not configured on your cluster (CP2K has RMA off by default).
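
To get a first impression of which of these two guesses applies, something
like the following might help. It is a sketch under assumptions: both helpers
are hypothetical filters that read text on stdin, and the ompi_info listing
only shows which components were built into that OpenMPI installation, not
whether they actually work on your interconnect.

```shell
#!/bin/sh
# Hypothetical filters, not part of CP2K or OpenMPI.

# From the output of `ldd <binary>`, print the lines for the MPI libraries
# (leading whitespace removed), so the resolved paths can be compared with
# the OpenMPI installation that mpiexec comes from.
mpi_libs_from_ldd() {
    awk '$1 ~ /^libmpi/ { sub(/^[ \t]+/, ""); print }'
}

# From the output of `ompi_info`, keep the lines describing the one-sided
# (osc, i.e. RMA), MPI-IO (io) and byte-transfer-layer (btl) components.
mpi_components_from_ompi_info() {
    grep -E 'MCA (osc|io|btl):'
}
```

Possible usage, reusing the variables from the mpiexec command quoted below
(assuming ompi_info sits in the bin directory of that installation):

    ldd ${INST_DIR}/cp2k-8.1/exe/local/cp2k.${SOLVER} | mpi_libs_from_ldd
    ${MPI_INST_DIR}/bin/ompi_info | mpi_components_from_ompi_info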
Best,
Tiziano
[1] https://github.com/cp2k/cp2k/issues/1351
On 2/15/21 10:07 PM, Mauro Sgroi wrote:
> Dear Alin,
> thanks a lot for the reply.
> With those mpi we are running smoothly Quantum Espresso and other codes.
> So it seems strange to find this error.
> Best regards,
> Mauro Sgroi.
> On Monday, 15 February 2021 at 18:43:31 UTC+1, al... at gmail.com
> wrote:
>
> Hi Mauro,
>
> I have seen this before with other codes... the issue invariably ended
> up being due to openmpi not playing nicely with the queuing system.
>
> You may want to check whether a simple MPI hello world using the above
> mpi actually works.
>
> Regards,
> Alin
> Without Questions there are no Answers!
> ______________________________________________________________________
> Dr. Alin Marin ELENA
> http://alin.elena.space/
> ______________________________________________________________________
>
> On Mon, 15 Feb 2021 at 17:25, Mauro Sgroi
> <mauro... at gmail.com> wrote:
> >
> > Dear Developers,
> > I'm writing to ask for your help with setting up CP2K on an HPC server.
> > Our IT compiled the code using gcc-8.3.0, openmpi 4.1.0 and the
> > toolchain:
> >
> > cmake-3.18.5 cosma-2.2.0 elpa-2020.05.001 fftw-3.3.8 gsl-2.6
> > hdf5-1.12.0 libint-v2.6.0-cp2k-lmax-5 libvdwxc-0.4.0 libvori-201229
> > libxc-4.3.4 libxsmm-1.16.1 openblas-0.3.10 scalapack-2.1.0
> > sirius-7.0.0 SpFFT-0.9.13 spglib-1.16.0
> >
> > The obtained code works fine on a single node of our server. When
> > we launch it with the command:
> > mpiexec --prefix ${MPI_INST_DIR} -x PATH -x LD_LIBRARY_PATH -n $CORES -machinefile machine_file ${INST_DIR}/cp2k-8.1/exe/local/cp2k.${SOLVER} -inp ${INPUT} 2>&1
> >
> > on more than one node we get the error:
> >
> > [btl_openib_component.c:1699:init_one_device] error obtaining device attributes for mlx5_0 errno says Protocol not supported
> >
> > Could you please give us some advice?
> >
> > Thanks a lot in advance and best regards,
> >
> > Mauro Sgroi.
> > Centro Ricerche FIAT.
> > Italy.
> >
> > --
> > You received this message because you are subscribed to the
> Google Groups "cp2k" group.
> > To unsubscribe from this group and stop receiving emails from it,
> send an email to cp... at googlegroups.com.
> > To view this discussion on the web visit
> https://groups.google.com/d/msgid/cp2k/cc9bb9af-4e72-4667-ad33-ef04edf7a70bn%40googlegroups.com
> <https://groups.google.com/d/msgid/cp2k/cc9bb9af-4e72-4667-ad33-ef04edf7a70bn%40googlegroups.com>.
>
>
--
Tiziano Müller
University of Zurich
Department of Chemistry
Winterthurerstrasse 190
CH-8057 Zürich
Tel: +41 44 63 54234
www.chem.uzh.ch
tiziano... at chem.uzh.ch