[CP2K-user] [CP2K:19395] Help with running cp2k in parallel with Slurm

Ava Rajh ava.rajh at gmail.com
Thu Oct 19 12:24:23 UTC 2023


Great, thank you very much for all the help.

best, Ava

On Thu, 19 Oct 2023 at 14:21, Krack Matthias <matthias.krack at psi.ch>
wrote:

> So my guess would be that I should configure my version of OpenMPI to work
> with slurm?
>
>
>
> Yes, your Open MPI installation should be configured with PMI support
> (--with-pmi flag) to work with srun. This issue is not related to the
> container, since both SLURM and Open MPI are on the host side.
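
A quick way to see what the host side currently offers is sketched below. It is only a diagnostic under assumptions: the helper name `have` is made up for this example, and the script merely reports what `ompi_info` and `srun --mpi=list` print on the machine where it is run. If no PMI component shows up, the rebuild with --with-pmi described above would be the next step.

```shell
#!/bin/sh
# have: succeed if the named command is on PATH (helper name is illustrative).
have() { command -v "$1" >/dev/null 2>&1; }

# PMI/PMIx components built into the host Open MPI, if any.
if have ompi_info; then
    ompi_info | grep -i pmi || echo "ompi_info lists no PMI components"
else
    echo "ompi_info not found on PATH"
fi

# PMI plugin types that this SLURM installation offers to srun.
if have srun; then
    srun --mpi=list
else
    echo "srun not found on PATH"
fi
```

Both checks are read-only, so they can be run on a login node before touching the Open MPI build.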
>
>
>
> *From: *cp2k at googlegroups.com <cp2k at googlegroups.com> on behalf of Ava
> Rajh <ava.rajh at gmail.com>
> *Date: *Thursday, 19 October 2023 at 13:56
> *To: *cp2k at googlegroups.com <cp2k at googlegroups.com>
> *Subject: *Re: [CP2K:19393] Help with running cp2k in parallel with Slurm
>
> Hi,
>
>
>
> I tested the provided Docker image. Running inside the container again
> works great. Outside the container, starting the job with mpirun -n works
> and sets the number of processes, but starting it with srun (either through
> a batch file or inline) produces the following error:
>
>
>
> --------------------------------------------------------------------------
> The application appears to have been direct launched using "srun",
> but OMPI was not built with SLURM's PMI support and therefore cannot
> execute. There are several options for building PMI support under
> SLURM, depending upon the SLURM version you are using:
>
>   version 16.05 or later: you can use SLURM's PMIx support. This
>   requires that you configure and build SLURM --with-pmix.
>
>   Versions earlier than 16.05: you must use either SLURM's PMI-1 or
>   PMI-2 support. SLURM builds PMI-1 by default, or you can manually
>   install PMI-2. You must then build Open MPI using --with-pmi pointing
>   to the SLURM PMI library location.
>
> Please configure as appropriate and try again.
> --------------------------------------------------------------------------
>
>
>
> So my guess would be that I should configure my version of OpenMPI to work
> with slurm?
>
>
>
> thanks for all the help, Ava
>
>
>
> On Thu, 19 Oct 2023 at 12:46, Krack Matthias <matthias.krack at psi.ch>
> wrote:
>
> Hi Ava
>
>
>
> Thanks for testing. Indeed, the .sif file does not seem to work properly
> with the host MPI. These .sif files were built directly with apptainer.
> This was a test, and we will no longer provide such containers. Instead, I
> suggest downloading a CP2K production Docker container. Could you test
> such a Docker container with apptainer? You can download it with the
> command
>
> apptainer pull docker://mkrack/cp2k:2023.2_mpich_generic_psmp
>
> or
>
> apptainer pull docker://mkrack/cp2k:2023.2_openmpi_generic_psmp
>
> depending on your host MPI version.
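
As a rough aid for choosing between the two pull commands, the sketch below inspects the banner of `mpirun --version` on the host. The function name `pick_tag` is hypothetical, and the patterns it matches ("Open MPI", "HYDRA", "MPICH") are assumptions about typical version banners, so treat the result as a hint rather than a definitive answer.

```shell
#!/bin/sh
# pick_tag: map an `mpirun --version` banner to one of the two image tags
# mentioned above. The matched patterns are assumptions about typical banners.
pick_tag() {
    case "$1" in
        *"Open MPI"*)    echo "2023.2_openmpi_generic_psmp" ;;
        *HYDRA*|*MPICH*) echo "2023.2_mpich_generic_psmp" ;;
        *)               echo "unknown" ;;
    esac
}

# Query the host MPI launcher if there is one; otherwise leave the banner empty.
if command -v mpirun >/dev/null 2>&1; then
    banner=$(mpirun --version 2>&1 | head -n 3)
else
    banner=""
fi
tag=$(pick_tag "$banner")

if [ "$tag" = "unknown" ]; then
    echo "could not identify the host MPI; pick the tag by hand"
else
    # Only print the command; run it yourself once the tag looks right.
    echo "apptainer pull docker://mkrack/cp2k:$tag"
fi
```

The script prints the pull command instead of running it, so nothing is downloaded by accident.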
>
> A list of the available docker containers can be found here
> <https://hub.docker.com/repository/docker/mkrack/cp2k/tags?page=1&ordering=last_updated>
> .
>
>
>
> Best
>
>
>
> Matthias
>
>
>
> *From: *cp2k at googlegroups.com <cp2k at googlegroups.com> on behalf of Ava
> Rajh <ava.rajh at gmail.com>
> *Date: *Thursday, 19 October 2023 at 12:17
> *To: *cp2k <cp2k at googlegroups.com>
> *Subject: *Re: [CP2K:19391] Help with running cp2k in parallel with Slurm
>
>
>
> Sorry for the confusion, I misread the output.
>
> If I put ntasks in the header of the batch file, I get the same result as
> running with srun alone: the number of tasks determines the number of
> program copies running, each with one MPI rank.
>
>
>
> On Thursday, October 19, 2023 at 10:20:39 AM UTC+2 Ava Rajh wrote:
>
>
>
> Hi Matthias,
>
>
>
> Thank you for the reply
>
>
>
> I tried both: first with just the srun command, and then as part of the
> #SBATCH file.
>
> If I run the "srun" command and specify the number of tasks, I get the
> aforementioned multiple copies of the same program with 1 MPI rank each.
> The same happens if I run the command outside the container without srun
> and specify the number of tasks with just: mpiexec -n 4
>
>
>
> If I put ntasks in the header of the batch file, I get just one copy of
> the program with 1 MPI rank, no matter how many --ntasks are defined.
>
>
>
> Any idea about how to start my calculation would be very helpful, thank
> you.
>
> kind regards, Ava
>
>
>
>
>
> On Wednesday, October 18, 2023 at 7:21:22 PM UTC+2 Krack Matthias wrote:
>
> Hi Ava
>
>
>
> Do you run the “srun” command as part of a SLURM batch job file with a
> #SBATCH header section or interactively?
>
> Your guess is right, the --ntasks flag defines the number of MPI ranks.
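
For reference, a batch file with an #SBATCH header might look like the sketch below. The job name, node count, and rank/thread counts are placeholder values, and the image and input file names are simply the ones used in this thread. Whether the ranks actually join a single MPI job still depends on the host MPI and its PMI support, so this shows only the shape of the file, not a verified setup.

```shell
#!/bin/sh
#SBATCH --job-name=cp2k-test    # illustrative name
#SBATCH --nodes=1
#SBATCH --ntasks=4              # number of MPI ranks
#SBATCH --cpus-per-task=2       # OpenMP threads per rank

# SLURM exports these inside a job; the defaults only matter for a dry run.
ranks="${SLURM_NTASKS:-1}"
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

# Image and input names are taken from the thread; adjust to your files.
image="cp2k-2023.2_mpich_generic_psmp.sif"
input="H2O-32.in"

echo "requesting $ranks MPI rank(s) x $OMP_NUM_THREADS OpenMP thread(s)"

if command -v srun >/dev/null 2>&1; then
    # srun takes the rank count from the allocation (--ntasks above).
    srun apptainer run -B "$PWD" "$image" cp2k -i "$input"
else
    echo "srun not found; submit this file with sbatch on the cluster"
fi
```

Submitted with sbatch, the header sets the allocation and srun inherits the task count, so --ntasks does not need to be repeated on the srun line.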
>
>
>
> Best
>
>
>
> Matthias
>
>
>
> *From: *cp... at googlegroups.com <cp... at googlegroups.com> on behalf of Ava
> Rajh <ava.... at gmail.com>
> *Date: *Wednesday, 18 October 2023 at 14:17
> *To: *cp2k <cp... at googlegroups.com>
> *Subject: *[CP2K:19379] Help with running cp2k in parallel with Slurm
>
> Dear all,
>
> I am trying to run CP2K on our HPC cluster. I am new to parallel computing
> and to working on a cluster, so I would appreciate some help, and I
> apologize if I am missing something obvious.
>
> I am trying to use CP2K in combination with apptainer, and I followed the
> instructions at https://github.com/cp2k/cp2k/tree/master/tools/apptainer
>
>
>
> I have a .sif file in my work directory, and if I work within it (running
> MPI within the container), everything works perfectly and I can set the
> number of MPI processes.
>
>
>
> But when trying to run it through SLURM, I can't seem to set the number of
> MPI processes per node. If I start a command like:
>
> srun --ntasks=2 apptainer run -B $PWD cp2k-2023.2_mpich_generic_psmp.sif
> cp2k -i H2O-32.in
>
> it just starts 2 instances of the program that run at the same time, and
> for each, the total number of message passing processes is 1. I am able to
> set and change the number of OpenMP threads though.
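
One way to confirm this symptom from the output file is to look for the rank count that CP2K reports in its header. The sketch below assumes the line contains the text "Total number of message passing processes" followed by the count; the function name `ranks_in_output` and the one-line sample file are made up for illustration.

```shell
#!/bin/sh
# ranks_in_output: print the rank count(s) reported in a CP2K output file.
# Assumes a header line ending in "Total number of message passing processes  N".
ranks_in_output() {
    grep "Total number of message passing processes" "$1" | awk '{print $NF}'
}

# Hypothetical one-line excerpt standing in for a real CP2K output file.
sample=$(mktemp)
printf '%s\n' \
    ' GLOBAL| Total number of message passing processes    1' > "$sample"

echo "ranks reported: $(ranks_in_output "$sample")"
rm -f "$sample"
```

If independent copies write to the same output, the line would likely appear once per copy, each time with a count of 1, whereas a properly coupled run should report the requested number of ranks.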
>
> So my first question is: am I wrong to assume that --ntasks X should
> correspond to the number of MPI processes? And if so, how would I set it?
>
>
>
> Please let me know if I need to provide any more information to diagnose
> the issue.
>
> Thank you very much for the help and kind regards,
>
> Ava Rajh
>
> --
> You received this message because you are subscribed to the Google Groups
> "cp2k" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cp2k+uns... at googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/cp2k/2393b350-7de0-4599-97e0-4f52ffec0c5en%40googlegroups.com
> <https://groups.google.com/d/msgid/cp2k/2393b350-7de0-4599-97e0-4f52ffec0c5en%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
