[CP2K:9405] Re: running cuda-enabled cp2k on multiple nodes with aprun/HPC

Ada Sedova ada.a.... at gmail.com
Thu Sep 14 15:27:27 UTC 2017


Hi,

I get 3 cp2k processes when I try this, plus the grep process. I also got
the "[1]+ Stopped  aprun -n 4 (etc.)" message again with this config, while
the job keeps running. I had the same thing happen with another program
yesterday, so it may just be a quirk of interactive mode on Titan with
multiple processes and the GPU that I hadn't noticed before. I will do some
more research. The jobs seem to complete correctly, and so far 10 steps of
H2O-32 finish in about 28 minutes on 1 node and about 18 minutes on 2
nodes. I may need to tweak the number of processes I ask for, the
threading, etc. Any advice on this would be great.
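
To put a rough number on the scaling, here is a small sketch. The function
name "scaling" is made up for illustration, and the inputs are the
approximate timings quoted above, so the result is only indicative:

```shell
# Speedup and parallel efficiency from two wall-clock times.
# Arguments: t_base n_base t_scaled n_scaled (times in any common unit).
# "scaling" is a hypothetical helper name, not part of CP2K or aprun.
scaling() {
    awk -v t1="$1" -v n1="$2" -v t2="$3" -v n2="$4" 'BEGIN {
        speedup = t1 / t2                  # how much faster the larger run is
        efficiency = speedup / (n2 / n1)   # fraction of ideal linear scaling
        printf "speedup=%.2f efficiency=%.0f%%\n", speedup, 100 * efficiency
    }'
}

# About 28 minutes on 1 node versus about 18 minutes on 2 nodes.
scaling 28 1 18 2
```

With these timings that works out to roughly a 1.56x speedup, i.e. about
78% parallel efficiency going from 1 node to 2.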

Now, I was wondering about getting the maximum acceleration for cp2k. Does
libcusmm take the place of libsmm, or should I still rebuild with libsmm
as well? Also, what about ELPA and the other optional libraries that
improve performance? I am mostly concerned with linear-scaling DFT for
AIMD, but I also need dispersion corrections, and I don't know how well
the two work together. Also, I did not build with libsci_acc, which I
probably should do. Can you tell me everything that should be included in
the build to get maximally-scaling DFT-MD?

Thanks so much,

Ada

On Thu, Sep 14, 2017 at 1:44 AM, Andreas Glöss <andreas... at gmail.com>
wrote:

> Dear Ada,
>
> I can't see any obvious mistake, and here at CSCS/Piz Daint we only have
> pure SLURM (srun), so let's proceed systematically:
>
> 1) Swap the order of module load/unload (maybe there is a mistake in
> Cray's modules), like this:
> module swap PrgEnv-pgi/5.2.82 PrgEnv-gnu
> module load fftw
> module load cudatoolkit
> then distclean and recompile CP2K.
>
> 2) Start an interactive session for just one node with:
> qsub -I -A stf006 -l nodes=1,walltime=00:40:00
>
> 3) Start CP2K using 4 MPI-Ranks and 1 OMP-Thread/Rank with:
> module swap PrgEnv-pgi/5.2.82 PrgEnv-gnu
> module load cudatoolkit
> module load fftw
> export CRAY_CUDA_PROXY=1
> export OMP_NUM_THREADS=1
> aprun -n 4 /ccs/home/adaa/cp2k-titan-gnu4.9.3/cp2k/exe/titan_gnu_cuda/cp2k.psmp -i H2O-32.inp -o H2O-32.out
>
> 4) Use ps to determine the number of cp2k.psmp processes running (not
> aprun processes) with:
> ps -ef | grep cp2k.psmp
>
> What's the outcome - 4 or more?
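
A small sketch for counting only the cp2k.psmp processes themselves, since
a plain grep also matches the grep process and the aprun line (both carry
the string cp2k.psmp in their arguments). The helper name "count_procs" is
made up for illustration, and it assumes the usual "ps -ef" layout in which
the command is the 8th column:

```shell
# Count processes whose executable basename equals the given name.
# Reads "ps -ef"-style text on stdin; assumes the command starts in field 8.
# "count_procs" is a hypothetical helper name used only for this sketch.
count_procs() {
    awk -v name="$1" '{
        cmd = $8
        sub(/.*\//, "", cmd)        # strip the directory part of the path
        if (cmd == name) count++    # skips grep and aprun, whose field 8 differs
    } END { print count + 0 }'
}

# Intended use:
#   ps -ef | count_procs cp2k.psmp
```
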
>
> Best regards,
> Andreas