The error in the log shows that COSMA is in use:<div><br></div><div><div>#7  0x2dfb43b in check_runtime_status</div><div><span style="white-space:pre">     </span>at /apps/chpc/chem/gpu/cp2k/8.1.0/tools/toolchain/build/cosma-2.2.0/libs/Tiled-MM/src/Tiled-MM/util.hpp:17</div><div>#8  0x2dfb43b in _ZNK3gpu13device_stream13enqueue_eventEv</div><div><span style="white-space:pre">       </span>at /apps/chpc/chem/gpu/cp2k/8.1.0/tools/toolchain/build/cosma-2.2.0/libs/Tiled-MM/src/Tiled-MM/device_stream.hpp:62</div><div>#9  0x2dfb43b in _ZN3gpu11round_robinIdEEvRNS_12tiled_matrixIT_EES4_S4_RNS_13device_bufferIS2_EES7_S7_iiiS2_S2_RNS_9mm_handleIS2_EE</div><div><span style="white-space:pre">    </span>at /apps/chpc/chem/gpu/cp2k/8.1.0/tools/toolchain/build/cosma-2.2.0/libs/Tiled-MM/src/Tiled-MM/tiled_mm.cpp:248</div><div><br></div><div>....</div><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Friday, 23 April 2021 at 10:00:35 UTC+2 ASSIDUO Network wrote:<br/></div><blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr">Dear Fabian, COSMA wasn't installed with CP2K, so that can't be the issue. The HPC system is not a Cray, but I did ask the HPC admin to look into it.</div><br><div class="gmail_quote"></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 22, 2021 at 8:02 PM <a href data-email-masked rel="nofollow">fa...@gmail.com</a> <<a href data-email-masked rel="nofollow">fa...@gmail.com</a>> wrote:<br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Hi,</div><div><br></div><div>CP2K is crashing when COSMA tries to access a GPU ("<span>error: GPU API call : invalid resource handle</span>"). On Cray systems, an environment variable has to be set with "export CRAY_CUDA_MPS=1". 
Otherwise, only one MPI rank can access a specific GPU device. Maybe there is a similar setting for your cluster?</div><div><br></div><div>Also, CP2K can be memory hungry. Setting "ulimit -s unlimited" is often needed.</div><div><br></div><div>I hope this helps,</div><div>Fabian<br></div><br><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Thursday, 22 April 2021 at 19:36:35 UTC+2 ASSIDUO Network wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Oh, you meant the error file. Please find it attached.<br><br>I have run it on CPU only and on one GPU. It works.</div><br><div class="gmail_quote"></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 22, 2021 at 7:31 PM Alfio Lazzaro <<a rel="nofollow">al...@gmail.com</a>> wrote:<br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I'm sorry, I cannot assist you, I'm not an expert on how to use CP2K (I'm not a domain scientist). Without the full log, I can't help you...<div>I assume you have a log file from PBS where you can see the error message. My guess is that it is a memory limit.</div><div>Have you run on CPU only?<br><div><div><br></div><div><br><br></div></div></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Thursday, 22 April 2021 at 17:45:06 UTC+2 ASSIDUO Network wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Here's the log file. 
The job ended prematurely.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 22, 2021 at 3:23 PM Lenard Carroll <<a rel="nofollow">len...@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Not sure yet. The job is still in the queue. As soon as it is finished I'll post the log file info here.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 22, 2021 at 3:15 PM Alfio Lazzaro <<a rel="nofollow">al...@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">And does it work? Check the output and the performance... It may be that your particular test case doesn't use the GPU at all, so could you attach the log (at least the final part of it)?<br><br><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Thursday, 22 April 2021 at 13:42:16 UTC+2 ASSIDUO Network wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr">I am using 30 threads now over 3 GPUs, so I used:</div><div dir="ltr"><br><div>export OMP_NUM_THREADS=10<br></div><div>mpiexec -n 3 cp2k.psmp -i gold50.inp -o gold50.out<br></div><div><br></div></div></div></div><br><div class="gmail_quote"></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 22, 2021 at 1:34 PM Alfio Lazzaro <<a rel="nofollow">al...@gmail.com</a>> wrote:<br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Wait, I see you have 32 threads in total, so you need 32/4 = 8 threads per rank.<div>Please change it to:</div><div><br></div><div>export OMP_NUM_THREADS=8<br><br></div><div class="gmail_quote"><div dir="auto" 
class="gmail_attr">On Thursday, 22 April 2021 at 13:27:59 UTC+2 ASSIDUO Network wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Shall do. I already set it up, but it's in a long queue.</div><br><div class="gmail_quote"></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 22, 2021 at 1:22 PM Alfio Lazzaro <<a rel="nofollow">al...@gmail.com</a>> wrote:<br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Could you try what I suggested:<div><br></div><div><div>export OMP_NUM_THREADS=10</div><div>mpirun -np 4 ./cp2k.psmp -i gold.inp -o gold_pbc.out</div><br></div><div>Please check the corresponding log.</div><div><br></div><div>As I said above, you need one MPI rank per GPU, and you told us that you have 4 GPUs, so you need 4 ranks (or a multiple of 4). With 10 ranks, the load is unbalanced.</div><div><br></div><div><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Thursday, 22 April 2021 at 10:17:27 UTC+2 ASSIDUO Network wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">Correction, he told me to use:<div><br></div><div><div>mpirun -np 10 cp2k.psmp -i gold.inp -o gold_pbc.out</div><div><br></div></div><div>but it didn't run correctly.</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 22, 2021 at 9:51 AM Lenard Carroll <<a rel="nofollow">len...@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">He suggested I try out:<div><div>mpirun -n 10 cp2k.psmp -i gold.inp -o gold_pbc.out</div><div><br></div><div>as he is hoping that will 
cause the 1 GPU to use 10 CPUs over the selected 4 GPUs.</div><div><br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 22, 2021 at 9:48 AM Alfio Lazzaro <<a rel="nofollow">al...@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<div>Your command to run CP2K doesn't mention MPI (mpirun, mpiexec, ...). Are you running with multiple ranks?</div><div><br></div><div>You can check these lines in the output:</div><div><br></div><div><div> GLOBAL| Total number of message passing processes                            32</div><div> GLOBAL| Number of threads for this process                                    4</div></div><div><br></div><div>And check your numbers.</div><div>My guess is that you have 1 rank and 40 threads.</div><div>To use 4 GPUs you need 4 ranks (and fewer threads per rank), i.e. something like</div><div><br></div><div>export OMP_NUM_THREADS=10</div><div>mpiexec -n 4 ./cp2k.psmp -i gold.inp -o gold_pbc.out</div><div><br></div><div>Please check with your sysadmin on how to run with multiple MPI ranks.</div><div><br></div><div>Hope it helps.</div><div><br></div><div>Alfio</div><div><br></div><div><br><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Wednesday, 21 April 2021 at 09:26:53 UTC+2 ASSIDUO Network wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">This is what my PBS file looks like:<div><br></div><div><div>#!/bin/bash</div><div>#PBS -P &lt;PROJECT&gt;</div><div>#PBS -N &lt;JOBNAME&gt;</div><div>#PBS -l select=1:ncpus=40:ngpus=4</div><div>#PBS -l walltime=08:00:00</div><div>#PBS -q gpu_4</div><div>#PBS -m be</div><div>#PBS -M none</div><div><br></div><div>module purge</div><div><span style="background-color:transparent">module load 
chpc/cp2k/8.1.0/cuda10.1/openmpi-4.0.0/gcc-7.3.0</span><br></div><div><span style="background-color:transparent">source $SETUP</span><br></div><div><span style="background-color:transparent">cd $PBS_O_WORKDIR</span><br></div><div><br></div><div>cp2k.psmp -i gold.inp -o gold_pbc.out</div></div></div></div><br><div class="gmail_quote"></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 21, 2021 at 9:22 AM Alfio Lazzaro <<a rel="nofollow">al...@gmail.com</a>> wrote:<br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The way to use 4 GPUs per node is to use 4 MPI ranks. How many ranks are you using?<br><br><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Tuesday, 20 April 2021 at 19:44:15 UTC+2 ASSIDUO Network wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I'm asking because the administrator running my country's HPC says that although I'm requesting access to 4 GPUs, CP2K is only using 1. I checked the following output:<div><div> DBCSR| ACC: Number of devices/node                                            4<br></div><div><br></div><div>And it shows that CP2K is picking up 4 GPUs.</div><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Tuesday, April 20, 2021 at 3:00:17 PM UTC+2 ASSIDUO Network wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I currently have access to 4 GPUs to run an AIMD simulation, but only one of the GPUs is being used. 
Is there a way to use the other 3, and if so, can you tell me how to set it up with a PBS job?</blockquote></div></blockquote></div>
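<div>A minimal sketch of the one-rank-per-GPU arithmetic discussed in this thread, assuming the 40 cores and 4 GPUs from the "select=1:ncpus=40:ngpus=4" line of the PBS file. The variable names NCPUS, NGPUS and LAUNCH are illustrative only, and the block prints the launch line as a dry run instead of starting CP2K; how MPI jobs are actually launched depends on the cluster, so check with your sysadmin.</div>

```shell
#!/bin/bash
# Assumed resources, copied by hand from the PBS header
# (select=1:ncpus=40:ngpus=4); adjust to your own request.
NCPUS=40
NGPUS=4

# One MPI rank per GPU; split the cores evenly among the ranks.
export OMP_NUM_THREADS=$((NCPUS / NGPUS))

# Dry run: build and print the launch line instead of executing it.
LAUNCH="mpiexec -n ${NGPUS} cp2k.psmp -i gold.inp -o gold_pbc.out"
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
echo "${LAUNCH}"
```

<div>In a real job script the printed line would replace the bare "cp2k.psmp -i gold.inp -o gold_pbc.out" call, and the two GLOBAL| lines in the CP2K output can then be used to confirm the rank and thread counts.</div>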

<p></p></blockquote></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

-- <br>
You received this message because you are subscribed to the Google Groups "cp2k" group.<br>
To unsubscribe from this group and stop receiving emails from it, send an email to <a rel="nofollow">cp...@googlegroups.com</a>.<br>
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/cp2k/70ba0fce-8636-4b75-940d-133ce4dbf0can%40googlegroups.com?utm_medium=email&utm_source=footer" rel="nofollow" target="_blank" data-saferedirecturl="https://www.google.com/url?hl=it&q=https://groups.google.com/d/msgid/cp2k/70ba0fce-8636-4b75-940d-133ce4dbf0can%2540googlegroups.com?utm_medium%3Demail%26utm_source%3Dfooter&source=gmail&ust=1619252595045000&usg=AFQjCNGiV53IgPXgkqaiz_vuv0vk0A9mhQ">https://groups.google.com/d/msgid/cp2k/70ba0fce-8636-4b75-940d-133ce4dbf0can%40googlegroups.com</a>.<br>
</blockquote></div>
</blockquote></div>

<p></p>

</blockquote></div>
</blockquote></div>
</blockquote></div>

<p></p>

</blockquote></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
</blockquote></div>
</blockquote></div>

<p></p>

</blockquote></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
</blockquote></div>
</blockquote></div>

<p></p>

</blockquote></div>
</blockquote></div>
</blockquote></div>

<p></p>

</blockquote></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
</blockquote></div>
</blockquote></div>

<p></p>

</blockquote></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
</blockquote></div>
</blockquote></div>