<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:10.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="en-CH" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span lang="DE-CH" style="font-size:11.0pt;mso-fareast-language:EN-US">Hi
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE-CH" style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;mso-fareast-language:EN-US">This looks like an issue with the CUDA installation. You load the module cuda/11.7.1, whereas nvidia-smi reports “CUDA Version: 12.3” and you mention that CUDA toolkit 12.3.2 was used. That mismatch may be worth checking.<o:p></o:p></span></p>
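<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;mso-fareast-language:EN-US">As a rough illustration of that check, here is a minimal Python sketch that compares the CUDA version in the nvidia-smi banner with the version of the loaded cuda module. The function names are made up for this example, and the two input strings are copied from your message; on a real machine you would feed it the output of nvidia-smi and module list instead.<o:p></o:p></span></p>
<pre>
```python
import re


def driver_cuda_version(smi_text):
    """Return (major, minor) from the 'CUDA Version: X.Y' banner of nvidia-smi, or None."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def module_cuda_version(module_text):
    """Return (major, minor) from a loaded 'cuda/X.Y[.Z]' module name, or None."""
    m = re.search(r"\bcuda/(\d+)\.(\d+)", module_text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def versions_differ(smi_text, module_text):
    """True only if both versions were found and they are not equal."""
    drv = driver_cuda_version(smi_text)
    mod = module_cuda_version(module_text)
    return drv is not None and mod is not None and drv != mod


# Strings copied from the quoted message (hypothetical stand-ins for live command output)
smi_line = "| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |"
modules = "1) gcc-9.5.0/gcc 2) cuda/11.7.1 3) tools/cmake-3.28.1 4) tools/git-2.43.0"

print(driver_cuda_version(smi_line))       # (12, 3)
print(module_cuda_version(modules))        # (11, 7)
print(versions_differ(smi_line, modules))  # True
```
</pre>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;mso-fareast-language:EN-US">A difference is not necessarily an error, since the version shown by nvidia-smi is the highest CUDA version the installed driver supports. The sketch only flags something to look at, not a confirmed cause of the cuInit failure.<o:p></o:p></span></p>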
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;mso-fareast-language:EN-US">Best<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;mso-fareast-language:EN-US">Matthias<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div id="mail-editor-reference-message-container">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:36.0pt">
<b><span style="font-size:12.0pt;font-family:"Aptos",sans-serif;color:black">From:
</span></b><span style="font-size:12.0pt;font-family:"Aptos",sans-serif;color:black">cp2k@googlegroups.com &lt;cp2k@googlegroups.com&gt; on behalf of Mike Chen &lt;mike.scchen@gmail.com&gt;<br>
<b>Date: </b>Sunday, 25 February 2024 at 14:03<br>
<b>To: </b>cp2k &lt;cp2k@googlegroups.com&gt;<br>
<b>Subject: </b>[CP2K:19967] Built cuda.ssmp version successfully but failed on regtest<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">Hi all,<br>
I'm trying to build the CUDA ssmp version of CP2K 2024.1 without MPI support (MPI is probably not necessary, since the machine has only one GPU card?).<br>
The build completes without errors, but the regtest fails with:<br>
<br>
</span><span style="font-size:11.0pt;font-family:"Courier New"">[root@gpu01 cp2k-2024.1]# make ARCH=local_cuda VERSION=ssmp test<br>
(......)<br>
========= Python (ssmp) =========<br>
/usr/bin/env python3 --version<br>
Python 3.6.8<br>
----------------------- External Modules ---------------------------------<br>
DBCSR Version: 2.6.0 (2023-07-10)<br>
---------------------------- Modules -------------------------------------<br>
Currently Loaded Modulefiles:<br>
1) gcc-9.5.0/gcc 2) cuda/11.7.1 3) tools/cmake-3.28.1 4) tools/git-2.43.0<br>
*************************** Testing started ****************************<br>
ERROR: cuInit failed with error: 999 /cluster/bld/cp2k-2024.1/src/offload/offload_library.c 57<br>
Program received signal SIGABRT: Process abort signal.<br>
Backtrace for this error:<br>
#0 0x7f8837d243ff in ???<br>
#1 0x7f8837d24387 in ???<br>
#2 0x7f8837d25a77 in ???<br>
#3 0x30fe53b in offload_init<br>
at /cluster/bld/cp2k-2024.1/src/offload/offload_library.c:58<br>
#4 0xaf6e0b in __f77_interface_MOD_init_cp2k<br>
at /cluster/bld/cp2k-2024.1/src/f77_interface.F:234<br>
#5 0x455f88 in cp2k<br>
at /cluster/bld/cp2k-2024.1/src/start/cp2k.F:284<br>
#6 0x40ebdc in main<br>
at /cluster/bld/cp2k-2024.1/src/start/cp2k.F:44<br>
Could not parse feature flags.</span><span style="font-size:11.0pt"><br>
<br>
The machine has an RTX A6000 card, and CUDA toolkit 12.3.2 was used:<br>
</span><span style="font-size:11.0pt;font-family:"Courier New"">+---------------------------------------------------------------------------------------+<br>
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |<br>
|-----------------------------------------+----------------------+----------------------+<br>
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |<br>
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |<br>
| | | MIG M. |<br>
|=========================================+======================+======================|<br>
| 0 NVIDIA RTX A6000 Off | 00000000:3B:00.0 Off | Off |<br>
| 30% 36C P0 67W / 300W | 2MiB / 49140MiB | 2% Default |<br>
| | | N/A |<br>
+-----------------------------------------+----------------------+</span><span style="font-size:11.0pt"><br>
<br>
The OS is CentOS 7.9, and I used a source-built GCC 9.5.0 as a loadable module.<br>
The toolchain was built with:<br>
<br>
</span><span style="font-size:11.0pt;font-family:"Courier New"">./install_cp2k_toolchain.sh --mpi-mode=no --with-cmake=system -j 16 --enable-cuda=yes --gpu-ver=A100</span><span style="font-size:11.0pt"><br>
<br>
and then in the CP2K source folder:<br>
<br>
</span><span style="font-size:11.0pt;font-family:"Courier New"">make -j 16 ARCH=local_cuda VERSION=ssmp</span><span style="font-size:11.0pt"><br>
<br>
The CPU ssmp version (ARCH=local), however, builds fine and passes all the regtests.<br>
Any suggestions on how to make the CUDA version work?<br>
<br>
Mike<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt">--
<br>
You received this message because you are subscribed to the Google Groups "cp2k" group.<br>
To unsubscribe from this group and stop receiving emails from it, send an email to
<a href="mailto:cp2k+unsubscribe@googlegroups.com">cp2k+unsubscribe@googlegroups.com</a>.<br>
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/cp2k/affc7786-0b86-4e51-8d40-22247d0d2ab9n%40googlegroups.com?utm_medium=email&utm_source=footer">
https://groups.google.com/d/msgid/cp2k/affc7786-0b86-4e51-8d40-22247d0d2ab9n%40googlegroups.com</a>.<o:p></o:p></span></p>
</div>
</div>
</div>
</body>
</html>