[CP2K-user] [CP2K:19974] Built cuda.ssmp version successfully but failed on regtest
Krack Matthias
matthias.krack at psi.ch
Mon Feb 26 13:06:42 UTC 2024
Hi
This looks like an issue with the CUDA installation. You load the module cuda/11.7.1, whereas nvidia-smi reports “CUDA Version: 12.3”. That mismatch may be worth checking.
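One way to check this, as a rough sketch: the "CUDA Version" printed by nvidia-smi is the highest CUDA version the installed driver supports, while the loaded module decides which toolkit (nvcc, libcudart) the build actually uses. The two numbers may differ, but the toolkit should not be newer than what the driver supports, and the same toolkit should be active at build time and at run time. The function names below are illustrative helpers, not part of CP2K or CUDA.

```shell
#!/bin/sh
# Sketch only: compare the CUDA version supported by the driver with the
# toolkit found on PATH. All function names here are hypothetical helpers.

# Read nvidia-smi output on stdin, print the "X.Y" after "CUDA Version:".
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Read `nvcc --version` output on stdin, print the "X.Y" after "release".
toolkit_cuda_version() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed if version $1 (major.minor) is less than or equal to version $2.
version_le() {
    a_major=${1%%.*}; a_minor=${1#*.}
    b_major=${2%%.*}; b_minor=${2#*.}
    [ "$a_major" -lt "$b_major" ] ||
        { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -le "$b_minor" ]; }
}

if command -v nvidia-smi >/dev/null 2>&1 && command -v nvcc >/dev/null 2>&1; then
    driver=$(nvidia-smi | driver_cuda_version)
    toolkit=$(nvcc --version | toolkit_cuda_version)
    echo "driver supports CUDA up to: ${driver:-unknown}"
    echo "toolkit on PATH (nvcc):     ${toolkit:-unknown}"
    if [ -n "$driver" ] && [ -n "$toolkit" ] && ! version_le "$toolkit" "$driver"; then
        echo "toolkit is newer than the driver supports"
    fi
else
    echo "nvidia-smi or nvcc not found on PATH"
fi
```

Running this once in the shell used for the build and once in the shell used for the regtest shows whether both see the same toolkit.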
Best
Matthias
From: cp2k at googlegroups.com <cp2k at googlegroups.com> on behalf of Mike Chen <mike.scchen at gmail.com>
Date: Sunday, 25 February 2024 at 14:03
To: cp2k <cp2k at googlegroups.com>
Subject: [CP2K:19967] Built cuda.ssmp version successfully but failed on regtest
Hi all,
I'm trying to build the cuda ssmp version of CP2K 2024.1 without MPI support (probably not necessary, since the machine has only one GPU card?).
The build completes without errors, but the regtest fails with:
[root@gpu01 cp2k-2024.1]# make ARCH=local_cuda VERSION=ssmp test
(......)
========= Python (ssmp) =========
/usr/bin/env python3 --version
Python 3.6.8
----------------------- External Modules ---------------------------------
DBCSR Version: 2.6.0 (2023-07-10)
---------------------------- Modules -------------------------------------
Currently Loaded Modulefiles:
1) gcc-9.5.0/gcc 2) cuda/11.7.1 3) tools/cmake-3.28.1 4) tools/git-2.43.0
*************************** Testing started ****************************
ERROR: cuInit failed with error: 999 /cluster/bld/cp2k-2024.1/src/offload/offload_library.c 57
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x7f8837d243ff in ???
#1 0x7f8837d24387 in ???
#2 0x7f8837d25a77 in ???
#3 0x30fe53b in offload_init
at /cluster/bld/cp2k-2024.1/src/offload/offload_library.c:58
#4 0xaf6e0b in __f77_interface_MOD_init_cp2k
at /cluster/bld/cp2k-2024.1/src/f77_interface.F:234
#5 0x455f88 in cp2k
at /cluster/bld/cp2k-2024.1/src/start/cp2k.F:284
#6 0x40ebdc in main
at /cluster/bld/cp2k-2024.1/src/start/cp2k.F:44
Could not parse feature flags.
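Error 999 from cuInit is CUDA_ERROR_UNKNOWN in the CUDA driver API, and it is raised at driver initialisation, before CP2K does any GPU work. It can therefore be narrowed down outside CP2K. A commonly reported cause is a missing nvidia_uvm kernel module or missing /dev/nvidia* device nodes; whether that applies on this machine is an assumption. A minimal sketch that checks both (module_loaded is a hypothetical helper name):

```shell
#!/bin/sh
# Sketch only: look for the kernel modules and device nodes that the CUDA
# driver API needs before cuInit can succeed.

# Succeed if module $1 is listed in /proc/modules-style text read from stdin
# (the module name is the first field of each line).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

if [ -r /proc/modules ]; then
    for m in nvidia nvidia_uvm; do
        if module_loaded "$m" < /proc/modules; then
            echo "module $m: loaded"
        else
            echo "module $m: NOT loaded"
        fi
    done
fi

# Device nodes normally created by the NVIDIA driver on a single-GPU machine.
for dev in /dev/nvidiactl /dev/nvidia0 /dev/nvidia-uvm; do
    if [ -e "$dev" ]; then
        echo "$dev: present"
    else
        echo "$dev: missing"
    fi
done
```

If everything is reported as loaded and present, the cause lies elsewhere and this check only rules out one possibility.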
The machine has an RTX A6000 card, and CUDA toolkit 12.3.2 was used:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A6000               Off | 00000000:3B:00.0 Off |                  Off |
| 30%   36C    P0              67W / 300W |      2MiB / 49140MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
The OS is CentOS 7.9, and I used a source-built GCC 9.5.0 as loadable module.
The toolchain was built with:
./install_cp2k_toolchain.sh --mpi-mode=no --with-cmake=system -j 16 --enable-cuda=yes --gpu-ver=A100
and then in the CP2K source folder:
make -j 16 ARCH=local_cuda VERSION=ssmp
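To see which CUDA libraries the resulting binary resolves at run time, a sketch like the following can help. It assumes the binary is linked dynamically and sits in the usual exe/<ARCH>/ location; CP2K_EXE and cuda_libs are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch only: list the CUDA-related shared libraries a binary resolves.

# Read ldd-style output on stdin and print "name path" for CUDA libraries.
# Unresolved libraries ("=> not found") are printed as NOT FOUND.
cuda_libs() {
    awk '$1 ~ /^lib(cuda|cudart|cublas|cufft|nvrtc)\.so/ && $2 == "=>" {
        print $1, ($3 == "not" ? "NOT FOUND" : $3)
    }'
}

# CP2K_EXE is a hypothetical override; the default path is an assumption
# based on ARCH=local_cuda and VERSION=ssmp.
exe=${CP2K_EXE:-exe/local_cuda/cp2k.ssmp}

if [ -x "$exe" ] && command -v ldd >/dev/null 2>&1; then
    ldd "$exe" | cuda_libs
else
    echo "binary not found or ldd unavailable: $exe"
fi
```

Paths that point into a different CUDA installation than the loaded module, or entries marked NOT FOUND, would indicate a mixed environment.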
For comparison, the CPU-only ssmp version (ARCH=local) builds fine and passes all the regtests.
Any suggestions on how to make the CUDA version work?
Mike
--
You received this message because you are subscribed to the Google Groups "cp2k" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cp2k+unsubscribe at googlegroups.com<mailto:cp2k+unsubscribe at googlegroups.com>.
To view this discussion on the web visit https://groups.google.com/d/msgid/cp2k/affc7786-0b86-4e51-8d40-22247d0d2ab9n%40googlegroups.com<https://groups.google.com/d/msgid/cp2k/affc7786-0b86-4e51-8d40-22247d0d2ab9n%40googlegroups.com?utm_medium=email&utm_source=footer>.