[CP2K-user] [CP2K:17965] Re: cudaErrorStubLibrary runtime error in CP2K
Ole Schütt
ole.schuett at cp2k.org
Mon Oct 31 08:55:13 UTC 2022
Hi Amit,
If you are getting cudaErrorStubLibrary, then the application failed to load
the real CUDA driver. Presumably your local admin can help with this. See also
the CUDA docs
<https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038>:
cudaErrorStubLibrary = 34
    This indicates that the CUDA driver that the application has loaded is a
    stub library. Applications that run with the stub rather than a real
    driver loaded will result in CUDA API returning this error.
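One way to narrow this down on the job node is to look at which libcuda the
dynamic loader resolves for the binary. The following is only a rough sketch
and not part of CP2K: check_libcuda is a hypothetical helper that reads ldd
output on standard input, and it assumes the stub copy sits in a directory
named "stubs", as in the usual CUDA toolkit layout.

```shell
#!/bin/sh
# Hypothetical helper (not part of CP2K or CUDA): classify where libcuda
# resolves, given `ldd` output on standard input.
# Assumption: stub libraries live in a directory named "stubs".
check_libcuda() {
    # Take the first line that mentions libcuda.so (libcudart.so does not match).
    line=$(grep 'libcuda\.so' | head -n 1) || true
    case "$line" in
        "")            echo "none" ;;     # libcuda is not a direct dependency
        *"not found"*) echo "missing" ;;  # loader cannot find any libcuda
        */stubs/*)     echo "stub" ;;     # resolves to the toolkit stub
        *)             echo "driver" ;;   # resolves outside a stubs directory
    esac
}
```

Inside a job on a compute node, something like `ldd ./cp2k.psmp | check_libcuda`
(the binary path is a placeholder) would print "stub" if the loader picks up
the stub, in which case LD_LIBRARY_PATH and the link line are worth inspecting.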
-Ole
On Sunday, October 30, 2022 at 9:59:11 PM UTC+1 Amit Gupta wrote:
> Hi,
> I compiled a CUDA-compatible version of CP2K, but when I submit my jobs, I
> get the following error:
> --------------------------------------------------------------------------
> ERROR: cudaErrorStubLibrary
> /scratch/ag9288/softwares/cp2k-2022.2/src/offload/offload_library.c 49
> forrtl: error (76): Abort trap signal
> Image              PC                Routine            Line     Source
> cp2k.psmp          0000000006B83CAB  Unknown            Unknown  Unknown
> libpthread-2.28.s  000014D73245EB20  Unknown            Unknown  Unknown
> libc-2.28.so       000014D712D4037F  gsignal            Unknown  Unknown
> libc-2.28.so       000014D712D2ADB5  abort              Unknown  Unknown
> cp2k.psmp          00000000031F8C29  offload_get_devic  49       offload_library.c
> cp2k.psmp          0000000001387F46  f77_interface_mp_  276      f77_interface.F
> cp2k.psmp          000000000042281D  MAIN__             284      cp2k.F
> cp2k.psmp          0000000006BED096  Unknown            Unknown  Unknown
> libc-2.28.so       000014D712D2C493  __libc_start_main  Unknown  Unknown
> cp2k.psmp          000000000042195E  Unknown            Unknown  Unknown
> --------------------------------------------------------------------------
>
> It looks like the executable is unable to find the card, while all the
> libraries are correctly loaded.
>
> Versions:
> Intel Fortran: 19.1
> CUDA: 11.3
> OpenMPI: 4.1.1
> CP2K: 2022.2
>
> Steps I took to build the binary:
> 1. Install the toolchain:
> ./install_cp2k_toolchain.sh --with-intel=system --enable-cuda \
>   --gpu-ver=V100 --with-openmpi=system --with-libxc=install \
>   --with-libint=install --with-fftw=install --with-mkl=system \
>   --with-libxsmm=install --with-elpa=install --with-ptscotch=no \
>   --with-superlu=no --with-pexsi=no --with-quip=no \
>   --with-plumed=no --with-sirius=no --with-gsl=no \
>   --with-libvdwxc=no --with-spglib=no --with-hdf5=no --with-spfft=no \
>   --with-spla=install --with-cosma=install --with-libvori=no \
>   --with-openblas=no -j4
>
> 2. Manually build and copy Libint, to work around the Fortran makefile bug:
> https://github.com/cp2k/libint-cp2k/releases
> cmake -DCMAKE_C_COMPILER=icc -DCMAKE_C_FLAGS="-qopenmp -O2" \
>   -DCMAKE_CXX_COMPILER=icpc -DCMAKE_CXX_FLAGS="-qopenmp -O2" \
>   -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_Fortran_FLAGS="-qopenmp -O2" \
>   -DREQUIRE_CXX_API=OFF -DENABLE_FORTRAN=ON ..
>
> 3. Apply required CUDA patch to CP2K and compile:
> https://github.com/cp2k/cp2k/commit/ee6c3aa.patch
> make -j 4 ARCH=local_cuda VERSION="ssmp sdbg psmp pdbg"
>
> PS: The job node is different from the compile node; could that be an
> issue? Also, while the compilation went fine, there was an error in
> symlinking the binaries.
>