[CP2K-user] [CP2K:17967] Re: cudaErrorStubLibrary runtime error in CP2K
Amit Gupta
amit.welcomes.u at gmail.com
Mon Oct 31 15:31:27 UTC 2022
I resolved the above error by symlinking libcudart as libcuda.so.1, so
I guess this issue is closed.
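
For anyone hitting the same error, here is a rough sketch of how one might check which libcuda the dynamic linker actually picks up. It is not part of CP2K. It assumes that the driver library is named libcuda.so.1, that a stub's cuInit returns 34 (the driver-API counterpart of cudaErrorStubLibrary), and that stub directories have a path component called "stubs". The helper names are made up for illustration.

```python
import ctypes
import os

CUDA_SUCCESS = 0
CUDA_ERROR_STUB_LIBRARY = 34  # assumed driver-API value matching cudaErrorStubLibrary


def stub_dirs(ld_library_path):
    """Return the LD_LIBRARY_PATH entries that have a 'stubs' path component."""
    entries = [e for e in ld_library_path.split(":") if e]
    return [e for e in entries if "stubs" in os.path.normpath(e).split(os.sep)]


def describe_cuinit(code):
    """Translate a cuInit return code into a short diagnosis."""
    if code == CUDA_SUCCESS:
        return "real driver loaded"
    if code == CUDA_ERROR_STUB_LIBRARY:
        return "stub libcuda loaded"
    return f"cuInit failed with code {code}"


def probe_driver(libname="libcuda.so.1"):
    """Load libcuda through the dynamic linker and call cuInit(0)."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError as exc:
        return f"could not load {libname}: {exc}"
    cu_init = getattr(lib, "cuInit", None)
    if cu_init is None:
        # Happens if the file found under this name is not a driver library.
        return f"{libname} has no cuInit symbol"
    cu_init.argtypes = [ctypes.c_uint]
    cu_init.restype = ctypes.c_int
    return describe_cuinit(cu_init(0))


if __name__ == "__main__":
    for d in stub_dirs(os.environ.get("LD_LIBRARY_PATH", "")):
        print("stub directory on LD_LIBRARY_PATH:", d)
    print(probe_driver())
```

Running this on the job node (not the compile node) should show whether a stubs directory is still on the search path at runtime.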
I then ran into a secondary issue, `cudaErrorInsufficientDriver`, so I guess
something is wrong with the CUDA libraries I loaded. I will probably
recompile with the latest driver loaded more carefully.
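
A minimal sketch of how the driver and runtime versions could be compared to narrow this down. It assumes the libraries can be loaded as libcuda.so.1 and libcudart.so (the runtime may carry a versioned name on a given system), and it uses the simple rule that the driver must report a version at least as new as the runtime. CUDA 11 minor-version compatibility can relax that rule, so treat the result as a hint only. The helper names are hypothetical.

```python
import ctypes


def decode(version):
    """Split CUDA's integer encoding (1000*major + 10*minor) into (major, minor)."""
    return version // 1000, (version % 1000) // 10


def driver_is_sufficient(driver_version, runtime_version):
    """Simple rule: the driver must be at least as new as the runtime."""
    return driver_version >= runtime_version


def query(libname, symbol):
    """Call an `int f(int *version)` style function; return None on any failure."""
    try:
        lib = ctypes.CDLL(libname)
        fn = getattr(lib, symbol)
    except (OSError, AttributeError):
        return None
    fn.argtypes = [ctypes.POINTER(ctypes.c_int)]
    fn.restype = ctypes.c_int
    value = ctypes.c_int(0)
    if fn(ctypes.byref(value)) != 0:
        return None
    return value.value


if __name__ == "__main__":
    # Library names are assumptions; adjust them to the loaded CUDA module.
    driver = query("libcuda.so.1", "cuDriverGetVersion")
    runtime = query("libcudart.so", "cudaRuntimeGetVersion")
    if driver is None or runtime is None:
        print("could not query driver and/or runtime version")
    else:
        print("driver supports CUDA %d.%d" % decode(driver))
        print("runtime is CUDA %d.%d" % decode(runtime))
        print("sufficient:", driver_is_sufficient(driver, runtime))
```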
Thank you
On Monday, October 31, 2022 at 10:14:26 AM UTC-5 Amit Gupta wrote:
> Could it be a compile-time issue on my end? I remember the toolchain
> requiring libcuda for linking (COSMA, I believe), and I found issues in
> several repositories on GitHub mentioning that libcuda resides in
> lib/stubs/libcuda.so. So I linked against it and provided its location at
> runtime in LD_LIBRARY_PATH.
>
> On Monday, October 31, 2022 at 3:55:13 AM UTC-5 Ole Schütt wrote:
>
>> Hi Amit,
>>
>> if you are getting cudaErrorStubLibrary then the app failed to load the
>> real driver. Presumably your local admin can help with this. See also the
>> CUDA docs
>> <https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038>
>> :
>>
>> cudaErrorStubLibrary = 34 This indicates that the CUDA driver that
>> the application has loaded is a stub library. Applications that run with
>> the stub rather than a real driver loaded will result in CUDA API returning
>> this error.
>>
>> -Ole
>>
>> On Sunday, October 30, 2022 at 9:59:11 PM UTC+1 Amit Gupta wrote:
>>
>>> Hi,
>>> I compiled a CUDA compatible version of CP2K, but when I submit my jobs,
>>> I get the following error:
>>>
>>> --------------------------------------------------------------------------
>>> ERROR: cudaErrorStubLibrary
>>> /scratch/ag9288/softwares/cp2k-2022.2/src/offload/offload_library.c 49
>>> forrtl: error (76): Abort trap signal
>>> Image              PC                Routine            Line     Source
>>> cp2k.psmp          0000000006B83CAB  Unknown            Unknown  Unknown
>>> libpthread-2.28.s  000014D73245EB20  Unknown            Unknown  Unknown
>>> libc-2.28.so       000014D712D4037F  gsignal            Unknown  Unknown
>>> libc-2.28.so       000014D712D2ADB5  abort              Unknown  Unknown
>>> cp2k.psmp          00000000031F8C29  offload_get_devic  49       offload_library.c
>>> cp2k.psmp          0000000001387F46  f77_interface_mp_  276      f77_interface.F
>>> cp2k.psmp          000000000042281D  MAIN__             284      cp2k.F
>>> cp2k.psmp          0000000006BED096  Unknown            Unknown  Unknown
>>> libc-2.28.so       000014D712D2C493  __libc_start_main  Unknown  Unknown
>>> cp2k.psmp          000000000042195E  Unknown            Unknown  Unknown
>>>
>>> --------------------------------------------------------------------------
>>>
>>> It looks like the executable is unable to find the card, while all the
>>> libraries are correctly loaded.
>>>
>>> Version:
>>> Intel fortran: 19.1
>>> cuda : 11.3
>>> openmpi 4.1.1
>>> CP2K: 2022.2
>>>
>>> Steps I took to build the binary:
>>> 1. Install the toolchain:
>>> ./install_cp2k_toolchain.sh --with-intel=system --enable-cuda \
>>>   --gpu-ver=V100 --with-openmpi=system --with-libxc=install \
>>>   --with-libint=install --with-fftw=install --with-mkl=system \
>>>   --with-libxsmm=install --with-elpa=install --with-ptscotch=no \
>>>   --with-superlu=no --with-pexsi=no --with-quip=no \
>>>   --with-plumed=no --with-sirius=no --with-gsl=no \
>>>   --with-libvdwxc=no --with-spglib=no --with-hdf5=no --with-spfft=no \
>>>   --with-spla=install --with-cosma=install --with-libvori=no \
>>>   --with-openblas=no -j4
>>>
>>> 2. Manually build and copy Libint, to work around the Fortran makefile bug:
>>> https://github.com/cp2k/libint-cp2k/releases
>>> cmake -DCMAKE_C_COMPILER=icc -DCMAKE_C_FLAGS="-qopenmp -O2" \
>>>   -DCMAKE_CXX_COMPILER=icpc -DCMAKE_CXX_FLAGS="-qopenmp -O2" \
>>>   -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_Fortran_FLAGS="-qopenmp -O2" \
>>>   -DREQUIRE_CXX_API=OFF -DENABLE_FORTRAN=ON ..
>>>
>>> 3. Apply required CUDA patch to CP2K and compile:
>>> https://github.com/cp2k/cp2k/commit/ee6c3aa.patch
>>> make -j 4 ARCH=local_cuda VERSION="ssmp sdbg psmp pdbg"
>>>
>>> PS: The job node is different from the compile node; could that be
>>> an issue? Also, while the compilation went fine, there was an error in
>>> symlinking the binaries.
>>>
>>