[CP2K-user] [CP2K:22092] Re: GPU vs CPU performance on consumer workstation

Frederick Stein f.stein at hzdr.de
Sat Feb 7 20:20:19 UTC 2026


Dear Rafael,
Consumer GPU cards such as yours will not provide any acceleration for 
CP2K, no matter the workload, because CP2K relies on double-precision 
floating-point numbers for accuracy, which are not well supported by 
consumer cards such as NVIDIA RTX.
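To give a rough sense of scale, here is a minimal Python sketch of how peak double-precision throughput follows from the single-precision peak and the FP64:FP32 rate ratio. The function name, the 8 TFLOPS figure, and both ratios (1:64 as an example for a consumer card, 1:2 as an example for a data-center card) are illustrative assumptions, not measurements of your RTX 3050; please check the vendor spec sheet for the real values.

```python
def fp64_peak_tflops(fp32_peak_tflops: float, fp64_to_fp32_ratio: float) -> float:
    """Estimate peak FP64 throughput from the FP32 peak and the FP64:FP32 rate ratio.

    Hypothetical helper for illustration only; it is not part of CP2K.
    """
    if fp32_peak_tflops < 0:
        raise ValueError("fp32_peak_tflops must be non-negative")
    if not (0 < fp64_to_fp32_ratio <= 1):
        raise ValueError("fp64_to_fp32_ratio must be in (0, 1]")
    return fp32_peak_tflops * fp64_to_fp32_ratio


# Placeholder inputs (assumptions, not vendor data for any specific card).
ASSUMED_FP32_PEAK_TFLOPS = 8.0
ASSUMED_CONSUMER_RATIO = 1 / 64    # example ratio for a consumer card
ASSUMED_DATACENTER_RATIO = 1 / 2   # example ratio for a data-center card

consumer_fp64 = fp64_peak_tflops(ASSUMED_FP32_PEAK_TFLOPS, ASSUMED_CONSUMER_RATIO)
datacenter_fp64 = fp64_peak_tflops(ASSUMED_FP32_PEAK_TFLOPS, ASSUMED_DATACENTER_RATIO)

print(f"consumer-style FP64 peak:    {consumer_fp64} TFLOPS")
print(f"datacenter-style FP64 peak:  {datacenter_fp64} TFLOPS")
print(f"gap: {datacenter_fp64 / consumer_fp64:.0f}x")
```

With these placeholder inputs the two cards differ by a factor of 32 in peak double-precision throughput despite an identical single-precision peak, so a modest multi-core CPU can plausibly keep up with or beat a consumer GPU on FP64-bound work.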
GPU performance has improved since the 2018 thread (grid library, PDGEMM in 
RPA, DGEMM in MP2, ...), so some comments in the linked issue are no longer 
correct.
I can't tell how much memory (CPU or GPU) you need for this test.
If you are interested in using the latest version of CP2K, be aware that you 
need to switch to the CMake-based (or Spack or EasyBuild) build system.
Best,
Frederick

rafa... at gmail.com wrote on Saturday, 7 February 2026 at 19:31:35 UTC+1:

> Hello, I'm testing CP2K performance on an older workstation PC and I'm 
> finding that the CPU version of CP2K 2025.2 is faster than the GPU 
> version. My understanding is that many consumer GPUs do not have great 
> double precision performance, but I can't tell if the slower GPU timing is 
> normal for my system or if there is anything I can improve? For example, a 
> CPU-only H2O-32.inp benchmark is twice as fast as a GPU run. The timings 
> show that "grid_collocate_task_list" and "grid_integrate_task_list" are the 
> most time consuming steps.
>
> I came across a similar thread from 2018 issue73 
> <https://github.com/cp2k/cp2k/issues/73>, but I wonder how those comments 
> hold up for the 2025.2 CP2K version? Should I expect any performance gains 
> from a GPU on small systems (<250 atoms)? I attached the ARCH files I used 
> to build the CPU and GPU versions of CP2K along with the output files from 
> the H2O-32.inp benchmarks.
>
> My system has: hyperthreaded 4-core AMD Ryzen 5 2400G CPU, NVIDIA RTX 3050 
> 6 GB GPU, and 16 GB RAM.
>
> For CPU runs I use 4 MPI ranks with 2 OMP threads to get full CPU 
> utilization. For GPU runs I use 1 MPI rank with 2 OMP threads, increasing 
> OMP_NUM_THREADS to 4, 6, 8 does not show increased CPU utilization during a 
> GPU run.
>
> (I am unable to run H2O-64.inp on GPU because of a CUDA OOM error: ERROR: 
> "cudaErrorLaunchOutOfResources" at 
> /home/raf/cp2k-home/cp2k-colordiffusion/cp2k-2025.2/src/grid/gpu/
> grid_gpu_collocate.cu:387 )
>
> Thanks,
> Rafal
>
