I'm not sure I understand your request. ACC is libcusmm (not libsmm), i.e. the CUDA version, which calls cuBLAS for large kernels. For small kernels (80x80 matrix size), libcusmm outperforms cuBLAS. Please elaborate on what you want to achieve.<div><br></div><div>Best regards,</div><div><br></div><div>Alfio<br><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Wednesday, 24 November 2021 at 09:05:42 UTC+1 qiy...@gmail.com wrote:<br/></div><blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div>Hi, CP2K developers and users,</div> I have successfully compiled the GPU version of CP2K. The DBCSR STATISTICS show that all of the matmuls are computed through ACC, which stands for libsmm, but I want to test the performance of cuBLAS. Can anyone help?<div><br></div><div>Best Wishes</div><div>Qiyu8</div></blockquote></div>