New Release of LIBXSMM
Hans Pabst
hf.p... at gmail.com
Wed Oct 26 10:40:29 UTC 2016
*LIBXSMM 1.5.1 has been released*. This is mainly a bug-fix release,
made urgent by a fix to the Fortran interface (SMM functionality). The
issue affects CP2K in particular (Fortran), where requesting a JIT
kernel never returned a suitable PROCEDURE POINTER (it was always NULL):
https://github.com/hfp/libxsmm/releases/tag/1.5.1
<https://github.com/hfp/libxsmm/releases/tag/1.5>.
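
To illustrate why a dispatch that always yields NULL matters, here is a
minimal sketch in C of the usual caller-side pattern: request a specialized
kernel, check the returned pointer, and fall back to a generic routine if
nothing suitable came back. All names below (smm_kernel, dispatch,
kernel_2x2x2, gemm_fallback, multiply) are hypothetical and are not the
LIBXSMM API; the dispatcher is a stand-in for a real JIT request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel type: C += A * B for column-major double matrices. */
typedef void (*smm_kernel)(const double* a, const double* b, double* c);

/* Hypothetical specialized kernel for the fixed shape m = n = k = 2. */
static void kernel_2x2x2(const double* a, const double* b, double* c) {
  for (int j = 0; j < 2; ++j) {
    for (int i = 0; i < 2; ++i) {
      for (int p = 0; p < 2; ++p) {
        c[j * 2 + i] += a[p * 2 + i] * b[j * 2 + p];
      }
    }
  }
}

/* Hypothetical dispatcher (stand-in for a JIT request):
 * returns NULL when no specialized kernel is available. */
static smm_kernel dispatch(int m, int n, int k) {
  return (2 == m && 2 == n && 2 == k) ? kernel_2x2x2 : NULL;
}

/* Generic fallback: C += A * B with A (m x k), B (k x n), C (m x n). */
static void gemm_fallback(int m, int n, int k,
                          const double* a, const double* b, double* c) {
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < m; ++i) {
      for (int p = 0; p < k; ++p) {
        c[j * m + i] += a[p * m + i] * b[j * k + p];
      }
    }
  }
}

/* Returns 1 if a specialized kernel was used, 0 if the fallback ran. */
static int multiply(int m, int n, int k,
                    const double* a, const double* b, double* c) {
  assert(NULL != a && NULL != b && NULL != c);
  const smm_kernel kernel = dispatch(m, n, k);
  if (NULL != kernel) { /* a usable kernel was returned */
    kernel(a, b, c);
    return 1;
  }
  gemm_fallback(m, n, k, a, b, c); /* nothing suitable: generic path */
  return 0;
}
```

With a dispatcher that only ever returns NULL, a caller written this way
would always take the generic path, and a caller that skips the check
would call through a null pointer.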
On Wednesday, October 5, 2016 at 15:01:25 UTC+2, Hans Pabst wrote:
>
> *LIBXSMM 1.5 has been released*. You can read more about what's *INTRODUCED*
> (also see below), the *CHANGES* in general, and what has been *FIXED*:
> https://github.com/hfp/libxsmm/releases/tag/1.5.
>
> The library was carefully validated, and the SMM core functionality received
> fixes for issues which are not exposed by CP2K (but are present in previous
> releases of LIBXSMM). The validation was against a variety of applications;
> most relevant here are CP2K's regression tests. These tests have been made
> even stronger by using LIBXSMM's linker wrapper
> <https://github.com/hfp/libxsmm#call-wrapper> to pass *all* GEMM calls
> through the LIBXSMM library, including calls that are made by other
> libraries such as LAPACK. Most notable for supercomputing centers might
> be the support for the CRAY Compiling Environment (CCE), and also the
> support for PGI's compiler. Please note that JIT code generation under
> Microsoft Windows is still pending, and due to missing support for the
> calling convention this applies equally to Cygwin.
>
> *INTRODUCED*
>
> - New DNN API, sample code, and benchmarks (Googlenetv1, DeepBench,
> and Overfeat)
> - Enabled tiled GEMM support in static/dynamic wrapper; MT support via
> libxsmmext
> - More format variations of sparse matrix multiplication (dense/sparse
> etc.)
> - Sample code showing sparse matrix multiplication (PyFR examples
> collection)
> - Published synchronization layer (atomics, and simple/bare
> OS-thread/lock abstraction)
> - Introduced mini-API for optimized barrier implementation (general
> multicore support)
> - Introduced API for memory allocation (malloc interface); mostly
> exposed from internal API
> - Besides Intel VTune, Linux perf and jitdump are now supported
> (Thank you Maciej D.!)
> - SPECFEM sample: received nicely written example contribution (Thank
> you Daniel P.!)
> - OSX (incl. "El Capitan") now supports Intel Compiler, Apple/Clang,
> and GNU GCC
> - CRAY's Compiling Environment (CCE) is now supported
> - PGI compiler is now supported
>
>