New Release of LIBXSMM

Hans Pabst hf.p... at gmail.com
Wed Oct 26 10:40:29 UTC 2016


*LIBXSMM 1.5.1 has been released*, which is mainly a bug-fix release 
gaining its urgency from a fixed Fortran interface (SMM functionality). The 
issue applies to CP2K in particular (Fortran), where requesting a JIT 
kernel never returned a suitable PROCEDURE POINTER (always NULL):
https://github.com/hfp/libxsmm/releases/tag/1.5.1.
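
For illustration only: code that requests a specialized kernel generally has to check the returned pointer before calling it, and needs some other path when the pointer is NULL. The following is a minimal sketch of that pattern in C. All names in it (smm_kernel, dispatch_stub, gemm_fallback, smm_multiply) are hypothetical and are not part of LIBXSMM's API; the stub always returns NULL to mimic the case described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel type: computes C += A * B for fixed, pre-agreed shapes. */
typedef void (*smm_kernel)(const double* a, const double* b, double* c);

/* Reference fallback: C(m x n) += A(m x k) * B(k x n), all column-major. */
static void gemm_fallback(int m, int n, int k,
                          const double* a, const double* b, double* c)
{
  assert(NULL != a && NULL != b && NULL != c);
  for (int j = 0; j < n; ++j) {
    for (int p = 0; p < k; ++p) {
      for (int i = 0; i < m; ++i) {
        c[j * m + i] += a[p * m + i] * b[j * k + p];
      }
    }
  }
}

/* Hypothetical stand-in for a JIT dispatch call (NOT the LIBXSMM API).
 * It always returns NULL, i.e., "no suitable kernel available". */
static smm_kernel dispatch_stub(int m, int n, int k)
{
  (void)m; (void)n; (void)k;
  return NULL;
}

/* Use the dispatched kernel if there is one, otherwise the fallback.
 * Returns 1 if a dispatched kernel was used, and 0 for the fallback. */
static int smm_multiply(int m, int n, int k,
                        const double* a, const double* b, double* c)
{
  const smm_kernel kernel = dispatch_stub(m, n, k);
  if (NULL != kernel) {
    kernel(a, b, c);
    return 1;
  }
  gemm_fallback(m, n, k, a, b, c);
  return 0;
}
```

With this stub, smm_multiply always takes the fallback path and returns 0, while the result in C is still computed by the reference loop.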


On Wednesday, October 5, 2016 at 15:01:25 UTC+2, Hans Pabst wrote:
>
> *LIBXSMM 1.5 has been released*. You can read more about what's *INTRODUCED* 
> (also see below), the *CHANGES* in general, and what has been *FIXED*: 
> https://github.com/hfp/libxsmm/releases/tag/1.5.
>
> The library was carefully validated, and the SMM core functionality received 
> fixes for issues which are not exposed by CP2K (but are present in previous 
> releases of LIBXSMM). The validation was against a variety of applications; 
> most relevant here are CP2K's regression tests. These tests were made even 
> stronger by using LIBXSMM's linker wrapper 
> <https://github.com/hfp/libxsmm#call-wrapper> to pass *all* GEMM calls 
> through the LIBXSMM library, including calls that are made by other 
> libraries such as LAPACK. Most notable for supercomputing centers might 
> be the support for the CRAY Compiling Environment (CCE), as well as the 
> support for PGI's compiler. Please note that JIT code generation under 
> Microsoft Windows is still pending, and due to missing support for the 
> calling convention this applies equally to Cygwin.
>
> *INTRODUCED*
>
>    - New DNN API, sample code, and benchmarks (Googlenetv1, DeepBench, 
>    and Overfeat)
>    - Enabled tiled GEMM support in static/dynamic wrapper; MT support via 
>    libxsmmext
>    - More format variations of sparse matrix multiplication (dense/sparse 
>    etc.)
>    - Sample code showing sparse matrix multiplication (PyFR examples 
>    collection)
>    - Published synchronization layer (atomics, and simple/bare 
>    OS-thread/lock abstraction)
>    - Introduced mini-API for optimized barrier implementation (general 
>    multicore support)
>    - Introduced API for memory allocation (malloc interface); mostly 
>    exposed from internal API
>    - Besides Intel VTune, Linux perf and jitdump are now supported 
>    (Thank you Maciej D.!)
>    - SPECFEM sample: received nicely written example contribution (Thank 
>    you Daniel P.!)
>    - OSX (incl. "El Capitan") now supports Intel Compiler, Apple/Clang, 
>    and GNU GCC
>    - CRAY's Compiling Environment (CCE) is now supported
>    - PGI compiler is now supported
>
>