[CP2K:199] Re: CRAY-XT3 - PGI and PathScale
teodor... at gmail.com
Tue Jul 17 06:06:09 UTC 2007
Very good suggestions.. Honestly when I compile with PGI I only use -
I didn't change the compiler flags simply because I was not the one who
created the original
arch file... and these kinds of changes can be done locally by
I forgot to mention that in the previous e-mail..
Great! At least we now have a fixed reference here in this blog ;-))
On 17 Jul 2007, at 06:41, Axel wrote:
> hi teo,
> a few remarks to those arch files.
> - i would replace: -O3 -Mscalarsse -Mvect=sse
> with: -O2 -Munroll
> in my experience, for almost all codes that use
> a lot of cosine/sine/exponential/power/... SSE
> is counterproductive, as the time to switch between
> the regular floating point unit and the SSE unit does
> not always offset the gain of using SSE, particularly
> in double precision. SSE is most useful for plain linear
> algebra, but for that we have BLAS and (Sca)LAPACK...
> loop unrolling on the other hand helps a lot and in many
> cases -O3 optimizes too aggressively. this is more visible
> on intel cpus compared to amd cpus but still...
> - how about using pgf90 instead of ftn for the serial compile?
> this way you'd get a serial executable that can actually
> run on the frontend (and does not segfault).
> p.s.: i know the flags you use are the ones suggested by cray,
> but all the examples they show to illustrate those flags, fall under
> the plain linear algebra case...
> On Jul 16, 3:39 pm, Teodoro Laino <teodor... at gmail.com> wrote:
>> Today I updated the list of files that require the O0 optimization
>> level, due to bugs in the Portland compiler, in order
>> to run cp2k on the full suite of regtests.
>> The new arch files (CRAY-XT3.popt and CRAY-XT3.sopt) contain the full
>> list of those subroutines..
>> The produced executables (both with PGI and PathScale) can run
>> the full set of regtests without crashing.
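
For reference, the two points discussed in this thread (Axel's suggested flags and the per-file O0 fallback) might look roughly like this in a PGI arch file. This is only a sketch: it assumes the usual arch-file variable names (FC, LD, FCFLAGS, FCFLAGS2), "some_module" is a placeholder and not an entry from the real list, and the actual CRAY-XT3.popt and CRAY-XT3.sopt files contain further settings not shown here.

```make
# Hypothetical excerpt of a CP2K arch file for PGI (illustrative values only).

# ftn is the Cray compiler wrapper. For a serial build that should run on
# the frontend, Axel suggests pgf90 instead.
FC       = ftn
LD       = ftn

# Axel's suggestion, replacing: -O3 -Mscalarsse -Mvect=sse
FCFLAGS  = -O2 -Munroll

# Fallback flags for files that trigger compiler bugs at higher levels.
FCFLAGS2 = -O0

# One explicit rule per affected file. "some_module" is a placeholder name.
# The recipe line must start with a tab character.
some_module.o: some_module.F
	$(FC) -c $(FCFLAGS2) $<
```

Keeping the fallback in a separate variable means the list of affected files can shrink as compiler bugs get fixed, without touching the global flags.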