flags for reducing memory consumption for quickstep and quickstep QM/MM

Axel akoh... at gmail.com
Tue Sep 11 16:24:53 UTC 2007

hi rachel and everybody else who answered.

thanks a lot. it is great to see that we finally seem
to be getting some sort of 'community' started here.
please keep it up. it is much appreciated.

sadly, i was already using most of the tricks mentioned.

a couple more remarks:

On Sep 11, 8:02 am, toot <rachel... at rub.de> wrote:
> Architecture: x86-64

rachel, please provide a full description of the platform,
i.e. hardware, compiler(!), parallel interconnect (hardware and software),
and libraries.

i (and teo and many others here) already learned the hard way that
just the type of cpu is not enough to describe a platform, and that
with cp2k, due to its use of many 'newer' (new as in introduced less
than 30 years ago... :) ) fortran features, you are always running
the risk of being fooled by a bug in the compiler posing as a bug
in the code. it almost seems as if writing a fully correct _and_
well performing fortran 90/95 compiler is an impossible task, and
that compiler vendors test mostly against legacy (fortran 77 and
older) codes.

> Just compared virtual memory size, resident set size and used swap
> ("top") in each of the runs

i can confirm this on x86_64 using intel 10, OpenMPI, and MKL.
i tested with FIST. i noticed, however, that there are two entries
for ewald: one in /FORCE_EVAL/DFT/POISSON/EWALD and one in
/FORCE_EVAL/MM/POISSON/EWALD, and both claim to be applicable only
to classical atoms. it would be nice if somebody could clarify this.

out of the three EWALD options, SPME (which i have been using already)
seems to be the least memory hungry, followed by plain EWALD and PME.
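for the record, this is roughly the section i am talking about (the
MM variant, which is the one i used with FIST; the ALPHA and GMAX
values below are just placeholders, not recommendations -- check the
input reference for your system):

```
&FORCE_EVAL
  &MM
    &POISSON
      &EWALD
        EWALD_TYPE SPME   ! least memory hungry in my tests; EWALD and PME use more
        ALPHA 0.35        ! placeholder value
        GMAX 64           ! placeholder value
      &END EWALD
    &END POISSON
  &END MM
&END FORCE_EVAL
```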

what strikes me as odd is that the communication summary shows
the exact same number of calls to the MP_xxx subroutines in both
runs. i would have expected that in the distributed case there is a
different communication pattern than with replicated. could it be that
the flag is not correctly handed down? it appears in the restart file,
so i assume it is parsed ok.
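for completeness, this is where i am setting the flag (syntax from
memory, and the CUTOFF value is just a placeholder, so please check
against the input reference):

```
&FORCE_EVAL
  &DFT
    &MGRID
      CUTOFF 280             ! placeholder value
      RS_GRID DISTRIBUTED    ! default is replicated realspace grids
    &END MGRID
  &END DFT
&END FORCE_EVAL
```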


> rachel
> On 11 Sep., 12:21, Teodoro Laino <teodor... at gmail.com> wrote:
> > Just curiosity..
> > How did you check the memory in parallel runs? on which architecture?
> > teo
> > On 11 Sep 2007, at 11:48, toot wrote:
> > > Toot toot everybody,
> > > i tried RS_GRID DISTRIBUTED for all the grids i've got and it doesn't
> > > make a blind bit of difference (to either memory or energies)!
> > > cheers
> > > Rachel
> > > On Sep 11, 11:27 am, tkuehne <tku... at gmail.com> wrote:
> > >> Hi Axel
> > >> Have you already tried RS_GRID DISTRIBUTED?
> > >> As far as I remember it once reproduced exactly the same numbers, at
> > >> least using GPW.
> > >> Best regards,
> > >> Thomas
> > >> On Sep 10, 7:32 pm, Axel <akoh... at gmail.com> wrote:
> > >>> hi all,
> > >>> are there any recommendations on how to reduce the memory
> > >>> consumption of cp2k calculations, particularly using QS/GPW
> > >>> and QM/MM with QS/GPW while maintaining a given level of accuracy?
> > >>> i'm currently having a couple of inputs that would _almost_ fit
> > >>> into a machine with multi-core processors using all cores.
> > >>> right now i have to run using half the cores
> > >>> and thus waste a lot of cpu time allocation...
> > >>> cheers,
> > >>>    axel.

More information about the CP2K-user mailing list