flags for reducing memory consumption for quickstep and quickstep QM/MM

toot rachel... at rub.de
Wed Sep 12 10:54:23 UTC 2007


sorry axel, i'm new to this "forum" game ;-D

here goes:

Opteron 275 cpus, 4G RAM, Tyan board (S2892);
Mellanox MHES18-XSC InfiniBand card, Flextronics InfiniBand switch,
OFED-1.1 software;
intel fortran 9.0; mpirun 1.1.2
my job ran on 12 nodes, but each using only 1 cpu, because otherwise it
swapped like mad and crashed after a while. that was rather annoying,
for me anyway; the computer probably didn't care.

I tried RS_GRID DISTRIBUTED on all 3 of my grids; none of them seemed to
make a difference (as far as i can tell, anyway...)
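
for reference, as far as i understand the RS_GRID flag goes into &MGRID,
once per grid level, roughly like this (the nesting below is from memory,
and CUTOFF/NGRIDS are just example values, not my actual input):

    &DFT
      &MGRID
        CUTOFF 280            ! example value only
        NGRIDS 3
        RS_GRID DISTRIBUTED   ! one entry per grid level
        RS_GRID DISTRIBUTED
        RS_GRID DISTRIBUTED
      &END MGRID
    &END DFT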



On 11 Sep., 18:24, Axel <akoh... at gmail.com> wrote:
> hi rachel and everybody else who answered.
>
> thanks a lot. it is great to see that we finally seem to be getting
> some sort of 'community' started here.
> please keep it up. it is much appreciated.
>
> sadly, i was already using most of the tricks mentioned.
> :-(
>
> a couple more remarks:
>
> On Sep 11, 8:02 am, toot <rachel... at rub.de> wrote:
>
> > Architecture: x86-64
>
> rachel, please provide a full description of the platform,
> i.e. hardware, compiler(!), parallel interconnect (hardware and
> software), and libraries.
>
> i (and teo and many others here) have already learned the hard way
> that just the type of cpu is not enough to describe a platform, and
> that with cp2k, due to its use of many 'newer' (new as in introduced
> less than 30 years ago... :) ) fortran features, you are always
> running the risk of being fooled by a compiler bug posing as a bug
> in the code. it almost seems as if writing a fully correct _and_
> well-performing fortran 90/95 compiler is an impossible task, and
> that compiler vendors test mostly against legacy (fortran 77 and
> older) codes.
>
> > Just compared virtual memory size, resident set size and used swap
> > ("top") in each of the runs
>
> i can confirm this on x86_64 using intel 10, OpenMPI, and MKL.
> i tested with FIST. i noticed, however, that there are two entries
> for ewald, one in /FORCE_EVAL/DFT/POISSON/EWALD and one in
> /FORCE_EVAL/MM/POISSON/EWALD, and both claim to be applicable only
> to classical atoms. it would be nice if somebody could clarify this.
>
> out of the three EWALD options, SPME (which i have been using
> already) seems to be the least memory hungry, followed by plain
> EWALD and PME.
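>
> for reference, by 'SPME' i mean the EWALD_TYPE keyword in the &EWALD
> section (in my FIST test that is the one under /FORCE_EVAL/MM/POISSON);
> a minimal sketch, with ALPHA/GMAX as placeholder values only:
>
>     &POISSON
>       &EWALD
>         EWALD_TYPE SPME
>         ALPHA 0.35    ! placeholder, tune for your system
>         GMAX 64       ! placeholder, tune for your cell
>         O_SPLINE 6
>       &END EWALD
>     &END POISSON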
>
> what strikes me as odd is that the communication summary shows the
> exact same number of calls to the MP_xxx subroutines in both cases.
> i would have expected a (slightly?) different communication pattern
> in the distributed case than with the replicated one. could it be
> that the flag is not correctly handed down? it appears in the
> restart files, so i assume it is parsed ok.
>
> cheers,
>    axel.
>
>
>
> > rachel
>
> > On 11 Sep., 12:21, Teodoro Laino <teodor... at gmail.com> wrote:
>
> > > Just curiosity..
> > > How did you check the memory in parallel runs? on which architecture?
>
> > > teo
>
> > > On 11 Sep 2007, at 11:48, toot wrote:
>
> > > > Toot toot everybody,
>
> > > > i tried RS_GRID DISTRIBUTED for all the grids i've got and it
> > > > doesn't make a blind bit of difference (to either memory or
> > > > energies)!
>
> > > > cheers
>
> > > > Rachel
>
> > > > On Sep 11, 11:27 am, tkuehne <tku... at gmail.com> wrote:
> > > >> Hi Axel
>
> > > >> Have you already tried RS_GRID DISTRIBUTED?
> > > >> As far as I remember, it once reproduced exactly the same
> > > >> numbers, at least using GPW.
>
> > > >> Best regards,
> > > >> Thomas
>
> > > >> On Sep 10, 7:32 pm, Axel <akoh... at gmail.com> wrote:
>
> > > >>> hi all,
>
> > > >>> are there any recommendations on how to reduce the memory
> > > >>> consumption of cp2k calculations, particularly using QS/GPW
> > > >>> and QM/MM with QS/GPW, while maintaining a given level of
> > > >>> accuracy?
>
> > > >>> i currently have a couple of inputs that would _almost_ fit
> > > >>> into a machine with multi-core processors using all cores.
> > > >>> right now i have to run using only half the cores and thus
> > > >>> waste a lot of cpu time allocation...
>
> > > >>> cheers,
> > > >>>    axel.



