[CP2K:777] Re: parallel distribution of data

Nichols A. Romero naro... at gmail.com
Tue Mar 11 14:03:53 UTC 2008


Axel,

Is this related to OPENMPI gobbling up tons of memory?


On 3/10/08, Axel <akoh... at gmail.com> wrote:
>
>
>
>
> On Mar 10, 5:49 pm, "Nichols A. Romero" <naro... at gmail.com> wrote:
> > Teo,
> >
> > I was just able to reproduce this on another machine:
> > http://www.mhpcc.hpc.mil/doc/jaws.html
> >
> > I just ran it on 256 processors. Compiled it with ifort 9.1.045 and
> > mvapich 1.2.7.
> > I attach the arch file.
>
> nick,
>
> here's another caveat which has most likely nothing to do
> with the immediate error that you are seeing, but may bite
> you later.
>
> when running on large infiniband clusters, you may have to limit
> the number of processes per node. for the way openfabrics seems
> to work (at least at the moment) you need _physical_ memory
> as "backing store" for each RDMA connection, i.e. for each MPI
> task you'll lose some physical memory regardless of the memory
> requirements of your jobs. i've seen this on the NCSA 'abe' cluster
> where i ran out of memory for rather small jobs despite having 1GB/core
> simply by increasing the requested number of cpus. also, you
> may get better performance by using half the cpu cores requested.
> i had to go down to a quarter (abe is dual quad-core, though) for
> really big jobs. :-(
>
> cheers,
>    axel.
>
>
>
>
> >
> > Here is the error that I am seeing.
> >
> > Out of memory ...
> >
> >  *
> >  *** ERROR in get_my_tasks  ***
> >  *
> >
> >  *** The memory allocation for the data object <send_buf_r> failed. The  ***
> >  *** requested memory size is 1931215 Kbytes                             ***
> >
>
> [...]
>
>
> > --
> > Nichols A. Romero, Ph.D.
> > DoD User Productivity Enhancement and Technology Transfer (PET) Group
> > High Performance Technologies, Inc.
> > Reston, VA
> > 443-567-8328 (C)
> > 410-278-2692 (O)
> >
> >  Linux-x86-64-intel.popt
> >
>
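
To put rough numbers on Axel's point about RDMA connections needing physical memory as backing store, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name, the all-to-all connection model, and the 1 MiB-per-connection figure are hypothetical and do not come from OpenFabrics, MVAPICH, or CP2K. The 8 GiB node (dual quad-core, 1 GB/core) and the 256-task job size are taken from the thread.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def usable_memory_per_task(node_mem_bytes, tasks_per_node, nodes, bytes_per_connection):
    """Estimate the physical memory left for each MPI task on one node.

    ASSUMPTIONS (hypothetical model, not measured behaviour):
      * every task keeps one RDMA connection to every other task in the job
      * each connection pins a fixed amount of physical memory
    """
    if tasks_per_node < 1 or nodes < 1:
        raise ValueError("tasks_per_node and nodes must both be >= 1")
    total_tasks = nodes * tasks_per_node
    # Memory pinned by one task for its connections to all other tasks.
    pinned_per_task = (total_tasks - 1) * bytes_per_connection
    # All tasks on the node pin memory out of the same physical pool.
    pinned_per_node = tasks_per_node * pinned_per_task
    return (node_mem_bytes - pinned_per_node) // tasks_per_node


if __name__ == "__main__":
    node_mem = 8 * GIB   # dual quad-core node with 1 GB per core
    per_conn = 1 * MIB   # HYPOTHETICAL backing store per connection
    nodes = 32           # 32 nodes x 8 tasks = 256 tasks when fully packed
    for ppn in (8, 4, 2):
        left = usable_memory_per_task(node_mem, ppn, nodes, per_conn)
        print(f"{ppn} tasks/node, {nodes * ppn} tasks total: "
              f"{left // MIB} MiB usable per task")
```

In this model, halving the number of tasks per node helps twice: each task gets a larger share of the node, and each task also pins less memory because the job has fewer peers to connect to. The real per-connection cost depends on the MPI library and its settings, so the numbers are only meant to show the trend.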


-- 
Nichols A. Romero, Ph.D.
DoD User Productivity Enhancement and Technology Transfer (PET) Group
High Performance Technologies, Inc.
Reston, VA
443-567-8328 (C)
410-278-2692 (O)