Request suggestions for cray xe6

Axel akoh... at gmail.com
Thu Apr 26 16:36:53 UTC 2012



On Thursday, April 26, 2012 5:30:23 AM UTC-4, c.pignedoli wrote:
>
> Dear all,
> I have the impression that I am missing many recent improvements to the
> code, as I was not able to follow them.
>
> I usually run standard DFT calculations for bulk/surface systems with
> 700-1500 atoms (10 to 18 electrons each) on a Cray XE6 (2 x 16-core
> 64-bit AMD Interlagos CPUs, 32 GB per compute node).
>
> Do you have suggestions for parameters that would allow optimal use of
> this architecture? (One problem I have is the limited amount of memory
> per core: 1 GB.)
>

you should compile with OpenMP support (which also means
that you have to ditch the crappy default PGI programming environment
and use gcc/gfortran instead).

for plain quickstep calculations there should be a
significant performance increase when running with
4 MPI tasks per node and 8 threads per MPI task
(4 x 8 = 32, so all cores of a node are still in use).

due to the massively reduced need for data
replication, you should not run out of memory.
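
as a rough illustration of the memory argument, here is a small sketch in python. the 32 GB and 32 cores per node are taken from the question; the function name and the assumption that a node's memory is split evenly between its MPI tasks are only for illustration, and the operating system and libraries will take their share in practice.

```python
NODE_MEMORY_GB = 32   # per compute node, from the question
CORES_PER_NODE = 32   # 2 x 16-core interlagos CPUs, from the question


def memory_per_task_gb(node_memory_gb, tasks_per_node):
    """Memory available to each MPI task, assuming an even split (idealized)."""
    return node_memory_gb / tasks_per_node


# (MPI tasks per node, OpenMP threads per task) for the layouts discussed here
layouts = {
    "plain MPI, all cores": (32, 1),
    "plain MPI, half the cores": (16, 1),
    "hybrid MPI/OpenMP": (4, 8),
}

for name, (tasks, threads) in layouts.items():
    # a layout must not ask for more cores than the node has
    assert tasks * threads <= CORES_PER_NODE
    mem = memory_per_task_gb(NODE_MEMORY_GB, tasks)
    print(f"{name}: {tasks} tasks x {threads} threads, {mem:.1f} GB per task")
```

with 4 tasks per node, each task can address roughly 8 GB instead of 1 GB, and data that is replicated per task exists 4 times per node rather than 32 times.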

since interlagos CPUs do some kind of improved
hyperthreading, where the two cores of a "bulldozer"
subunit have to share a floating point unit, i would
expect that you get a large improvement already by
using only half the cores when running plain MPI.
make sure that you use the proper "depth"
flag (-d) to aprun, however, since you need
exactly one MPI task per "bulldozer" subunit.

i would talk to the user support specialists
of the machine, to make sure you get it right.
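
to make the two placements above concrete, here is a hedged python sketch that only builds the aprun command lines. it assumes that the flag meant above is aprun's -d (cores reserved per MPI task), used together with -n (total tasks) and -N (tasks per node), and that the two cores of a "bulldozer" subunit are numbered consecutively. the node count and the executable names are placeholders, and the helper function is hypothetical. the hybrid run additionally needs OMP_NUM_THREADS=8 set in the job environment.

```python
CORES_PER_NODE = 32  # 2 x 16-core interlagos CPUs, from the question


def aprun_command(nodes, tasks_per_node, cores_per_task, executable):
    """Build an aprun command line (hypothetical helper, for illustration only)."""
    if tasks_per_node * cores_per_task > CORES_PER_NODE:
        raise ValueError("layout needs more cores than a node has")
    total_tasks = nodes * tasks_per_node
    return (
        f"aprun -n {total_tasks} -N {tasks_per_node} "
        f"-d {cores_per_task} {executable}"
    )


# hybrid: 4 MPI tasks per node, 8 cores (OpenMP threads) per task
print(aprun_command(4, 4, 8, "./cp2k.psmp"))

# plain MPI on half the cores: 16 tasks per node, 2 cores reserved per task,
# so that each task should end up alone on one "bulldozer" subunit
print(aprun_command(4, 16, 2, "./cp2k.popt"))
```

the exact placement options differ between installations, so treat these command lines as a starting point for the conversation with user support rather than as a recipe.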

cheers,
    axel.
 

>
> Thanks in advance for your help 
>
> Carlo 
>