[CP2K:873] performance on Linux Opteron cluster

Teodoro Laino teodor... at gmail.com
Mon Mar 24 20:17:14 UTC 2008

Dear Yunfeng,

I guess you are not aware of what you were actually running.
That input file is not meant for performance analysis (there is a
benchmark directory for that!).
In a few months, when you have a more solid CP2K background, you
will open that input file again and realize that
32H2O-md.inp is heavily I/O bound.
The wavefunction restart file is written at every SCF step and the
restart files at every MD step, and since I/O operations (at least in CP2K)
are not parallel, the run will never scale.

I have just modified the input file for you and run it on 1 to 32
processors; these are the results:

 1_proc/32H2O-md.out:  -  3294.21
 2_proc/32H2O-md.out:  -  1898.79  (1.73)
 4_proc/32H2O-md.out:  -  1038.44  (1.82)
 8_proc/32H2O-md.out:  -   572.58  (1.81)
16_proc/32H2O-md.out:  -   334.94  (1.71)
32_proc/32H2O-md.out:  -   242.30  (1.38)

Timings are in seconds; the number in parentheses is the speedup relative
to the same job on half the number of processors (ideally 2).
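
For anyone who wants to redo this arithmetic on their own runs, here is a minimal Python sketch. The timings are copied from the table above; the function name scaling_report and the extra columns (speedup relative to the 1-processor run, and parallel efficiency) are additions for illustration and were not part of the original data.

```python
# Wall-clock times in seconds, copied from the table above.
timings = {1: 3294.21, 2: 1898.79, 4: 1038.44, 8: 572.58, 16: 334.94, 32: 242.30}


def scaling_report(timings):
    """Return rows of (procs, step_speedup, total_speedup, efficiency).

    step_speedup  : time on the previous (half-size) run / time on this run
    total_speedup : time on the smallest run / time on this run
    efficiency    : total_speedup divided by the ideal speedup
    """
    procs = sorted(timings)
    base = procs[0]
    rows = []
    for i, p in enumerate(procs):
        # The smallest run has nothing to be compared against.
        step = timings[procs[i - 1]] / timings[p] if i > 0 else None
        total = timings[base] / timings[p]
        efficiency = total * base / p
        rows.append((p, step, total, efficiency))
    return rows


for p, step, total, eff in scaling_report(timings):
    step_txt = "   --" if step is None else f"{step:5.2f}"
    print(f"{p:3d} procs  step {step_txt}  total {total:6.2f}  efficiency {eff:6.1%}")
```

The per-step ratio is the quantity quoted in parentheses above; the total speedup and efficiency make it easier to compare runs on different processor counts.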
Writing the wfn restart file can easily take 1 second each
time, and in that run it is written ~500 times.
This explains your timings!
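
To see how much a fixed serial cost of that size hurts, here is a rough Python sketch. It assumes (and this is only an assumption for illustration) that the ~500 writes at ~1 second each add a constant, non-parallel 500 s on top of the timings of the modified input listed above, whatever the number of processors. It ignores the additional restart files written at every MD step and any difference between the two clusters, so it is not meant to reproduce the original timings.

```python
# Timings (s) of the modified, low-I/O input, copied from the table in this message.
low_io_times = {1: 3294.21, 2: 1898.79, 4: 1038.44, 8: 572.58, 16: 334.94, 32: 242.30}

# Assumptions taken from the rough estimate in the text:
# ~500 wfn restart writes, ~1 s each, executed serially.
N_WRITES = 500
SECONDS_PER_WRITE = 1.0


def with_serial_io(times, n_writes, seconds_per_write):
    """Add a fixed serial I/O cost to every run, independent of process count."""
    io_cost = n_writes * seconds_per_write
    return {p: t + io_cost for p, t in times.items()}


estimated = with_serial_io(low_io_times, N_WRITES, SECONDS_PER_WRITE)

procs = sorted(estimated)
for prev, cur in zip(procs, procs[1:]):
    clean = low_io_times[prev] / low_io_times[cur]
    slowed = estimated[prev] / estimated[cur]
    print(f"{prev:2d} -> {cur:2d} procs: speedup {clean:4.2f} without, {slowed:4.2f} with serial I/O")
```

Under these assumptions the gain from each doubling shrinks quickly, because the fixed I/O time becomes a larger and larger share of the total as the compute part gets faster.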

Keep in mind that these jobs ran on a dual-core machine (like
yours); it may be possible to get ratios closer to 2 (?!?) when using
single-core processors.

I can understand the urge to test how well a code scales, but
to do that you need to know what you are doing, and know it very
well, even with very user-friendly codes!
All the data are attached, for you and for anyone who in
the future wants something to compare against.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: scaling.tgz
Type: application/octet-stream
Size: 617272 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20080324/41b456e4/attachment.obj>
-------------- next part --------------

On 24 Mar 2008, at 18:44, LIANG Yunfeng wrote:

> Dear All,
> The analyst helped me set up cp2k (with the PGI compiler) on a Linux
> Opteron cluster with InfiniBand interconnect, where each node has 2
> Opteron CPUs (2.40 GHz) and 2 GB of memory.
> He reported the following:
> Running on 2 processors, this completed the 32H2O-md.inp
> problem in about 32 minutes....
> I also ran it on 4 processors, which took a little over 20 minutes.
> I then ran it on 8 processors and found it took just under 20 minutes.
> So, the program is not scaling very well as the number of processors
> is increased.
> ....
> I think cp2k still relies heavily on OpenMP, so the
> performance can only be like the above; am I right?
> Thank you
> Sincerely, Yunfeng
