[CP2K-user] [CP2K:18664] Multinode jobs have poor scaling

Eric Patterson eric.v.patterson at gmail.com
Fri Apr 14 15:18:03 UTC 2023


Hi Nathan,

Are those the total processes/threads running across two nodes? If so, maybe your interconnect is not as good as it should be? If that’s what you’re requesting per node, then I would suggest cutting the threads in half so you’re not hyperthreading.
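
To make the two readings of those numbers concrete, here is a minimal Python sketch. It assumes 56 physical cores per node and two nodes (both from Nathan's message) and that ranks are spread as evenly as possible over the nodes. The helper name `threads_per_core` is only illustrative; it is not part of CP2K or of any MPI tool.

```python
import math
from fractions import Fraction

CORES_PER_NODE = 56  # physical cores per node, from Nathan's message
NODES = 2

# (MPI ranks, OpenMP threads per rank) as listed in the thread
SETTINGS = [(1, 112), (2, 56), (4, 28), (8, 14), (16, 7)]


def threads_per_core(ranks, threads, per_node):
    """Software threads per physical core on the busiest node.

    per_node=True  : the setting is what each node runs.
    per_node=False : the setting is the job total; ranks are assumed to be
                     spread as evenly as possible over NODES nodes. A single
                     rank cannot be split, so it lands on one node.
    """
    ranks_on_busiest_node = ranks if per_node else math.ceil(ranks / NODES)
    return Fraction(ranks_on_busiest_node * threads, CORES_PER_NODE)


for ranks, threads in SETTINGS:
    as_total = threads_per_core(ranks, threads, per_node=False)
    as_per_node = threads_per_core(ranks, threads, per_node=True)
    print(f"{ranks:2d} MPI x {threads:3d} OMP: "
          f"{as_total} thread(s)/core if job total, "
          f"{as_per_node} if per node")
```

Any value above 1 means more software threads than physical cores, which is the hyperthreading case. Under these assumptions every setting read as a per-node request gives 2, and "1 MPI 112 Threads" gives 2 even as a job total, because all threads of one process have to run on the same node.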

I honestly have very little experience with MPI codes. CP2K is the first code I’ve used where MPI makes sense to use. I’m afraid I’m near the end of my useful comments… Perhaps someone else in the group or one of your sysadmin people can help you a bit more?

 - Eric


> On Apr 12, 2023, at 6:17 PM, Nathan Keilbart <nathankeilbart at gmail.com> wrote:
> 
> Hi Eric,
> 
> Thanks for the response. I tested this out following your example. My nodes have 56 processors each, and I ran on two nodes and compared the time per step with what I get on one node. I tested the following settings:
> 
> 1 MPI 112 Threads
> 2 MPI 56 Threads
> 4 MPI 28 Threads
> 8 MPI 14 Threads
> 16 MPI 7 Threads
> 
> I get the best performance with 4 or more MPI processes, but this is still slower than simply using one node with 56 MPI processes. Thanks for the suggestion though.
> 
> I'm not an experienced CP2K user; I'm just using an input file my colleague gave me to test the installation that I'm helping with. I would expect a system of 100 water molecules and a single Au atom to still see some speedup on two nodes, but I might be wrong. Any other thoughts?
> 
> On Tuesday, April 11, 2023 at 11:31:14 AM UTC-7 Eric Patterson wrote:
>> Hello Nathan,
>> 
>> What are your settings for MPI processes and OMP threads? On the machine I’m using (an older Intel machine with 48 physical cores per node and Omni-Path interconnect), I found good multi-node performance with 12 MPI processes per node and 4 OMP threads per process. Assigning all cores to MPI was nearly 3x slower and often resulted in memory issues. I did quite a bit of testing to come up with this configuration.
>> 
>> I imagine this could depend heavily on the type of job (mine are periodic cell optimizations and vibrational analysis, nothing fancy), so I definitely recommend doing some testing to see what works for you.
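
One way to organize that kind of testing is to list every split of a node into MPI ranks and OpenMP threads that uses each physical core exactly once, then time a short run for each. The sketch below only does the listing. The 48-core figure comes from the machine described above, and `layouts` is an illustrative helper name, not a CP2K or MPI utility.

```python
def layouts(cores_per_node):
    """Return (MPI ranks per node, OpenMP threads per rank) pairs whose
    product equals cores_per_node, so no core is left idle or shared."""
    return [(ranks, cores_per_node // ranks)
            for ranks in range(1, cores_per_node + 1)
            if cores_per_node % ranks == 0]


for ranks, threads in layouts(48):
    print(f"{ranks:2d} MPI ranks/node x {threads:2d} OMP threads/rank")
```

The 12 x 4 layout mentioned above is one of the entries for 48 cores. The thread count is normally set through the `OMP_NUM_THREADS` environment variable and the rank count through the MPI launcher; the exact launcher options depend on the MPI library and the scheduler, so they are left out here.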
>> 
>> Cheers,
>> Eric
>> 
>> 
>> 
>>> On Apr 11, 2023, at 2:19 PM, Nathan Keilbart <nathank... at gmail.com> wrote:
>>> 
>> 
>>> Hello,
>>> 
>>> I recently finished compiling on an Intel-based HPC machine and appear to have a working binary. I initially tested a single-node job and got what appear to be relatively good times. Upon increasing the number of nodes to two, I actually saw an increase in time per step instead of a decrease. The nodes are connected with InfiniBand, I believe, so there shouldn't be an issue with node-to-node communication. I'm wondering if I set some flag wrong when compiling and what I should look into to find out what's going on here. Let me know what kind of information I can provide. Thanks
>>> 
>>> Nathan 
>>> 
>> 
>>> -- 
>>> You received this message because you are subscribed to the Google Groups "cp2k" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to cp2k+uns... at googlegroups.com <>.
>>> To view this discussion on the web visit https://groups.google.com/d/msgid/cp2k/d5c11154-82e8-4c09-9713-9d43f9eb4b7bn%40googlegroups.com <https://groups.google.com/d/msgid/cp2k/d5c11154-82e8-4c09-9713-9d43f9eb4b7bn%40googlegroups.com?utm_medium=email&utm_source=footer>.
>> 
> 
> 
