<div><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>unlike with binding MPI tasks to "NUMA units",</div> <div>i didn't see a significant difference in performance.</div><div class="im"><div><br></div></div></blockquote><div><br>Yes, most of the time the improvement will be around 5-15%. It really depends on how the OS manages processes/threads and how the application itself was developed. <br> <br></div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br><div>yes. updated, compiled and tested. it gives the output that i expect now.</div><div class="im"> <div></div></div></blockquote><div><br>Good. Libnuma support should now work too. <br><br></div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="im"> <div><br></div></div><div>We have both. Intel and AMD (which is forcing me to use compiler settings,</div><div>that are compatible with a common subset of both). overall the AMD ones</div><div>benefit the most from using processor and memory affinity, but i was surprised</div> <div>how much impact it has on the X5677 Intel CPUs (quad-core westmere ep</div><div>with 3.5GHz). just proves that there is always something new to learn...</div></blockquote><div><br>Yes, that is true. On the machines I have worked with, this was the rule too. The reasons are difficult to explain (cache size, cache/memory protocol?) :)<br> </div></div>Cheers,<br>-- <br>[]'s<br>Christiane Pousa Ribeiro<br> <br>"Judge a man by his questions, rather than by his answers" <br>