<span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif; font-size: 13px; "><div style="text-align: left; ">Dear All,</div><div style="text-align: left; "><br></div><div style="text-align: left; ">
<font face="arial, sans-serif">I've started using CP2K and noticed that a job slows down dramatically on 2x8 processors compared with 2x2.</font></div><div style="text-align: left; "><font face="arial, sans-serif">We compared the CPU time per optimization step for a QM/MM task:</font></div>
<div style="text-align: left; "><font face="arial, sans-serif"><br></font></div><div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>
nodes x cores/node, CPU time per iteration (s)<br></div></span></span></div><blockquote style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>2x2 115</div></span></span></div></blockquote><blockquote style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>1x4 108</div></span></span></div></blockquote><blockquote style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>4x2 70</div></span></span></div></blockquote><blockquote style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>1x8 82</div></span></span></div></blockquote><blockquote style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>2x8 4500</div></span></span></div></blockquote><blockquote style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>4x8 4200</div><div><br></div></span></span></div></blockquote><font face="arial, sans-serif"><span style="border-collapse: collapse; ">Using the TRACE setting revealed that the problem is mainly in the <span style="font-size: 13px; border-collapse: separate; ">cp_fm_syevd_base subroutine, which accounts for almost all of the additional CPU time in the 2x8 and 4x8 cases.</span><br>
</span></font><blockquote style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div><br></div></span></span></div></blockquote><font face="arial, sans-serif"><span style="border-collapse: collapse; "><div>
<font face="arial, sans-serif"><span style="border-collapse: collapse; "><br></span></font></div><div><font face="arial, sans-serif"><span style="border-collapse: collapse; "><br></span></font></div>We also compared the CPU times of the test input </span></font>32H2O-md.inp for the 2x2 and 2x8 cases and got:<div>
<br><div><div style="text-align: left; "><div><div style="border-collapse: separate; font-family: arial; font-size: small; text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>
nodes x cores/node, CPU time per iteration (s)<br></div></span></span></div><blockquote style="border-collapse: separate; font-family: arial; font-size: small; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>2x2 33</div></span></span></div></blockquote><blockquote style="border-collapse: separate; font-family: arial; font-size: small; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">
<div style="text-align: left; "><span style="font-family: arial, sans-serif; font-size: 13px; "><span style="border-collapse: collapse; "><div>2x8 172</div><div><br></div></span></span></div></blockquote><font face="arial, sans-serif"><span style="border-collapse: collapse; ">TRACE revealed that qs_forces, qs_energies_scf, scf_env_do_scf, velocity_verlet and others are about 5 times slower in the 2x8 case than in the 2x2 case.</span></font></div>
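<div><font face="arial, sans-serif">As a rough check of the ratios quoted above, here is a minimal Python sketch that divides each timing by the 2x2 timing. It uses only the numbers listed in this post; the dictionary and function names are purely illustrative and are not part of CP2K.</font></div>

```python
# CPU time per iteration in seconds, copied from the two tables in this post.
# Keys are "nodes x cores/node" layouts; all names here are illustrative.
QMMM_TIMES = {"2x2": 115, "1x4": 108, "4x2": 70, "1x8": 82, "2x8": 4500, "4x8": 4200}
WATER_TIMES = {"2x2": 33, "2x8": 172}


def slowdown(times, baseline="2x2"):
    """Return each layout's time divided by the baseline layout's time."""
    base = times[baseline]
    return {layout: t / base for layout, t in times.items()}


if __name__ == "__main__":
    for name, times in (("QM/MM", QMMM_TIMES), ("32H2O-md", WATER_TIMES)):
        for layout, ratio in slowdown(times).items():
            print(f"{name:9s} {layout}: {ratio:6.2f}x relative to 2x2")
```

<div><font face="arial, sans-serif">With these numbers the 2x8 QM/MM run comes out roughly 39 times slower than 2x2, while the 32H2O-md run is about 5.2 times slower.</font></div>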
</div><div><font face="fixed-width, monospace"><span style="font-family: arial, sans-serif; font-size: 13px; "><br></span></font></div><div><font face="fixed-width, monospace"><span style="font-family: arial, sans-serif; font-size: 13px; "><br>
</span></font></div><div><font face="arial, sans-serif">Do you have any suggestions or ideas about why this happens?</font></div><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif"><br>
</font></div><div><font face="arial, sans-serif">CP2K version 2.2.262</font></div><div><font face="arial, sans-serif"><div dir="ltr" style="margin-bottom: 0.2em; text-align: left; font-size: 13px; ">The library is MKL ScaLAPACK.</div>
<div dir="ltr" style="margin-bottom: 0.2em; text-align: left; font-size: 13px; ">The system is a cluster of Xeon nodes (8 cores/node) with an InfiniBand switch.</div><div dir="ltr" style="margin-bottom: 0.2em; text-align: left; font-size: 13px; ">
The compiler is Intel's.</div></font></div><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">Thank you in advance.</font></div><div><font face="arial, sans-serif">Best regards, Maria.</font></div>
</div></div></span>