I can't comment directly, but multiple GPUs per MPI task is something to set up gradually.<div>If you can run one GPU on one MPI process, then multiple MPI processes per GPU, etc., then explain where things break.</div><div>Matt<br><br></div><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Wednesday, 19 October 2022 at 16:34:18 UTC+1 LC wrote:<br/></div><blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div>It seems that it breaks at Quickstep. Could it be a bad installation? I did not see any errors, but could I be missing a required library?</div><div><br></div><br><div class="gmail_quote"><div dir="auto" class="gmail_attr">On Wednesday, 19 October 2022 at 14:02:38 UTC+1 LC wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Hello,</div><div><br></div><div>Running the H2O-DFT-LS benchmark on 4 GPUs (4 MPI tasks per GPU), I get:<br></div><div><br></div><div> CELL_REF| Volume [angstrom^3]: 25825.145<br> CELL_REF| Vector a [angstrom 29.558 0.000 0.000 |a| = 29.558<br> CELL_REF| Vector b [angstrom 0.000 29.558 0.000 |b| = 29.558<br> CELL_REF| Vector c [angstrom 0.000 0.000 29.558 |c| = 29.558<br> CELL_REF| Angle (b,c), alpha [degree]: 90.000<br> CELL_REF| Angle (a,c), beta [degree]: 90.000<br> CELL_REF| Angle (a,b), gamma [degree]: 90.000<br> CELL_REF| Numerically orthorhombic: YES<br>[node:117708:0:117708] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x440000</div><div><br></div><div>==== backtrace (tid: 117716) ====<br> 0 0x0000000000012b20 __funlockfile() :0<br> 1 0x000000000006d680 PMPI_Comm_set_name() ???:0<br> 2 0x000000000006d680 PMPI_Comm_size() /build-result/src/hpcx-v2.12-gcc-MLNX_OFED_LINUX-5-redhat8-cuda11-gdr<br>copy2-nccl2.12-x86_64/ompi-1c67bf1c6a156f1ae693f86a38f9d859e99eeb1f/ompi/mpi/c/profile/pcomm_size.c:63<br> 3 0x0000000000029e29 MKLMPI_Comm_size() ???:0<br> 4 
0x0000000000027ff1 mkl_blacs_init() ???:0<br> 5 0x0000000000027f38 Cblacs_pinfo() ???:0<br> 6 0x000000000001881f blacs_gridmap_() ???:0<br> 7 0x00000000000181fe blacs_gridinit_() ???:0<br> 8 0x00000000024c422c __cp_blacs_env_MOD_cp_blacs_env_create() ???:0<br> 9 0x0000000000bc22b5 __qs_environment_MOD_qs_init() ???:0<br>10 0x0000000000c6dded __f77_interface_MOD_create_force_env() ???:0<br>11 0x00000000004478e4 __cp2k_runs_MOD_cp2k_run() cp2k_runs.F90:0<br>12 0x000000000044a212 __cp2k_runs_MOD_run_input() ???:0<br>13 0x000000000043dac1 MAIN__() cp2k.F90:0<br>14 0x000000000040ec6d main() ???:0<br>15 0x0000000000023493 __libc_start_main() ???:0<br>16 0x000000000043ccde _start() ???:0<br></div><div><br></div><div>Any idea why?</div><div>L<br></div></blockquote></div></blockquote></div>