[CP2K-user] [CP2K:17931] Re: Runtime error
Luis Cebamanos
luiceur at gmail.com
Mon Oct 24 14:38:29 UTC 2022
Thanks Matt,
Apologies for my late reply; your message somehow went to the Spam folder.
So, I think it is a build problem, but I cannot figure out what is
causing it. Which modules/libraries are called right after
"CELL_REF| Numerically orthorhombic: YES"?
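To see which libraries are involved, the frames of the backtrace quoted
below can be grouped by symbol name. This is only a minimal Python sketch:
the name-to-library mapping in `classify` is my own heuristic (an
assumption, not something the tools report), and `parse_frames` is a
hypothetical helper written for this message.

```python
import re

# Frames 1-9 copied from the quoted backtrace (source paths omitted).
BACKTRACE = """\
1 0x000000000006d680 PMPI_Comm_set_name() ???:0
2 0x000000000006d680 PMPI_Comm_size()
3 0x0000000000029e29 MKLMPI_Comm_size() ???:0
4 0x0000000000027ff1 mkl_blacs_init() ???:0
5 0x0000000000027f38 Cblacs_pinfo() ???:0
6 0x000000000001881f blacs_gridmap_() ???:0
7 0x00000000000181fe blacs_gridinit_() ???:0
8 0x00000000024c422c __cp_blacs_env_MOD_cp_blacs_env_create() ???:0
9 0x0000000000bc22b5 __qs_environment_MOD_qs_init() ???:0
"""

# One frame per line: index, hex address, then "symbol()".
FRAME_RE = re.compile(r"^\s*(\d+)\s+0x[0-9a-fA-F]+\s+(\w+)\(\)", re.MULTILINE)


def classify(symbol):
    """Guess the owning library from the symbol name (heuristic, assumed)."""
    if "_MOD_" in symbol:  # gfortran mangling: __<module>_MOD_<procedure>
        return "CP2K Fortran module"
    if symbol.startswith(("PMPI_", "MPI_")):
        return "MPI library"
    if symbol.startswith(("MKLMPI_", "mkl_")):
        return "MKL"
    if "blacs" in symbol.lower():
        return "BLACS"
    return "other"


def parse_frames(text):
    """Return (frame number, symbol, guessed owner) for each frame line."""
    return [(int(n), sym, classify(sym)) for n, sym in FRAME_RE.findall(text)]


if __name__ == "__main__":
    for number, symbol, owner in parse_frames(BACKTRACE):
        print(f"{number:2d}  {owner:20s} {symbol}")
```

Read from the bottom up, the frames go from qs_init through
cp_blacs_env_create into the BLACS grid setup and end in MPI_Comm_size,
so the crash seems to happen while BLACS is being initialised on top of MPI.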
Regards
On 20/10/2022 18:03, Matt Watkins wrote:
> I can't directly comment, but multiple GPUs per MPI task is something
> to set up gradually.
> If you can run a GPU on an MPI process, multiple MPI processes per GPU,
> etc., then explain where things break.
> Matt
>
> On Wednesday, 19 October 2022 at 16:34:18 UTC+1 LC wrote:
>
> It seems that it breaks at Quickstep. Could it be a bad
> installation? I did not see any error, but could I be missing a
> required library?
>
>
> On Wednesday, 19 October 2022 at 14:02:38 UTC+1 LC wrote:
>
> Hello,
>
> Running the H2O-DFT-LS benchmark on 4 GPUs ( 4 MPI tasks per
> GPU), I get:
>
> CELL_REF| Volume [angstrom^3]: 25825.145
> CELL_REF| Vector a [angstrom]: 29.558 0.000 0.000 |a| = 29.558
> CELL_REF| Vector b [angstrom]: 0.000 29.558 0.000 |b| = 29.558
> CELL_REF| Vector c [angstrom]: 0.000 0.000 29.558 |c| = 29.558
> CELL_REF| Angle (b,c), alpha [degree]: 90.000
> CELL_REF| Angle (a,c), beta [degree]: 90.000
> CELL_REF| Angle (a,b), gamma [degree]: 90.000
> CELL_REF| Numerically orthorhombic: YES
> [node:117708:0:117708] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x440000)
>
> ==== backtrace (tid: 117716) ====
> 0 0x0000000000012b20 __funlockfile() :0
> 1 0x000000000006d680 PMPI_Comm_set_name() ???:0
> 2 0x000000000006d680 PMPI_Comm_size() /build-result/src/hpcx-v2.12-gcc-MLNX_OFED_LINUX-5-redhat8-cuda11-gdrcopy2-nccl2.12-x86_64/ompi-1c67bf1c6a156f1ae693f86a38f9d859e99eeb1f/ompi/mpi/c/profile/pcomm_size.c:63
> 3 0x0000000000029e29 MKLMPI_Comm_size() ???:0
> 4 0x0000000000027ff1 mkl_blacs_init() ???:0
> 5 0x0000000000027f38 Cblacs_pinfo() ???:0
> 6 0x000000000001881f blacs_gridmap_() ???:0
> 7 0x00000000000181fe blacs_gridinit_() ???:0
> 8 0x00000000024c422c __cp_blacs_env_MOD_cp_blacs_env_create() ???:0
> 9 0x0000000000bc22b5 __qs_environment_MOD_qs_init() ???:0
> 10 0x0000000000c6dded __f77_interface_MOD_create_force_env() ???:0
> 11 0x00000000004478e4 __cp2k_runs_MOD_cp2k_run() cp2k_runs.F90:0
> 12 0x000000000044a212 __cp2k_runs_MOD_run_input() ???:0
> 13 0x000000000043dac1 MAIN__() cp2k.F90:0
> 14 0x000000000040ec6d main() ???:0
> 15 0x0000000000023493 __libc_start_main() ???:0
> 16 0x000000000043ccde _start() ???:0
>
> Any idea why?
> L